Apparmor profile caching

zyga-snapd · July 10, 2017, 12:49pm

Hey

I’m investigating the odd errors that some people reported about snap-confine being unable to mkdir /run/snapd/ns directory. While we had many theories what may the problem be (and one of those turned out to be true, when snapd runs under LXD with apparmor stacking/confinement disabled) there seems to be more going on still:

I patched my apparmor installation to trace all calls to apparmor_parser and discovered this log after a reboot:

2017-07-10 14:29:03+02:00  apparmor_parser --write-cache --replace --cache-loc=/etc/apparmor.d/cache -- /etc/apparmor.d/lxc-containers /etc/apparmor.d/sbin.dhclient /etc/apparmor.d/snap.core.2312.usr.lib.snapd.snap-confine /etc/apparmor.d/snap.core.2329.usr.lib.snapd.snap-confine /etc/apparmor.d/snap.core.2361.usr.lib.snapd.snap-confine /etc/apparmor.d/usr.bin.evince /etc/apparmor.d/usr.bin.firefox /etc/apparmor.d/usr.bin.lxc-start /etc/apparmor.d/usr.lib.snapd.snap-confine.real /etc/apparmor.d/usr.sbin.cups-browsed /etc/apparmor.d/usr.sbin.cupsd /etc/apparmor.d/usr.sbin.ippusbxd /etc/apparmor.d/usr.sbin.rsyslogd /etc/apparmor.d/usr.sbin.tcpdump
2017-07-10 14:29:04+02:00  apparmor_parser --write-cache --replace --cache-loc=/var/cache/apparmor -O no-expr-simplify -- /var/lib/snapd/apparmor/profiles/snap.atom.atom /var/lib/snapd/apparmor/profiles/snap.canonical-livepatch.canonical-livepatch /var/lib/snapd/apparmor/profiles/snap.canonical-livepatch.canonical-livepatchd /var/lib/snapd/apparmor/profiles/snap.core.hook.configure /var/lib/snapd/apparmor/profiles/snap.hello-world.env /var/lib/snapd/apparmor/profiles/snap.hello-world.evil /var/lib/snapd/apparmor/profiles/snap.hello-world.hello-world /var/lib/snapd/apparmor/profiles/snap.hello-world.sh /var/lib/snapd/apparmor/profiles/snap.http.http /var/lib/snapd/apparmor/profiles/snap.http.man
2017-07-10 14:29:10+02:00  apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap.atom.atom
2017-07-10 14:29:11+02:00  apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap.canonical-livepatch.canonical-livepatch
2017-07-10 14:29:11+02:00  apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap.canonical-livepatch.canonical-livepatchd
2017-07-10 14:29:15+02:00  apparmor_parser --replace --write-cache /etc/apparmor.d/snap.core.2312.usr.lib.snapd.snap-confine --cache-loc /etc/apparmor.d/cache
2017-07-10 14:29:15+02:00  apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap.core.hook.configure
2017-07-10 14:29:16+02:00  apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap.hello-world.env
2017-07-10 14:29:16+02:00  apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap.hello-world.evil
2017-07-10 14:29:16+02:00  apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap.hello-world.hello-world
2017-07-10 14:29:16+02:00  apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap.hello-world.sh
2017-07-10 14:29:18+02:00  apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap.http.http
2017-07-10 14:29:18+02:00  apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/var/cache/apparmor /var/lib/snapd/apparmor/profiles/snap.http.man

Among this log you will find two entries interesting: --cache-loc=/etc/apparmor.d/cache and --cache-loc=/var/cache/apparmor

So it seems that we use two locations for the cache, depending on who’s calling. Quick inspection of those directories shos that there is no intersection (for whatever reason)

zyga@kaer-morhen:/var/cache/apparmor$ ll /etc/apparmor.d/cache/
razem 1,9M
-rw------- 1 root root 4,1K cze 25 00:14 content-hub-clipboard
-rw------- 1 root root 4,1K cze 25 00:14 content-hub-peer-picker
-rw------- 1 root root  92K cze 25 00:14 lightdm-guest-session
-rw------- 1 root root 119K cze 25 00:14 lxc-containers
-rw------- 1 root root 114K cze 25 00:14 sbin.dhclient
-rw------- 1 root root  53K lip 10 14:29 snap.core.2312.usr.lib.snapd.snap-confine
-rw------- 1 root root  53K lip 10 14:00 snap.core.2329.usr.lib.snapd.snap-confine
-rw------- 1 root root  54K lip 10 13:51 snap.core.2361.usr.lib.snapd.snap-confine
-rw------- 1 root root 4,1K cze 25 00:14 url-dispatcher-bad-url-helper
-rw------- 1 root root 678K cze 25 00:14 usr.bin.evince
-rw------- 1 root root 7,2K cze 25 00:14 usr.bin.lxc-start
-rw------- 1 root root 201K cze 25 00:14 usr.bin.webbrowser-app
-rw------- 1 root root  51K lip  5 07:48 usr.lib.snapd.snap-confine.real
-rw------- 1 root root 221K cze 25 00:14 usr.lib.telepathy
-rw------- 1 root root  23K cze 25 00:14 usr.lib.ubuntu-download-manager.udm-extractor
-rw------- 1 root root  35K cze 25 00:14 usr.sbin.cups-browsed
-rw------- 1 root root  85K cze 25 00:14 usr.sbin.cupsd
-rw------- 1 root root  21K cze 25 00:14 usr.sbin.ippusbxd
-rw------- 1 root root  58K cze 25 00:14 usr.sbin.tcpdump
zyga@kaer-morhen:/var/cache/apparmor$ ll /var/cache/apparmor/
razem 652K
-rw------- 1 root root 4,3K cze 25 00:14 snap.atom.atom
-rw------- 1 root root  87K lip 10 13:56 snap.canonical-livepatch.canonical-livepatch
-rw------- 1 root root  87K lip 10 13:56 snap.canonical-livepatch.canonical-livepatchd
-rw------- 1 root root  75K lip 10 14:12 snap.core.hook.configure
-rw------- 1 root root  61K lip 10 12:46 snap.hello-world.env
-rw------- 1 root root  61K lip 10 12:46 snap.hello-world.evil
-rw------- 1 root root  61K lip 10 12:46 snap.hello-world.hello-world
-rw------- 1 root root  61K lip 10 12:46 snap.hello-world.sh
-rw------- 1 root root  74K lip  5 14:13 snap.http.http
-rw------- 1 root root  59K lip  5 14:13 snap.http.man

It does though begs for the question, why? @jdstrand can you shed some light on this? Should we unify under one specific directory? EDIT: and one more question, how is that cache pruned of stale entires?

jdstrand · July 10, 2017, 1:44pm

/etc/apparmor.d/cache should be considered owned by apparmor. /etc/apparmor.d is the location for so called ‘system profiles’ where users and applications can put profiles that apparmor will manage (ie, apparmor init will write out cache files for anything in /etc/apparmor.d in /etc/apparmor.d/cache).

Some applications (eg libvirt, snappy, etc) choose to manage their own profiles, so they store their profiles somewhere else (eg, /var/lib/snapd/apparmor/profiles on snappy).

Ubuntu’s apparmor init is patched to understand /var/lib/snapd/apparmor/profiles and will load cache files from /var/cache/apparmor for those profiles (this concept dates back to Touch and snappy v1 (but they of course used different profile directories)). Profile caches for snappy end up there and apparmor init makes sure these are loaded on early boot.

With all of the recent changes with snap-reexec and deterministic profiles loads, it may make sense to reconsider /var/cache/apparmor for snappy’s cache dir and let it manage its own cache dir itself (eg, /var/lib/snapd/apparmor/cache). The main points to consider are:

cache files appreciably speed up profile load times because the profile doesn’t have to be compiled (especially on armhf)
the profile must be loaded into the kernel before the application starts (eg snap-confine or snap commands). If not loaded into the kernel first, in the case of snap-confine, because the profile uses binary attachment it will run unconfined. In the case of snap commands, they will fail to start because snap-confine won’t be able to run it due to profile not found.

Notice in your trace that the first two lines (ie, where --write-cache is listed as the first argument) are coming from apparmor boot init where the others are snapd.

jdstrand · July 10, 2017, 1:46pm

Note this will be problematic for non-Debian/non-Ubuntu systems that (will) use apparmor (eg, openSUSE).

niemeyer · July 10, 2017, 7:42pm

That sounds like a pretty unfortunate choice.

Nice, at least we’re using a more reasonable place for them.

Now, we need to find out why on certain occasions snap-confine is getting the wrong profile loaded. That one profile does end up in /etc/apparmor.d because it’s provided by the package.

jdstrand · July 10, 2017, 7:54pm

Perhaps, but it is a very old decision that makes things convenient for distros and administrators who just want to plop a file in /etc/apparmor.d and then have the system DTRT (ie, load it on boot, enable caching, etc). Applications who have their own ideas about profile loads are under no obligation to use /etc/apparmor.d.

This was initially the correct choice for ubuntu-core-launcher and early versions of snap-confine since it was installed as part of the system and there was only one profile on disk and the apparmor init boot process would figure out when to invalidate or use the cache easily enough. With all the re-exec work though, it may make sense to move snap-confine’s profile somewhere else and have snapd (or something else) load it. This could ensure that the correct profile is loaded and would avoid any distro-specific issues regarding when the apparmor initscript/systemd unit relative to snapd (distros should have apparmor load extremely early of course), but comes with its own set of challenges of making sure it is loaded before anything runs snap-confine.

niemeyer · July 10, 2017, 8:04pm

Sorry for not being clear. It’s the caching in that place that sounds awkward. /etc is not a place to write down content that is not intended to be manipulated by people. Sounds fine to have /etc/apparmor.d itself as a place for security profiles.

Either way, you are right, that’s not a problem we need to solve in this context.

snap-confine is the first thing that executes in many cases. We can’t have snapd itself loading the profile for it on boot, and it seems to make sense that this is done by something from the system itself as it’s a general problem if one intends to use apparmor at all.

jdstrand · July 10, 2017, 9:05pm

Oh yes, apparmor upstream has regretted that location for some time, but it is at least a decade old decision at this point and there is a lot of inertia at this point. :\

jdstrand · July 10, 2017, 9:09pm

Yeah, that is why it wasn’t moved yet-- it isn’t a bad place for it to be now and if moved we would need another systemd unit/etc to make sure it was loaded, which begs the question why we are duplicating that work. I know there was some talk of moving the profile out of /etc/apparmor.d for other reasons, so thought I’d mention it.

zyga-snapd · July 24, 2018, 4:55pm

Hey.

I wanted to open a new thread about this but then I noticed we already have one here:

@ogra and I just debugged this issue and we believe to have an explanation:

There are two separate issues here, well maybe three if you include the behavior of apparmor caching layer:

The kernel doesn’t write timestamps (last write time, last mount time) to ext4 superblock if the filesystem is remounted as ro before being finally unmounted

This is what causes our ext4 superblock timestamps to effectively stay frozen in time. When the system shuts down the / is remounted ro and then unmounted. This means we don’t write those timestamps (just those in the superblock).

On a device like the Raspberry Pi that don’t have a battery backed RTC we use the last mount time as initial time. Apart from the fact that this is a constant that doens’t change in our images this has another issue. It is actually older than last write time (once that timestamp functions correctly) and it may put the system “into the past” with regards to the timestamps on the apparmor profile source file and apparmor profile cache file. Depending on how the cache works this may cause us to load the cache even when it is stale. The fixrtc script should use the last write time, falling to last mount time if the former is unset.
The cache system should be content based, not date based but we already know that.

For reference. Ext4 superblock write logic: https://github.com/torvalds/linux/blob/ca04b3cca11acbaf904f707f2d9ca9654d7cc226/fs/ext4/super.c#L4813

This was tested on a Fedora 29 VM with a loopback device and ext4. Remounting the filesystem as read-only before unmounting the filesystem results in wrong data in the superblock (timestamps are not updated to reflect reality, e.g. written files on disk are more recent than “last write time”). It was also tested on a raspberry pi running ubuntu core and a 4.4 kernel when loading an apparmor profile without considering the cache resulted in correct behavior but rebooting the system loaded stale cache entry. It was also tested there by observing the output of “dumpe2fs” on the root filesystem. It shows that the system perpetually believes it is the moment the image was first created (when the filesystem was formatted).

https://paste.ubuntu.com/p/pJqTTZzYkg/ (the log was taken on on 24th of July, the dates refer to 9th of July due to the ext4 superblock behavior)

zyga-snapd · July 25, 2018, 1:23pm

So some more thoughts on this bug. On a Raspberry Pi we have no clock so on first boot we just set the time to the moment when the ext4 on /writable was first created. Then we mount / and carry on. This sets the last-mount-time of / to a few seconds after the image was created. Then we reboot and we set the time to last-mount-time of / which is really very close to the time when the system was created… see the problem? This will continue forever, “slowly” ticking seconds between each boot. This explains the pastebin above where the time is roughly 12 minutes after the moment when the filesystem was created (after about two weeks of real time and 18 reboot cycles).

I’m looking into apparmor caching logic now. I will update this answer when I have more data.

zyga-snapd · July 25, 2018, 2:24pm

Just a comment here: https://github.com/zyga/snapd/commit/b186de2e3c9285eec1b121bc1f47621a67b8d6fc

mvo · July 25, 2018, 2:53pm

This affects refreshes from older version to 2.34.2 so we may hold the release.

jdstrand · July 25, 2018, 3:03pm

The fixrtc logic is clearly wrong. It was stated elsewhere that on these devices the clock can only be reasonably expected to be accurate after network time is fetched. Why not simply adjust so that after the time is fetched, that date is stored off for fixrtc to use on the next boot?

ogra · July 25, 2018, 4:45pm

The original purpose (back in 2011/12) was to prevent mount failures due to “last mount time is in the future” by forcing the clock to the filesystem creation (or if available, “last mount”) date.

see above … to read any “stored off data” you would have to have a filesystem mounted, the fixrtc script was originally designed to fix a mount problem, so it runs prior to mounting any filesystem.

@abeato has worked on a post-mount enhancement for it at Ubuntu Pastebin that could be merged (though it is very snap centric and would have to go into initramfs-tools-ubuntu-core instead of initramfs-tools itself or have to be made more generic).

Alternatively there is a “fake-hwclock” package in raspbian that could be pulled into core and used (i think that stores a timestamp on disk and simply gets started from an early systemd unit)

jdstrand · July 25, 2018, 5:05pm

I was perhaps a bit short on my response. It with the systemd remount-ro bit is wrong for it to be involved with relying on it to set something remotely close to the actual time.

zyga-snapd · July 25, 2018, 5:06pm

I haven’t found the part that does this yet but it explains the lack of change to last-write-time.

ogra · July 25, 2018, 5:16pm

Right this is why i propose to use alfonsos fixrtc enhancement, it simply does a “stat -c %Y /var/lib/snapd/snaps”, that will get you a pretty accurate date being the relative modification time of that folder against the epoch (this should definitely fix the apparmor cache issue, given the profile file comes with a snap that was installed in this folder).

zyga-snapd · July 26, 2018, 12:50pm

I now believe to fully understand the situation where a de-facto stale cache is used.

It is 2018, a core snap update triggers system reboot
it is 2016 now, because we don’t have a clock across reboot and fixrtc is broken
System startup code runs apparmor.service, this loads cached profiles because cache.mtime >= source.mtime - those profiles refer to snapctl at /usr/bin/snapctl
Service startup triggers system key regeneration - all profiles are re-written, referring to snapctl at /usr/lib/snapd/snapctl
apparmor_parser is used to load changed profiles but, because the cache files are from 2018 they are used without referencing updated source files (which date to 2016 since we don’t have good time yet)
NTP sync the time, it is 2018 again
Manually loading the profile (normally cache is not enabled so that part is avoided) “fixes” the problem ogra saw

I will prepare a minimal patch that only uses cache files if mtime matches exactly. This is correct since mtime of cache files is synchronized to source file mtime when a cache entry is written.

zyga-snapd · July 26, 2018, 1:20pm

Thinking about how to solve this issue knowing that there are multiple source files due to an include mechanism. I think we can keep the existing check (mtime of all the source files must be < mtime of the cache) and then add an additional check where we also verify that mtime of the 1st source file is identical to the mtime of the cache. Since the mtime of the cache is set to the mtime of the source this feels correct.

jdstrand · July 26, 2018, 1:35pm

That won’t work in all cases (though would for many) since the parser will actually look at and use the newest mtime between the profile and any of its include files. Eg, if the base abstraction is newer than the profile, its mtime will be used. This might not be a problem in practice, but you would need to think through it.