Snapcraft adt failures with the new core release

ubuntu@xenial:/root$ dmesg | grep DENIED
[ 9775.971367] audit: type=1400 audit(1510844464.690:88): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-trusty_<var-lib-lxd>" profile="/sbin/dhclient" name="/dev/pts/4" pid=11638 comm="dhclient" requested_mask="wr" denied_mask="wr" fsuid=165536 ouid=165536
[ 9775.971544] audit: type=1400 audit(1510844464.690:89): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-trusty_<var-lib-lxd>" profile="/sbin/dhclient" name="/dev/pts/4" pid=11638 comm="dhclient" requested_mask="wr" denied_mask="wr" fsuid=165536 ouid=165536
[ 9871.850641] audit: type=1400 audit(1510844560.570:94): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-artful_</var/lib/lxd>" name="/sys/fs/cgroup/unified/" pid=13442 comm="systemd" fstype="cgroup2" srcname="cgroup" flags="rw, nosuid, nodev, noexec"
[ 9873.026539] audit: type=1400 audit(1510844561.745:95): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="lxd-artful_</var/lib/lxd>" name="/var/lib/lxcfs/" pid=13721 comm="(networkd)" flags="ro, nosuid, nodev, remount, bind"
[ 9878.090184] audit: type=1400 audit(1510844566.809:96): apparmor="DENIED" operation="file_lock" profile="lxd-artful_</var/lib/lxd>" pid=13962 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 addr=none
[ 9878.090193] audit: type=1400 audit(1510844566.809:97): apparmor="DENIED" operation="file_lock" profile="lxd-artful_</var/lib/lxd>" pid=13962 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 addr=none
[ 9878.090196] audit: type=1400 audit(1510844566.809:98): apparmor="DENIED" operation="file_lock" profile="lxd-artful_</var/lib/lxd>" pid=13962 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 addr=none
[ 9878.090198] audit: type=1400 audit(1510844566.809:99): apparmor="DENIED" operation="file_lock" profile="lxd-artful_</var/lib/lxd>" pid=13962 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 addr=none
[11221.481436] audit: type=1400 audit(1510845910.202:117): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-xenial_<var-lib-lxd>" profile="/snap/core/3440/usr/lib/snapd/snap-confine//snap_update_ns" name="/dev/null" pid=17805 comm="5" requested_mask="r" denied_mask="r" fsuid=165536 ouid=0
I see a number of failures here. The file_inherit denial for /dev/null is one interesting aspect. Should we adjust the profile for LXD / snapd somehow?
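To make a log like the one above easier to scan, the denials can be grouped by operation, profile and target. This is only a rough sketch: the function names `parse_denial` and `summarize` are made up for illustration, and the regular expression assumes the `key=value` / `key="value"` layout visible in the dmesg output, nothing more.

```python
import re
from collections import Counter

# Matches key=value pairs where the value is either "quoted" or a bare token.
_FIELD = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_denial(line):
    """Return the key/value fields of one audit line as a dict of strings."""
    fields = {}
    for key, quoted, bare in _FIELD.findall(line):
        # Exactly one of the two value groups participates in each match.
        fields[key] = quoted if bare == "" else bare
    return fields


def summarize(lines):
    """Count DENIED records per (operation, profile, name) triple."""
    counts = Counter()
    for line in lines:
        fields = parse_denial(line)
        if fields.get("apparmor") != "DENIED":
            continue  # skip anything that is not an AppArmor denial
        key = (fields.get("operation"), fields.get("profile"), fields.get("name"))
        counts[key] += 1
    return counts
```

Feeding it the output of `dmesg | grep DENIED` line by line should collapse the repeated file_lock and file_inherit records into one entry each, which makes the distinct mount denials stand out.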

I’m not sure why you’d want to inherit an fd for /dev/null or /dev/pts/<idx>, which must be the fd LXD currently uses from the host for its exec session. The AppArmor denials seem reasonable to me (apart from the cgroup2 denial, but that’s probably AppArmor not knowing about cgroup2).

On second thought the helper that is doing the mount namespace capture (bind mount) is probably a red herring. I’ll test a tweak to see what happens.

What about the various mount denials?

I turned off kernel rate limiting by running sysctl kernel.printk_ratelimit=0 and patched the profile to allow access to /dev/null, but I don’t see any denials and the error is exactly the same as before.

That is orthogonal to what we are discussing here I think. :slight_smile:

snap-confine itself seems to be running under an AppArmor profile as indicated by:

https://github.com/snapcore/snapd/blob/aca5f623b4c45a0950a4e8a38fbcbc2883238dac/cmd/snap-confine/snap-confine.c#L157

Where is this profile?

Oh OK, I see. It seems it just checks whether it is running confined.

The profile is in snap-confine.apparmor.in in the source tree or in /etc/apparmor.d/*snap-confine.real

NOTE: There will be two files because snapd re-executes itself so one will be for the packaged version and one will be for the snapd-from-core-snap.

Your AppArmor profile seems to allow reading the freezer cgroup but not writing to it, am I right?

Hmm maybe I read my apparmor wrong but this should say we can open the freezer directory and write and create the snap.* sub-directory there. If this was failing it would fail outside of LXD as well.

A minimal sample, extracted from cgroup-freezer-support.c, that fails for me: https://paste.ubuntu.com/25975101/

Are you running it as root or setuid?

I actually tried this:

  1. build with gcc -DDO_WAIT..
  2. move the binary to the container’s /, chown root:root, chmod u+s
  3. run it; it will stop and wait for SIGCONT
  4. go back to host, find the test binary’s pid in global namespace
  5. while on the host, run trace-cmd -e syscalls -e cgroup -e fs -p function -P <global-pid>
  6. back in the container kill -CONT <pid>, see cannot create freezer cgroup hierarchy: Permission denied
  7. back to the host C-c, trace-cmd report … and get a segfault in trace-cmd :slight_smile:

I’m running with setuid

Ah ffs, I know what’s going on. So obvious that I didn’t even consider it. You’re running setuid but not setgid, right? By that I mean snap-confine.

Yes, we are not setgid
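For anyone wanting to double-check which of the two bits a binary carries, here is a small sketch. The helper names `special_bits` and `special_bits_of` are invented for this example; it relies only on the standard `stat.S_ISUID` and `stat.S_ISGID` constants.

```python
import os
import stat


def special_bits(mode):
    """Report which of the setuid/setgid bits are set in an st_mode value."""
    return {
        "setuid": bool(mode & stat.S_ISUID),
        "setgid": bool(mode & stat.S_ISGID),
    }


def special_bits_of(path):
    """Same as special_bits(), but reads the mode from a file on disk."""
    return special_bits(os.stat(path).st_mode)
```

Calling `special_bits_of()` on the snap-confine binary (wherever it lives on your system) would be expected to show setuid set and setgid clear in the situation described here.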

Yes, adding g+s makes the minimal sample succeed, and will probably fix snap-confine too (although I have not tried it with snap-confine, @zyga-snapd did you?)

Yes I did try that and it does fix the issue.

The question remains: should LXD set up cgroups to be root-owned and root-writable, or should snap-confine have to be setgid?

The reason why we keep the owner as the container’s owning uid rather than uid 0 in the container is that without this, the owner of the container (especially if it’s an unprivileged user) would lose the ability to control the container’s cgroups.
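A toy model of why setgid makes the difference, under loudly stated assumptions: the freezer directory is owned by a uid the process does not match, its group is gid 0 as seen inside the container, its mode is 0775, and root's usual permission override does not apply to it. None of these values are confirmed in this thread; `may_write` and the constants are hypothetical and only mimic the classic owner/group/other check.

```python
def may_write(st_uid, st_gid, st_mode, euid, egid, groups=()):
    """Classic owner/group/other write check, with no capability override."""
    if euid == st_uid:
        return bool(st_mode & 0o200)  # owner write bit
    if egid == st_gid or st_gid in groups:
        return bool(st_mode & 0o020)  # group write bit
    return bool(st_mode & 0o002)      # other write bit


# Assumed ownership of the freezer cgroup directory (illustrative values only).
DIR_UID = 65534   # an owner the process does not match
DIR_GID = 0       # root's group as seen inside the container
DIR_MODE = 0o775  # owner and group may write, others may not

# setuid only: euid becomes 0, egid stays that of the calling user (1000 here).
setuid_only = may_write(DIR_UID, DIR_GID, DIR_MODE, euid=0, egid=1000)

# setuid + setgid: both euid and egid become 0.
setuid_and_setgid = may_write(DIR_UID, DIR_GID, DIR_MODE, euid=0, egid=0)
```

In this model `setuid_only` is False and `setuid_and_setgid` is True, which lines up with the minimal sample failing until g+s is added. Whether the real directory has exactly this ownership would need checking with stat inside the container.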

So should we just make snap-confine g+s? CC: @jdstrand

If that’s a possibility for you, yes. I just tested this on our side again: for unprivileged containers you’d effectively lose the ability to attach to the container as an unprivileged user, because you can’t move yourself into the cgroup.

Hmm, can you expand on that, please? Does that mean regular use of LXD will be broken?

Would it be any better if we didn’t chown? (Note that when we are g+s then the file will be root.root anyway, right?)