Snapcraft adt failures with the new core release

I’ll have a look first thing tomorrow.

Alright, here we go: final reproducer. Still no idea what the issue is, but I’ve narrowed it down a lot.

  1. Create a new container:

    $ lxc launch ubuntu:xenial snap-test -e
    
  2. Shell into that container (the rest of the steps are run within that shell):

    $ lxc exec snap-test su - ubuntu
    
  3. Update it and install prereqs:

    $ sudo apt update && sudo apt upgrade -y && sudo apt install squashfuse -y
    
  4. Install the hello-world snap:

    $ sudo snap install hello-world
    
  5. Run it as the unprivileged user:

    $ hello-world
    cannot create freezer cgroup hierarchy for snap hello-world: Permission denied
    
  6. Run it with sudo, and realize that doing so unblocks the unprivileged user as well:

    $ sudo hello-world 
    Hello World!
    $ hello-world
    Hello World!
    

So: still no idea what’s happening, but it seems that something privileged needs to happen before starting a snap application. It apparently only needs to happen once. Note that I can now uninstall/reinstall hello-world and it works as expected. However, other snaps need the same privileged operation:

$ sudo snap install hello
hello 2.10 from 'canonical' installed
$ hello.universe 
cannot create freezer cgroup hierarchy for snap hello: Permission denied

Thanks @kyrofa for the excellent instructions to reproduce the issue. It turns out that our LXD test did not run the test snaps as a regular user, only as root, which is why we did not catch this issue. In https://github.com/snapcore/snapd/pull/4230 the test is now extended to ensure we never hit this again.

FWIW, trying the same on an artful host.

The log from running hello:

ubuntu@snap-test:~$ SNAPD_DEBUG=1 SNAP_CONFINE_DEBUG=1 hello
2017/11/16 07:59:06.603740 cmd.go:203: DEBUG: restarting into "/snap/core/current/usr/bin/snap"
DEBUG: security tag: snap.hello.hello
DEBUG: executable:   /usr/lib/snapd/snap-exec
DEBUG: confinement:  non-classic
DEBUG: base snap:    core
DEBUG: apparmor label on snap-confine is: /snap/core/3440/usr/lib/snapd/snap-confine
DEBUG: apparmor mode is: enforce
DEBUG: checking if the current process shares mount namespace with the init process
DEBUG: re-associating is not required
DEBUG: creating lock directory /run/snapd/lock (if missing)
DEBUG: opening lock directory /run/snapd/lock
DEBUG: opening lock file: /run/snapd/lock/.lock
DEBUG: sanity timeout initialized and set for three seconds
DEBUG: acquiring exclusive lock (scope (global))
DEBUG: sanity timeout reset and disabled
DEBUG: ensuring that snap mount directory is shared
DEBUG: unsharing snap namespace directory
DEBUG: creating namespace group directory /run/snapd/ns
DEBUG: namespace group directory does not require intialization
DEBUG: releasing lock (scope: (global))
DEBUG: creating lock directory /run/snapd/lock (if missing)
DEBUG: opening lock directory /run/snapd/lock
DEBUG: opening lock file: /run/snapd/lock/hello.lock
DEBUG: sanity timeout initialized and set for three seconds
DEBUG: acquiring exclusive lock (scope hello)
DEBUG: sanity timeout reset and disabled
DEBUG: initializing mount namespace: hello
DEBUG: opening namespace group directory /run/snapd/ns
DEBUG: attempting to re-associate the mount namespace with the namespace group hello
DEBUG: successfully re-associated the mount namespace with the namespace group hello
DEBUG: releasing resources associated with namespace group hello
cannot create freezer cgroup hierarchy for snap hello: Permission denied

The step that fails is creating the snap.hello freezer cgroup. The sudo workaround lasts only until a reboot.
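
For reference, the failing step boils down to roughly the following (a simplified sketch, not the actual snapd code; the /sys/fs/cgroup/freezer/snap.<name> layout is inferred from the error messages above):

/* Simplified sketch of creating and joining a per-snap freezer cgroup.
 * Not the actual snapd code; paths are assumptions based on this thread. */
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

static int join_freezer_cgroup(const char *snap_name)
{
	char dir[PATH_MAX], procs[PATH_MAX], pid[32];

	snprintf(dir, sizeof dir, "/sys/fs/cgroup/freezer/snap.%s", snap_name);
	/* This is the step that fails with EACCES for the unprivileged user
	 * when the hierarchy is owned by nobody:root inside the container. */
	if (mkdir(dir, 0755) < 0 && errno != EEXIST) {
		perror("cannot create freezer cgroup hierarchy");
		return -1;
	}
	/* Move the current process into the new cgroup. */
	snprintf(procs, sizeof procs, "%s/cgroup.procs", dir);
	int fd = open(procs, O_WRONLY);
	if (fd < 0) {
		perror("cannot open cgroup.procs");
		return -1;
	}
	snprintf(pid, sizeof pid, "%d\n", getpid());
	if (write(fd, pid, strlen(pid)) < 0) {
		perror("cannot join freezer cgroup");
		close(fd);
		return -1;
	}
	close(fd);
	return 0;
}

Run as root, the mkdir() succeeds and the directory then persists, which is presumably why a single sudo run unblocks the unprivileged user until the cgroup disappears again on reboot.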

Some updates from IRC:

10:20 < zyga> mvo: I know what the problem is, I think
10:20 < zyga> stgraber: around?
10:23 < zyga> mvo: http://pastebin.ubuntu.com/25973198/
10:24 < zyga> mvo: so two ideas for quick “solution”:
10:24 < zyga> mvo: 1) make that step optional and let it fail, this means mount changes are not atomic in lxd
10:24 < zyga> mvo: 2) talk to stgraber and figure out why lxd sets up containers this way and what we can do about it
10:26 < zyga> mvo, mborzecki: opinions?
10:26 < mborzecki> hmm that’s why g+s works i guess
10:27 < zyga> mborzecki: correct
10:29 < zyga> mvo: it’s actually a deeper problem:
10:29 < zyga> root@my-ubuntu:~# ls -ld /sys/fs/cgroup/devices/
10:29 < zyga> drwxrwxr-x 5 nobody root 0 Nov 16 09:13 /sys/fs/cgroup/devices/
10:29 < zyga> mvo: unless we somehow disabled all udev tagging inside containers
10:30 < zyga> mvo: it will break on when creating the device cgroup
10:30 < zyga> s/on//
10:31 < mborzecki> that’s perhaps silly, but could we g+s snap-confine too? we’re dropping both anyway right after setting up
10:32 < zyga> mborzecki: that’s another option, indeed
10:32 < zyga> though I’d like to understand the motivation behind lxd choices

You may want to take a look at the related LP: #1730376 issue discussed in “Problems with build-snaps on build.snapcraft.io”, which hasn’t had a comment from anyone on the snapd team yet.

Hey @sergiusens, I had a quick look but I don’t see how this is related. Can you clarify, please?

I was only picking up on keywords: you mentioned udev, and apparently the Launchpad builders, which run on LXD, cannot install the core snap because udev is not installed as a dependency of snapd.


The way LXD sets up the cgroups is sufficient for cgroup delegation. But cgroup delegation is, as on your host system, restricted to the root user of the container. If an unprivileged snap user needs to write to a cgroup, then snappy needs to take care of the cgroup delegation for them; that is, it needs to place the user into a writable cgroup. This is not LXD’s job, though: when LXD sets up the container it doesn’t even know what users will exist on your system apart from the root user. This is something that snappy will need to take care of, either during install by creating some sort of snappy group, or by using our PAM module libpam-cgfs, which can be used to create writable cgroups on login. The latter would require you to run lxc exec <container-name> -- su -l ubuntu, and snappy would then need to take care to create the cgroup for itself within the writable cgroup that the unprivileged user is placed in. The latter option, however, is to some extent distro dependent.
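
Purely as an illustration, delegating a writable cgroup to an unprivileged user amounts to something like the following (the path and ids are placeholders, not anything LXD or snapd does today):

/* Illustration of handing a child cgroup to an unprivileged user so that
 * the user can create further cgroups underneath. Path and ids are
 * placeholders. */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

static int delegate_cgroup(const char *path, uid_t uid, gid_t gid)
{
	char procs[4096];

	if (mkdir(path, 0755) < 0)	/* e.g. /sys/fs/cgroup/freezer/user */
		return -1;
	if (chown(path, uid, gid) < 0)	/* user may create sub-cgroups */
		return -1;
	snprintf(procs, sizeof procs, "%s/cgroup.procs", path);
	if (chown(procs, uid, gid) < 0)	/* user may move processes in */
		return -1;
	return 0;
}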


Question about the delegation and being a privileged user or not. Snappy uses a setuid-root executable that writes to /sys/fs/cgroup and then drops back to the user.

What we are being blocked on now, I think, is the ownership of /sys/fs/cgroup (nobody), which differs from the regular ownership outside of LXD (root).

Lastly, is the root user inside the container an unprivileged user? Should it be able to bypass the nobody ownership?

Snappy manages many cgroups, one per snap, dynamically as the corresponding applications are started. I’m not sure whether I understand you correctly, or whether we can still do that.

Question about the delegation and being a privileged user or not. Snappy uses a setuid-root executable that writes to /sys/fs/cgroup and then drops back to the user.

Well I would expect that to work but I’m not familiar with the internals.

What we are being blocked on now, I think, is the ownership of /sys/fs/cgroup (nobody), which differs from the regular ownership outside of LXD (root).

What confuses me about this is why that should matter if you’re using a setuid binary. Giving the container’s root user’s group write access to the cgroup hierarchy is sufficient for it to create its own cgroups underneath. Otherwise sudo, which is setuid too, wouldn’t work either.

Lastly, is the root user inside the container an unprivileged user? Should it be able to bypass the nobody ownership?

The root user inside the container is the root user and therefore privileged with respect to the container. With respect to the host it is an unprivileged user. But that doesn’t matter, since LXD/liblxc will take care to delegate a cgroup to the user on the host.

Thank you for the answer. I will investigate this and see if there’s something we are missing.

BTW: can you tell me (or point me to some docs) about the cgroup delegation feature of LXD?

Thank you for the answer. I will investigate this and see if there’s something we are missing.

No problem, I’m going to take a look at the cgroup portion of your code.

BTW: can you tell me (or point me to some docs) about the cgroup delegation feature of LXD?

Cgroup delegation in liblxc (which is what does it for LXD) for cgroup v1 hierarchies works like this (see the sketch below):

  • give write access to the relevant cgroup hierarchy created by liblxc for the container by:
    1. chown()ing the cgroup directory’s gid to the container’s root user’s id
    2. chown()ing the cgroup.procs file’s gid to the container’s root user’s id
    3. chown()ing the tasks file’s gid to the container’s root user’s id

The model is different for cgroup v2 but that is out of scope now anyway. :slight_smile:
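
For illustration, those three steps correspond roughly to the following (the hierarchy path and gid are placeholders):

/* Rough illustration of the per-hierarchy delegation steps listed above.
 * "cgroup_dir" would be the container's cgroup in one v1 hierarchy and
 * "container_root_gid" whatever the container's root user maps to. */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

static int delegate_v1_hierarchy(const char *cgroup_dir, gid_t container_root_gid)
{
	char path[4096];

	/* 1. group ownership of the cgroup directory itself */
	if (chown(cgroup_dir, (uid_t)-1, container_root_gid) < 0)
		return -1;
	/* 2. group ownership of cgroup.procs */
	snprintf(path, sizeof path, "%s/cgroup.procs", cgroup_dir);
	if (chown(path, (uid_t)-1, container_root_gid) < 0)
		return -1;
	/* 3. group ownership of tasks */
	snprintf(path, sizeof path, "%s/tasks", cgroup_dir);
	if (chown(path, (uid_t)-1, container_root_gid) < 0)
		return -1;
	return 0;
}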

Thank you!

Note that the cgroup code is in two places:

  • the device cgroup code is in cmd/snap-confine/udev-support.c in the snapd tree
  • the freezer cgroup code is in cmd/snap-update-ns/freezer.go
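
The freezer side is implemented in Go, but the kernel interface it drives is the cgroup v1 freezer.state file. A minimal C sketch of that interface (the per-snap path is an assumption based on the discussion above):

/* Minimal sketch of the cgroup v1 freezer interface: write "FROZEN" or
 * "THAWED" to freezer.state of the per-snap cgroup. The path layout is
 * an assumption; the real implementation is in cmd/snap-update-ns/freezer.go. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int set_freezer_state(const char *snap_name, const char *state)
{
	char path[4096];

	snprintf(path, sizeof path,
		 "/sys/fs/cgroup/freezer/snap.%s/freezer.state", snap_name);
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	ssize_t n = write(fd, state, strlen(state));
	close(fd);
	return n < 0 ? -1 : 0;
}

/* e.g. set_freezer_state("hello", "FROZEN"); ... set_freezer_state("hello", "THAWED"); */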

I see what you’re doing is:

if (fchownat(hierarchy_fd, "", 0, 0, AT_EMPTY_PATH) < 0) {
		die("cannot change owner of freezer cgroup hierarchy for snap %s to root.root", snap_name);
}

I don’t understand why you want to chown the /sys/fs/cgroup/freezer cgroup itself. I think this is where you fail. That shouldn’t be needed for you to create writable cgroups. It’s sufficient if you can chown it to the relevant gid.

That code is, AFAIK, not used any more, as freezer control moved to snap-update-ns. To answer your question though: we chown directories that we created to ensure that we don’t leak the group of the user that initially ran the command that triggered us to create the cgroup. Otherwise those would be root:zyga, for example. We never chown things we didn’t create, so we should not be changing /sys/fs/cgroup/freezer itself.

EDIT: I’m sorry for the rushed response: we do use the freezer cgroup from both sides. The C side just moves the process there. The Go side handles the actual freezing.

Np, then this is the crucial step, I think. What you want to do is only check whether you can write to the cgroup; you shouldn’t need to chown the freezer cgroup itself, I’d say. Just check if you can write to it: if you can, use it; if not, don’t use it (if that’s possible for you).

I’m obviously blatantly ignorant about some of your requirements. So my advice obviously needs to be checked against yours. :slight_smile:
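
To make the suggestion concrete, the probe I have in mind is nothing more than this (a sketch; whether skipping the freezer setup is acceptable at all is exactly the question for you to answer):

/* Sketch of a "can we even use the freezer hierarchy?" probe. */
#include <unistd.h>

static int freezer_hierarchy_is_usable(void)
{
	/* Writable and searchable: we can create per-snap cgroups here. */
	if (access("/sys/fs/cgroup/freezer", W_OK | X_OK) == 0)
		return 1;
	/* e.g. nobody:root inside an unprivileged container */
	return 0;
}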

I’m not sure I understand. Are you referring to chowning /sys/fs/cgroup/freezer/snap.example or /sys/fs/cgroup/freezer?

No, I was likely too hasty. Does the C code that creates the cgroup in https://github.com/snapcore/snapd/blame/b981a864fa3d6193975e167018e8edc93c984b30/cmd/libsnap-confine-private/cgroup-freezer-support.c#L34 run as setuid root?