Phase 1 of 'Opt-in per-snap users/groups' (aka, the 'daemon' user)

jdstrand · March 28, 2019, 2:47pm

Summary

The ‘Multiple users and groups in snaps’ topic outlines various problems related to users in snapd. The most requested feature from this topic is the ability for a snap service to drop privileges to a specific user. As described in the other topic:

Opt-in per-snap users/groups

A lot of existing applications are designed with the notion of privilege separation and/or permanently dropping privileges to secure their code. For example, postgresql, mysql, apache, nginx, etc. Some want to start as root to bind to a port and immediately permanently drop privileges and others want to fork processes under another user to (for example) handle untrusted input.

Today, all these daemon applications must run as root and are not allowed to drop privileges. While the security policy will keep the system safe and will keep the applications isolated, the security stance of the applications themselves is lessened because their security mechanisms can’t be used under snappy (eg, consider an application that handles untrusted input that would normally run a process under a separate user-- under snappy it is the same user so if there is a bug in processing the untrusted input, that process is able to attack the main application).

For the first phase we want to introduce a single user that snaps can use. We’ve chosen the ‘daemon’ user which is defined by the LSB as a user and group that services may drop to. Furthermore, “while a shared user/group, this is no worse than the shared ‘root’ user/group and will allow applications to leverage their security mechanisms for up to one group.”

Note that snapd “should allow applications to use multiple users and groups. For example, it would be desirable for a complex snap that uses the LAMP stack to be able to let apache drop to a different user than mysql is dropping to so that successful attacks against the apache process don’t give access to mysql’s resources.” This is out of scope for phase 1.

Design

‘Multiple users and groups in snaps’ lists many details for how everything is meant to work together in the final implementation. Phase 1 will implement the relevant portions of the developer experience as documented, but the backend underneath will implement just enough to support one use case: the LSB-defined and required ‘daemon’ user.

From the other topic, the idea is that the snap would declare:

name: my-app
...
shared-users:   # or global-ids (TBD)
- foo

then snapd would create the snap_foo user and group and the snap may use foo within the runtime of the snap (because an NSS module will translate foo to snap_foo and give back the uid/gid for snap_foo; it is expected that the snap could always simply use snap_foo if desired). It is understood that there are cases (eg, the lxd and docker snaps) where we don’t want to create a snap_-prefixed user, which is covered by the idea of system-users/system-global-ids.

Because phase 1 will not implement the NSS module and the daemon user already exists, the simplest option would be to expose this user via the system-users/system-global-ids mechanism like so:

name: my-app
...
system-users:   # or system-global-ids (TBD)
- daemon

In terms of implementation, the following will happen:

because the daemon user already exists (see discussion points), we don’t need to create it, but we do want to perform a uid/gid lookup for inserting into the policy
abstract out the existing family of setuid/setgid system call rules so that they are added when opt-in users are not specified. This means that snaps that do not use the opt-in user feature get the same seccomp/apparmor policy we’ve always provided
when a snap specifies using the feature, a new set of seccomp/apparmor rules is added to support privilege dropping/chown/chgrp
snap-confine is updated to ensure it can still drop to the non-root calling user (this is needed because currently the drop in snap-confine happens after the seccomp policy is loaded)
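
A minimal sketch of the uid/gid lookup step described above, assuming the user and group share one name (as with daemon). The helper names and the rule strings produced by `render_rules` are hypothetical and only illustrate "look up the ids, then insert them into the policy"; they are not snapd's actual seccomp syntax.

```python
import grp
import pwd
from typing import List, Optional, Tuple


def lookup_ids(name: str) -> Optional[Tuple[int, int]]:
    """Return (uid, gid) for a same-named user/group pair, or None if either is missing."""
    try:
        return pwd.getpwnam(name).pw_uid, grp.getgrnam(name).gr_gid
    except KeyError:
        # pwd/grp raise KeyError when the entry does not exist on this system
        return None


def render_rules(uid: int, gid: int) -> List[str]:
    """Render illustrative policy lines (hypothetical format, not snapd's)."""
    return [
        f"setuid {uid}",
        f"setgid {gid}",
        f"chown - {uid} {gid}",
    ]


if __name__ == "__main__":
    ids = lookup_ids("daemon")
    if ids is None:
        print("no 'daemon' user/group on this system")
    else:
        print("\n".join(render_rules(*ids)))
```

Because the ids come from the system's passwd/group databases, the rendered values may differ between distributions, which is the portability caveat raised in the discussion points.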

Discussion points:

the system-users/system-global-ids concept necessarily uses the system’s passwd/group databases and therefore the uid/gid is not guaranteed to be the same for the daemon user across distributions. This is only a problem for the “Use case 4: Chroot environments” which is out of scope of phase 1. IME, system-users/system-global-ids cannot support the chroot use case; in the full solution we will simply document this and say that shared-users/global-ids must be used for the chroot use case. I’ve added a note for this to the other topic
Solus is known to not be LSB-compliant and doesn’t have the daemon user on the system (UPDATE: we will perform runtime detection and surface to the user)

UPDATE 2019-05-17:

The original proposal used global-ids and system-global-ids but after discussion these were changed to shared-users and system-users, respectively (in each case we kept the original proposal term as a reference)
as part of implementing snapd pull request #6681 (many: support system-users for 'daemon' user), it was decided that snapd would detect at runtime whether the daemon user is available and, if not, report at snap install time that the snap cannot be installed (with instructions on next steps)

jdstrand · March 28, 2019, 2:56pm

This could be addressed in several ways (perhaps there are others):

1. adjust the snapd build system to check for the existence of the ‘daemon’ user, and fail the build with instructions that the packaging should create this user as part of installation. This does not address systems like Solus that support reexec
2. have snapd verify that the system has the ‘daemon’ user, and if not, fail the installation of a snap that uses it (with instructions for the user on how to add the user)
3. at runtime, create the ‘daemon’ user if it doesn’t exist
4. don’t use system-global-ids at all in phase 1 and instead implement enough of global-ids by creating a user backend to create the ‘snap_daemon’ user. If we do this, we could implement a handful of these users such that snaps could use any/all of them and have different services drop to different users (ie, the LAMP example). (UPDATE: the NSS module would not be implemented and publishers would need to use snap_daemon to drop while specifying daemon in their yaml. The final implementation will introduce the NSS module so the snap can later drop with simply daemon.)

jdstrand · March 28, 2019, 3:12pm

I’m not sure what to recommend as each has its advantages and disadvantages. ‘1’ is not at all unusual when considering the traditional packaging world (deb/rpm/etc) but is slightly presumptuous since ‘daemon’ isn’t snapd-specific. ‘2’ is slightly unfriendly but not much different conceptually from assumes (perhaps this is assumes: daemon-user?). ‘3’ is also slightly presumptuous since ‘daemon’ isn’t snapd-specific as with ‘1’. ‘4’ is a little more work (but not a lot more) but adds a small friction point for early adopters since the NSS module won’t be in place.

jdstrand · March 28, 2019, 3:18pm

I think I slightly prefer this one.

pedronis · April 2, 2019, 4:34pm

@jdstrand the overall plan seems reasonable for the first phase. If I understand correctly, there are no deep security issues in letting the snap run as the system daemon user, because the same confinement considerations apply as when it’s running as root?

We need to think a bit more about the snap.yaml/snapcraft.yaml syntax though, it’s not clear what “ids” refer to in all those stanza keys (Also user ids etc are normally intended as numeric?).

jdstrand · April 2, 2019, 5:48pm

Yes. root is a shared user and group, no different than ‘daemon’ in that regard and our security sandbox mediates cross-snap access. Allowing dropping to this non-root user allows for the processes in the snap to realize security benefits associated with DAC and dropping capabilities.
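
To make the DAC benefit concrete, here is a minimal sketch of the permanent drop a service might perform after starting as root. The `drop_privileges` helper and its `_os`/`_getpwnam` parameters are hypothetical (the parameters exist only so the call order can be exercised without root), and the sketch makes no claim about which of these calls a given snap's seccomp policy permits.

```python
import os
import pwd


def drop_privileges(username="daemon", _os=os, _getpwnam=pwd.getpwnam):
    """Permanently drop from root to `username` (sketch; must be called as root)."""
    entry = _getpwnam(username)      # raises KeyError if the user does not exist
    _os.setgroups([])                # clear supplementary groups while still privileged
    _os.setgid(entry.pw_gid)         # change group first; it cannot be changed after setuid
    _os.setuid(entry.pw_uid)         # sets real, effective and saved uid when called as root
    # Sanity check: refuse to continue if the drop did not take effect.
    if _os.getuid() != entry.pw_uid or _os.getgid() != entry.pw_gid:
        raise RuntimeError("privilege drop did not take effect")
```

The order matters: once `setuid` succeeds the process no longer has the privilege to change its groups, so group changes come first.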

jdstrand · April 2, 2019, 5:54pm

I implemented system-global-ids in the preliminary snapd pull request #6681 (many: support system-users for 'daemon' user) since it was easy to implement, undo and/or change. I’m happy to change that to something else or do something else as warranted.

Note, the concept and naming of global-ids was something that @niemeyer suggested and was discussed with me, @mvo, et al. Again, fine to change. The concept is that the snap declares one name that represents a user/group pair. So one specifies daemon and can use both the uid for the daemon user and the gid for the daemon group (as opposed to having to specify both).

jdstrand · April 2, 2019, 5:59pm

Furthermore, with global-ids this user/group pair is prefixed with ‘snap_’ and snapd is expected to create them. With system-global-ids this user/group pair is not prefixed and snapd will create the system user and group on demand if they don’t already exist (similar to the deb/rpm/etc traditional packaging world).
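
As a small illustration of the naming rule just described, the sketch below maps a declared name to the name expected on the host. The function name `host_name` is hypothetical; the stanza keys are the ones used in this thread at this point.

```python
def host_name(stanza: str, declared: str) -> str:
    """Map a name declared in snap.yaml to the user/group name on the host (sketch)."""
    if stanza == "global-ids":
        # namespaced: snapd creates snap_<name>
        return "snap_" + declared
    if stanza == "system-global-ids":
        # not namespaced: uses the system's own user/group of that name
        return declared
    raise ValueError(f"unknown stanza: {stanza}")


print(host_name("global-ids", "foo"))            # snap_foo
print(host_name("system-global-ids", "daemon"))  # daemon
```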

pedronis · April 2, 2019, 6:14pm

@jdstrand I have some questions about the overall larger design, but these might influence what we do for the first phase

1. it seems that if there is no snap-declaration allowing a global-ids user then the behavior reverts to private-ids?
2. I assume we agree that daemon is a system-global-ids user that all snaps get access to without a declaration, in the spirit of the purpose of the user?
3. what happens with system-global-ids if they are not allowed by the snap-declaration and a snap using them is being installed?
4. what happens to an already existing installation of a snap if the declaration is granted at a later point (either for global-ids or system-global-ids)?
5. what should happen if between revisions a user moves across the kinds of supported users (global vs system-global vs private)?

jdstrand · April 2, 2019, 7:15pm

Please note, the concept of private-ids is not yet an approved part of the spec and just an idea for a future enhancement. That said, since the point of the private IDs is that the uid/gids are guaranteed not to overlap with other snaps, I don’t think that we would ever want to fall back to this behavior, since two snaps might inadvertently pick the same non-existent IDs and fall back to the same private ones on the system (or, perhaps worse, the second snap fails to install).

We never explicitly discussed the daemon user in the context of the full spec but rather as a way to implement something useful without the full spec. In that light, I think it most aligns with system-global-ids, yes, since it is a) a system user, b) is not prefixed with ‘snap_’ and c) has to deal with the fact that the uid/gid for this pair may be different depending on the system. Furthermore, yes, the LSB defines the daemon user as “The daemon User ID/Group ID was used as an unprivileged User ID/Group ID for daemons to execute under in order to limit their access to the system.” (it is considered a legacy user by the LSB since they promote per-application users/groups (which we will handle with global-ids), but is still required and the user was specifically designed for this).

Because the user is designated for this and because the shared daemon user/group only provides security benefits to snaps over the current shared root user/group, yes, no snap declaration would be required for daemon.

In terms of the daemon user, there is no snap declaration needed, so nothing special here.

The full spec (which, as mentioned, doesn’t discuss the daemon user) states that a snap declaration would need to be issued to use anything in the global ID database. So where a snap declaration is warranted (see below), I think we treat this like super-privileged interfaces: unasserted installs are allowed, but otherwise the install fails without the corresponding snap declaration. I think this is probably the most reasonable approach because a snap would probably break in significant ways if we allowed install but not privilege dropping.

That said, I suspect we actually want a few different users that don’t require a snap declaration. Eg, I can imagine:

system-global-ids: [ docker ]: needs a snap declaration
system-global-ids: [ lxd ]: needs a snap declaration
global-ids: [ apache ]: needs a snap declaration
global-ids: [ mysql ]: needs a snap declaration
global-ids: [ www, db ]: does not need a snap declaration

The idea is that we can prepopulate the global ID database with a few generic ID pairs (eg, ‘www’ and ‘db’) that anyone can use, but require a snap declaration for non-generic users that are somehow tied to the publisher or software. system-global-ids always requires a snap declaration since it is typically meant to be unique for the snap’s functionality (ie, lxd protects the lxd socket; except for the case of daemon where it is expressly meant to be shared).
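
The rule above can be sketched as a small lookup. Everything here is illustrative: the function and constant names are hypothetical, the generic set contains only the two example names from this post, and both the original and the later renamed stanza keys are accepted.

```python
# Example names taken from this post; a real list would be maintained elsewhere.
GENERIC_SHARED = frozenset({"www", "db"})
UNDECLARED_SYSTEM = frozenset({"daemon"})

# The stanza keys were later renamed (see the 2019-05-17 update in the first post).
ALIASES = {"global-ids": "shared-users", "system-global-ids": "system-users"}


def needs_snap_declaration(stanza: str, name: str) -> bool:
    """Return True if using `name` under `stanza` would require a snap declaration."""
    stanza = ALIASES.get(stanza, stanza)
    if stanza == "shared-users":
        return name not in GENERIC_SHARED
    if stanza == "system-users":
        # always required, except for the expressly shared 'daemon' user
        return name not in UNDECLARED_SYSTEM
    raise ValueError(f"unknown stanza: {stanza}")
```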

If we take my suggestion that we fail the install until the snap declaration is issued, then once it is issued the snap simply becomes installable. Snaps that add the global-ids/system-global-ids later aren’t refreshable until after the snap declaration is issued (I’m assuming this is how it works with super-privileged interfaces today).

This is an interesting question. Since snap declarations are per-snap and not per-revision, I don’t think we’d want to revoke one and grant another immediately, since this would affect rollbacks, etc. If we did immediately revoke one/grant another, in theory things could be ok on refresh because we (intentionally) don’t have ‘owner’ apparmor file rules on SNAP_DATA or SNAP_COMMON, so a snap could refresh, access the data and then update the permissions for the newly granted IDs. But that is brittle and doesn’t support rollback well, since the rolled-back-to revision would need to have this same logic and it’s unlikely publishers will have anticipated the future in this manner. I think the answer is that we work with the publisher and allow an overlap in the snap declaration to support migration cases.

Another thought I had in responding here was that I suspect that most publishers are going to be very satisfied with the daemon user and any future generic global-ids. There will be a not insignificant but probably not very large set of snaps where big name publishers are going to want their own id (system-global-id or global-id). Since the non-generic global-ids require a snap declaration, they are in many ways quite private to the publisher, so the number of users requesting private-ids will be even smaller.

jdstrand · April 2, 2019, 9:16pm

I looked at this a bit further since there is a difference in that CAP_SETUID, CAP_SETGID and CAP_CHOWN are needed for snaps to do common operations related to privilege dropping and while the snaps were running as root before, they did not have these apparmor rules. With the rules in place (see man capabilities):

CAP_SETUID
- “Make arbitrary manipulations of process UIDs (setuid(2), setreuid(2), setresuid(2), setfsuid(2))”. This is fine; we are using seccomp fine-grained mediation for all but setfsuid, and we don’t allow setfsuid
- “forge UID when passing socket credentials via UNIX domain sockets” (see man unix for details) - we don’t mediate socket credentials, but we do mediate sockets so this is mostly handled in the current policy where it is largely expected that if the policy allows connecting to the socket, the access is expected to be allowed. Some services are going to do root vs non-root checks, but the snap is already allowed to run as root so there is no need for forgery. The execve() boundary and privilege dropping drops this capability for non-root snap processes, so they can’t forge. There are some theoretical cases where a service is trying to limit root access and a root snap process could forge a different UID for the access check, but this isn’t effective protection for the service since typically root is unconfined and could manipulate the service in other ways. This deserves a code comment though
- “write a user ID mapping in a user namespace (see user_namespaces(7))” - we don’t allow the writes to /proc/*/uid_map so this is mediated elsewhere
CAP_SETGID - same as for CAP_SETUID, but for gid operations
CAP_CHOWN - “Make arbitrary changes to file UIDs and GIDs (see chown(2))” - This is fine and what we mediate with our fine-grained seccomp filter

In addition to the capabilities manpage, I also found:

CAP_SETUID
- man keyctl: allows getting the persistent keyring for a user. We block the keyctl syscalls so this is fine. AppArmor will also gain kernel keyring mediation in the mid-term

Since we either provide the intended fine-grained mediation or are mediating the access in other ways, all of the above is fine and expected. The outlier is the unix socket cred forgery when our interfaces allow the access to the socket in the first place, but in practice this is not an issue.

pedronis · April 3, 2019, 11:44am

ok, I probably skimmed the original proposal too quickly and misread something.

You are saying that in order of usefulness we should implement:

system-global-ids: daemon
global-ids: [www, db] etc
declaration needing system-global-ids, global-ids
maybe private-ids

?

About the keys for these stanzas, the other similar high-level stanzas have names like apps, slots, plugs.

I was not involved in picking the *-ids names. What bothers me a bit is that what follows them are not ids but user names; the names really seem to be contractions for:

global-ids = users/groups with globally assigned user/group ids
system-global-ids = users/groups with global system assigned user/group ids
private-ids = users/groups with privately (per-snap) assigned user/group ids

Or is the logic behind the names something else?

Couldn’t private-ids and system-global-ids just be called private-users and system-users ?

We are left with global-ids, I see a couple options there:

global-ids-users
shared-users

pedronis · April 3, 2019, 11:49am

would global-ids: [<snap-name>] need a declaration?

jdstrand · April 5, 2019, 7:57pm

Yeah, I think that is about right.

I think the rationale for ‘ids’ is this: while there is a convenient user/group name, in the case of global-ids and private-ids these names map directly to a predefined uid/gid that is guaranteed to be the same regardless of the system the snap is installed on. That works because we’ll prefix the name with snap_ and because the list of mappings is maintained by the store (ultimately; initially snapd). system-global-ids necessarily has to fudge this since, as in the case of daemon (but also lxd, docker, libvirtd, etc), the requested user/group name is not namespaced and therefore may exist and have a different uid/gid on different systems (just like users/groups in the traditional packaging world).

Sure. So:

private-users
shared-users
system-users

? This works for me. Let me know and I can update the spec.

jdstrand · April 5, 2019, 8:00pm

If by <snap-name> you mean ‘apache’ or ‘mysql’, then yes (but that is what I put before, so maybe you meant something else?).

pedronis · April 8, 2019, 8:23am

Yes, but please keep mentioning both options shared-users|global-ids-users for now in it, we can rediscuss when we get to implement that, my main point so far is for them ending in -users.

Sorry, my question was about the case where the requested user/group name matches the snap name, usually we think in terms of snaps owning that. But I was forgetting that here we also need to assign globally an id in the store to this, so my point is moot. If I understand correctly for the global ids case we need in the store somehow (how to be designed):

keep a global mapping of these users/groups to ids
allow single snaps use of some of those

jdstrand · April 11, 2019, 7:12pm

This is now done

jdstrand · April 11, 2019, 7:21pm

The spec mentions that the store will indeed ‘keep a global mapping of these users/groups to ids’, but in the early phases snapd can maintain its list (eg, initially ‘daemon’, later the common ‘shared-users/global-ids’ (ie, www, db, etc)). The store is probably best involved when we start needing the snap-specific ‘shared-users/global-ids’ (ie, apache, mysql), ‘system-users/system-global-ids’ and ‘private-users/private-ids’, in part because we can have the most flexibility in maintaining them and any snap declarations.

jdstrand · May 17, 2019, 5:47am

We decided this would be an “implied assumes” where we perform runtime checks to see if the daemon user is on the system, and if not, surface that to the user when installing a snap that uses the feature.
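
A minimal sketch of such an install-time check, assuming the snap's requested system users are already available as a list of names. The `MissingSystemUser` exception, the `check_system_users` helper and the error wording are all hypothetical; snapd itself is written in Go and its real check may look quite different.

```python
import pwd


class MissingSystemUser(Exception):
    """Raised when a snap requests a system user that the host does not have."""


def check_system_users(requested, lookup=pwd.getpwnam):
    """Fail early, with a readable message, if any requested user is absent."""
    missing = []
    for name in requested:
        try:
            lookup(name)
        except KeyError:
            # pwd.getpwnam raises KeyError for unknown users
            missing.append(name)
    if missing:
        raise MissingSystemUser(
            "cannot install snap: system user(s) not found: " + ", ".join(missing)
        )
```

The `lookup` parameter defaults to the system passwd database and is only there so the check can be exercised on hosts that lack the user, such as the Solus case mentioned earlier.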

galgalesh · January 21, 2020, 9:44pm

Note: The currently implemented parts of this seem to be documented here: System usernames