Rethinking how we handle XDG_RUNTIME_DIR

jamesh · January 18, 2021, 10:53am

I’d like to propose some changes to the way XDG_RUNTIME_DIR is handled by snapd. The current setup is this:

snap run sets XDG_RUNTIME_DIR=/run/user/$uid/snap.$SNAP_INSTANCE_NAME in the snap’s environment.
base AppArmor template grants snaps full access to this directory.
some interfaces grant access to some files in the real $XDG_RUNTIME_DIR.

This ends up causing a few problems that need to be papered over by desktop-helpers scripts or similar:

Nothing creates the private $XDG_RUNTIME_DIR. This violates the fd.o base directory spec, and has caused problems for a number of applications when not using desktop helpers.
Various libraries search for sockets in $XDG_RUNTIME_DIR, for instance Pulse Audio and Wayland.
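
For illustration, the first problem is usually worked around by having a launcher create the directory itself. This is a minimal sketch under that assumption, not the actual desktop-helpers code; the function name `ensure_runtime_dir` is made up for the example.

```shell
# Hypothetical helper: create the private runtime directory if it is missing.
# The Base Directory spec requires the directory to have mode 0700.
ensure_runtime_dir() {
    dir=${1:-${XDG_RUNTIME_DIR:-}}
    [ -n "$dir" ] || return 1
    # -m applies to the final component only; an existing directory is left as is.
    mkdir -p -m 700 "$dir"
}
```

A launcher would call `ensure_runtime_dir` with no arguments before starting the application.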

While desktop-helpers have helped paper over these issues, it’s not so obvious what to do when the server side implementations are also provided by snaps. For example, just as the Wayland client libraries will look for the wayland-0 socket in $XDG_RUNTIME_DIR, the server side will attempt to create the socket in $XDG_RUNTIME_DIR.

So if I have a wayland-server snap, it will end up creating its socket as /run/user/$uid/snap.wayland-server/wayland-0. Here, I’ve got two options to make the socket available outside the snap:

write snap specific patches to the server to have it ignore $XDG_RUNTIME_DIR, and create its socket in /run/user/$uid.
write yet more helper scripts. In this case, perhaps fork a background process that waits for the socket to be created, and then hard links it into /run/user/$uid?
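
To make the second option concrete, here is a rough sketch of such a helper. The function name, argument order and one-second polling interval are all assumptions made for the example, not part of any existing script.

```shell
# Hypothetical helper: wait until the server has created its socket in the
# private directory, then hard link it to a second location.
# Usage: link_when_ready SRC DST [TRIES]   (one try per second, default 10)
link_when_ready() {
    src=$1 dst=$2 tries=${3:-10}
    while [ ! -e "$src" ]; do
        [ "$tries" -gt 0 ] || return 1   # give up once the tries run out
        tries=$((tries - 1))
        sleep 1
    done
    # Hard links only work within a single filesystem.
    ln -f "$src" "$dst"
}

# Example (not run here): start the wait in the background before the server.
# link_when_ready "$XDG_RUNTIME_DIR/wayland-0" "/run/user/$(id -u)/wayland-0" 30 &
```

Even in this small form it needs a polling loop and a timeout, which is part of why a fix in snapd itself looks more attractive.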

I think we could do better: solve the problem of sharing sockets between different snaps, and remove the need for some of the desktop-launch changes at the same time. I think most of these would be compatible with most existing desktop-launch scripts too.

Proposal

I think the original idea of giving each snap its own private $XDG_RUNTIME_DIR is a good one, since it means we don’t need to restrict what the snap can put in that directory. I think we can preserve that while having the private directory appear at the standard location, similar to how snaps see their private temporary directory at the standard location.

While working on adding xdg-desktop-portal support to snapd, I implemented a “user mounts” feature in snap-confine/snap-update-ns in order to mount the per-user document portal into a snap’s mount namespace. It hasn’t been used for anything else since, but I think it could help us here.

When we’re setting up the snap’s mount namespace we’d now do the following:

In github.com/snapcore/snapd/snap/snapenv, set XDG_RUNTIME_DIR=/run/user/$uid.
Ensure that /var/lib/snapd/hostfs/run/user/$uid/snap.${SNAP_INSTANCE} exists. Extend the existing AddModeHint functionality to ensure that the directory plus parents is created with the right ownership as well as permissions.
Add a user mount of /var/lib/snapd/hostfs/run/user/$uid/snap.${SNAP_INSTANCE} to /run/user/$uid.
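
The three steps above boil down to one source path, one target path, and one environment value. The following sketch only computes those strings; the function names and the "SOURCE TARGET" output format are illustrative assumptions, and only the paths come from the proposal.

```shell
# Directory backing the snap's private runtime dir, as seen via hostfs.
private_runtime_source() {   # args: UID SNAP_INSTANCE
    printf '/var/lib/snapd/hostfs/run/user/%s/snap.%s\n' "$1" "$2"
}

# Value of XDG_RUNTIME_DIR inside the sandbox under the proposal.
sandbox_runtime_dir() {      # args: UID
    printf '/run/user/%s\n' "$1"
}

# The user mount, written as "SOURCE TARGET".
private_runtime_mount() {    # args: UID SNAP_INSTANCE
    printf '%s %s\n' "$(private_runtime_source "$1" "$2")" "$(sandbox_runtime_dir "$1")"
}

private_runtime_mount 1000 firefox
# prints: /var/lib/snapd/hostfs/run/user/1000/snap.firefox /run/user/1000
```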

Now we’ve still got a private XDG_RUNTIME_DIR that will be cleaned up with the user’s primary XDG_RUNTIME_DIR, but it now appears at the standard location from the snap’s point of view.

Of course, at this point the snap cannot access anything from the real XDG_RUNTIME_DIR. This includes:

D-Bus session bus at $XDG_RUNTIME_DIR/bus
X11 auth cookie file, which may be at $XDG_RUNTIME_DIR/gdm/Xauthority or possibly other paths.
Wayland sockets that are usually at $XDG_RUNTIME_DIR/wayland-*
Pulse Audio socket at $XDG_RUNTIME_DIR/pulse/native
dconf update coordination in $XDG_RUNTIME_DIR/dconf/user, used by the gsettings dconf backend.

The answer to most of these is more user mounts. In the case of data in XDG_RUNTIME_DIR subdirectories, this is pretty simple:

The x11 plug should add a user mount from /var/lib/snapd/hostfs/run/user/$uid/gdm to /run/user/$uid/gdm
The pulseaudio, audio-playback, and audio-record plugs should add a user mount from /var/lib/snapd/hostfs/run/user/$uid/pulse to /run/user/$uid/pulse. There should be some form of de-duplication for snaps that connect more than one of these interfaces.
The gsettings plug should mount /var/lib/snapd/hostfs/run/user/$uid/dconf to /run/user/$uid/dconf.

(note that we’re mounting via /var/lib/snapd/hostfs because the host system /run/user/$uid has been shadowed by earlier mounts).
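
As a sketch of how the per-interface mounts and the de-duplication could fit together, the following maps each plug to the subdirectory listed above and emits each mount once. The function names and the "SOURCE TARGET" output format are assumptions for the example; snapd's real mount profiles are not modelled here.

```shell
# Map a plug name to the runtime subdirectory it needs (from the list above).
plug_subdir() {
    case $1 in
        x11) echo gdm ;;
        pulseaudio|audio-playback|audio-record) echo pulse ;;
        gsettings) echo dconf ;;
        *) return 1 ;;
    esac
}

# Print "SOURCE TARGET" for each connected plug, each mount only once.
# args: UID PLUG...
interface_user_mounts() {
    uid=$1; shift
    for plug in "$@"; do
        sub=$(plug_subdir "$plug") || continue   # plug needs no user mount
        printf '/var/lib/snapd/hostfs/run/user/%s/%s /run/user/%s/%s\n' \
            "$uid" "$sub" "$uid" "$sub"
    done | sort -u
}

interface_user_mounts 1000 audio-playback audio-record gsettings
```

With both audio plugs connected, the pulse mount appears once in the output, next to the dconf mount.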

For cases where sockets (or other non-directory files) are created directly within XDG_RUNTIME_DIR (e.g. D-Bus and Wayland), we can’t do simple directory mounts. There are two choices:

(1) Touch an empty file in the private XDG_RUNTIME_DIR, and then bind mount the single file over the top of it.
(2) Hard link the file into the private XDG_RUNTIME_DIR directly.

Option (2) seems somewhat simpler to me, but perhaps has some downsides I haven’t thought of. Absent appropriate AppArmor protections, both would allow nuisance attacks with chmod, for instance.

In either case, it would require snap-update-ns updates to support this (especially the globs for e.g. the wayland-* sockets).
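
A minimal sketch of the hard-link variant with glob support might look like the following. It is written as a plain shell function for illustration; in practice this logic would live in snap-update-ns, and the function name and arguments are hypothetical.

```shell
# Hard link every entry in HOST_DIR matching GLOB into PRIVATE_DIR.
# args: HOST_DIR PRIVATE_DIR GLOB
link_matching() {
    host=$1 priv=$2 glob=$3
    for f in "$host"/$glob; do
        # An unmatched glob stays a literal pattern, so skip missing paths.
        [ -e "$f" ] || continue
        ln -f "$f" "$priv/${f##*/}"
    done
    return 0
}

# Example (not run here):
# link_matching /var/lib/snapd/hostfs/run/user/1000 /run/user/1000 'wayland-*'
```

Note that this only picks up sockets that exist when it runs; a socket created later would need the function to be run again.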

Services provided by snaps

Once we’re in a position where snapd is constructing the snap’s private XDG_RUNTIME_DIR, it suddenly becomes significantly easier to handle the case of services provided by snaps rather than the host system.

Imagine that we have a version of the pulseaudio snap that runs the daemon as a user service. From the point of view of the host system mount namespace, the socket will be located at /run/user/$uid/snap.pulseaudio/pulse/native. If a client snap connects to pulseaudio:audio-playback instead of the implicit system:audio-playback slot, we can simply have the interface generate a mount from /var/lib/snapd/hostfs/run/user/$uid/snap.pulseaudio/pulse instead, which we can infer from the snap instance name of the slot.
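
The source path selection described here can be sketched as a small function. The name `slot_mount_source` and the use of the literal string `system` for the implicit slot are assumptions made for the example.

```shell
# Pick the mount source for a plug, depending on which snap provides the slot.
# args: UID SLOT_SNAP SUBDIR   (SLOT_SNAP "system" denotes the implicit slot)
slot_mount_source() {
    base=/var/lib/snapd/hostfs/run/user/$1
    if [ "$2" = system ]; then
        echo "$base/$3"
    else
        echo "$base/snap.$2/$3"
    fi
}

slot_mount_source 1000 system pulse
# prints: /var/lib/snapd/hostfs/run/user/1000/pulse
slot_mount_source 1000 pulseaudio pulse
# prints: /var/lib/snapd/hostfs/run/user/1000/snap.pulseaudio/pulse
```

In both cases the mount target inside the client snap stays /run/user/$uid/pulse, so the client does not need to know who provides the service.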

Security Concerns

AppArmor permissions

If we change the base template to allow read/write access to /run/user/$uid with the expectation that a private version will be mounted over the top, we need to be careful about interfaces that add content to the directory. In particular, interfaces will want to explicitly deny write access to those directories.

What about `sudo` (or if XDG_RUNTIME_DIR is not set)?

When commands are run under sudo, no login session is started for the user so systemd does not create the user’s XDG_RUNTIME_DIR. I don’t think we can simply decide that absence of the directory should short circuit all of this handling though: it is possible that a login session for the user will be started after the snap has started.

If the snap’s AppArmor profile grants read/write access to the /run/user/$uid path (as it would have to under this proposal), then suddenly the snap has access to everything in the user’s real XDG_RUNTIME_DIR.

So I think we need to always create the mount, even if the directory doesn’t already exist. We’d also need to be careful about what happens to our private directory when systemd starts or stops the user-runtime-dir@.service service.

Backward compatibility

Looking at the desktop-launch script created by the various desktop extensions, I see:

It only sets PULSE_SERVER if $XDG_RUNTIME_DIR/../pulse/native exists. With this change, that code path would not run and libpulse would look for the socket in the regular location (which now works).
Similarly, the Wayland support only tries to symlink the socket if $XDG_RUNTIME_DIR/../$WAYLAND_DISPLAY exists. That also becomes a no-op now.
The dconf fixup only runs if $XDG_RUNTIME_DIR/../dconf/user exists, so that’s also a no-op.

These all match the behaviour of the old snapcraft-desktop-helpers scripts, so I suspect the vast majority of snaps will just work with the change.
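
To see why these checks turn into no-ops, consider an approximation of the PulseAudio one. This is a sketch of the check as described above, not a copy of the real desktop-launch script, and the function name is made up.

```shell
# Prints the PULSE_SERVER value the check would export, or nothing at all.
# arg: the value of XDG_RUNTIME_DIR seen by the script
legacy_pulse_server() {
    if [ -e "$1/../pulse/native" ]; then
        echo "unix:$1/../pulse/native"
    fi
}

# Old layout: XDG_RUNTIME_DIR=/run/user/1000/snap.foo, so the check looks at
# /run/user/1000/pulse/native and may succeed.
# Proposed layout: XDG_RUNTIME_DIR=/run/user/1000, so the check looks at
# /run/user/pulse/native, which does not exist, and nothing is exported.
```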

I think this about covers it. How does this proposal sound to everyone else?

alan_g · January 20, 2021, 12:03pm

This sounds sensible. At least in theory.

In addition to the desktop-launch script, there are also snaps that serve the wayland interface, such as mir-kiosk and egmde, and a lot of “kiosk” snaps use other setup logic (e.g. mircade). I’m not totally confident all these will “just work” but if you give me a ping when there’s something to test with I’ll work it out.

lucyllewy · January 21, 2021, 12:26am

I like the idea of simplifying the desktop scripts and making things appear in more expected locations. I’m wary of the idea of making /run/user/$uid read/write in apparmor, however. Could there be mileage in a half-way method of using the user-mount capability to expose the relevant sockets from their true locations (either another snap’s $XDG_RUNTIME_DIR or the system $XDG_RUNTIME_DIR depending on whether the plug is against $OTHER_SNAP_NAME:slot or system:slot) as proposed but making them appear at /run/user/$uid/snap.$SNAP_INSTANCE and still overriding the $XDG_RUNTIME_DIR to point at a snap-specific location?

jamesh · January 21, 2021, 2:45am

Looking at the scripts in the mir-kiosk snap, the proposed change would likely break it:

github.com/canonical/mir-kiosk/blob/65a519b5ef4c815338f5396d8467dcf616b0b2a5/glue/bin/run-miral#L3-L4

    export XDG_RUNTIME_DIR=$(dirname $XDG_RUNTIME_DIR)
    mkdir -p $XDG_RUNTIME_DIR -m 700

Altering XDG_RUNTIME_DIR like this would cause the server to try and create its socket as /run/user/wayland-0, which would fail. With that said, if the mir-kiosk snap was fixed to use the XDG_RUNTIME_DIR as is, any other snaps connecting to its wayland slot would work as is.
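
The effect of the `dirname` call under both layouts can be checked directly. The function below is a hypothetical stand-in for the two quoted lines, with UID 1000 and the socket name wayland-0 assumed for the example.

```shell
# Where would the server put its socket for a given incoming XDG_RUNTIME_DIR?
miral_socket_path() {
    echo "$(dirname "$1")/wayland-0"
}

miral_socket_path /run/user/1000/snap.mir-kiosk   # prints /run/user/1000/wayland-0
miral_socket_path /run/user/1000                  # prints /run/user/wayland-0
```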

Is there a particular reason you’re wary? I see this the same as us making /tmp read/write via AppArmor. In both cases, it depends on the mount namespace being set up correctly, but that is a crucial part of our sandbox strategy anyway.

alan_g · January 21, 2021, 9:23am

The problem with just “fixing” the mir-kiosk scripts is the transition period during which there are versions of snapd with the old treatment of $XDG_RUNTIME_DIR. (There’s a long tail of distros including older versions of snapd.)

But AFAIK, I’m the only person maintaining snaps that provide a wayland socket so writing something that handles both treatments is a manageable problem.

lucyllewy · January 21, 2021, 7:57pm

On thinking over it a bit, and with your statement I’m quoting, I think I’m happy that it’ll be ok.

ogra · January 21, 2021, 8:55pm

what would happen to snaps that define their own XDG_RUNTIME_DIR via an environment stanza in the apps block ? would that still be respected ?

jamesh · January 22, 2021, 1:49am

The suggestion here is only to change (a) how snap-confine/snap-update-ns set up the sandbox mount namespace, and (b) how snap run configures the environment before calling snap-confine.

The environment stanza(s) in snap.yaml are handled by the snap-exec helper that runs within the sandbox, and would override what snap-confine sets. This proposed change would cause problems if a snap sets the variable based on its current value, assuming the old path though. For example:

environment:
  XDG_RUNTIME_DIR: ${XDG_RUNTIME_DIR}/..

I’m not sure how common the above would be, since it would generally be an uncommon path until something creates the private XDG_RUNTIME_DIR.

jamesh · January 22, 2021, 2:39am

Looking further, it seems the daemon mode of your mircade snap would break with this change, so it isn’t just the server side:

github.com/MirServer/mircade/blob/a07ddda495b199482bbc8f49f2c4d6a67afd7b6e/setup-scripts/bin/nested-start#L3-L5

# "Someone else" provides the real Wayland display, we use that as a host
real_wayland=$(dirname "$XDG_RUNTIME_DIR")/${WAYLAND_DISPLAY:-wayland-0}
if [ ! -O "${real_wayland}" ]; then echo "Waiting for Wayland socket";  sleep 4;  fi

Here it would never find the Wayland server because it only checks outside of XDG_RUNTIME_DIR rather than doing so as a fallback.

If the code in question was searching for /run/user/$(id -u)/$WAYLAND_DISPLAY, then things would likely work before and after this proposed change. I wonder how common this kind of thing is? Maybe we need a more thought out transition plan then.
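
One possible shape for a lookup that tolerates both layouts is to try XDG_RUNTIME_DIR itself first and only then fall back to its parent. This is a sketch of that idea, not a patch for mircade; the function name and arguments are hypothetical.

```shell
# Print the first existing candidate for the Wayland socket.
# args: [RUNTIME_DIR] [DISPLAY]   (default to the usual environment variables)
find_wayland_socket() {
    runtime=${1:-${XDG_RUNTIME_DIR:-}}
    display=${2:-${WAYLAND_DISPLAY:-wayland-0}}
    [ -n "$runtime" ] || return 1
    for dir in "$runtime" "$(dirname "$runtime")"; do
        if [ -e "$dir/$display" ]; then
            echo "$dir/$display"
            return 0
        fi
    done
    return 1
}
```

With the old layout the second candidate matches; with the proposed one the first does, so the same script works across the transition.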

ogra · January 22, 2021, 11:12am

i was more thinking of something like:

environment:
  XDG_RUNTIME_DIR: /tmp

which i bet isnt that uncommon …

jamesh · January 22, 2021, 11:34am

That should work the same as it does at the moment: the environment: stanzas are interpreted after the snap run code that currently sets XDG_RUNTIME_DIR. So this should continue to function as before.

I’ll note that setting it to /tmp also violates the Base Directory spec, in that the lifetime of the data in that directory can extend past the end of the user’s login session, and different users will see each others’ runtime files. Is there a particular reason why many snaps would be doing this, or is it just due to the existing spec non-compliant behaviour where we leave XDG_RUNTIME_DIR pointing at a non-existent directory?

ogra · January 22, 2021, 11:40am

It used to be a workaround at some point in time for 16.04 desktops to get tray icons shown (there was a bug in the indicator for a while, i think it got SRU’d eventually though)

jamesh · January 22, 2021, 11:53am

Thinking more about the compatibility problems @alan_g mentioned, I think we could achieve the benefits related to non-implicit slot implementations by simply leaving XDG_RUNTIME_DIR as is, but doing everything else in the proposal.

This means that the snap’s runtime dir would look like /run/user/NNN/snap.foo/snap.foo in the host file system, but it would mean that /run/user/NNN within the sandbox is properly isolated.
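
The doubled path component follows from translating a sandbox path back to the host view. A small sketch of that translation, with a made-up function name and UID 1000 standing in for NNN:

```shell
# Translate a path under /run/user/UID inside the sandbox to the host path
# it would be backed by under this variant.
# args: UID SNAP_INSTANCE SANDBOX_PATH
host_path() {
    prefix=/run/user/$1
    case $3 in
        "$prefix"|"$prefix"/*) echo "$prefix/snap.$2${3#"$prefix"}" ;;
        *) return 1 ;;   # not under the user's runtime directory
    esac
}

host_path 1000 foo /run/user/1000/snap.foo   # prints /run/user/1000/snap.foo/snap.foo
host_path 1000 foo /run/user/1000/bus        # prints /run/user/1000/snap.foo/bus
```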

I think we still want to move over to using the standard runtime directory path, but perhaps that needs to happen as an opt-in feature. My first thought was to use the assumes: stanza, but that is currently only used to decide whether to allow installation of a snap rather than alter the behaviour of a snap.

Saviq · May 30, 2022, 8:06am

I’m going to revive this topic as we just got a bug report that stems from at least some of this, and that also highlights a real problem users are facing now:

While SSH as root is obviously not recommended, doing so should not break things. Today, because systemd has no idea that snap services may use /run/user/0 (or whatever UID the daemon runs under), that location gets purged when the last session for that user is closed.

This is a result of:

May 30 08:02:14 nuc systemd[1]: user@0.service: Deactivated successfully.
May 30 08:02:14 nuc systemd[1]: Stopped User Manager for UID 0.
May 30 08:02:14 nuc systemd[1]: Stopping User Runtime Directory /run/user/0...
May 30 08:02:14 nuc systemd[1]: user-runtime-dir@0.service: Deactivated successfully.
May 30 08:02:14 nuc systemd[1]: Stopped User Runtime Directory /run/user/0.
May 30 08:02:14 nuc systemd[1]: Removed slice User Slice of UID 0.

Which suggests that to use /run/user/$UID we really need to tell systemd we’re a user session.

jamesh · May 30, 2022, 8:59am

I think what you are describing is somewhat different. From the XDG Base Directory specification:

$XDG_RUNTIME_DIR defines the base directory relative to which user-specific non-essential runtime files and other file objects (such as sockets, named pipes, …) should be stored. The directory MUST be owned by the user, and he MUST be the only one having read and write access to it. Its Unix access mode MUST be 0700.

The lifetime of the directory MUST be bound to the user being logged in. It MUST be created when the user first logs in and if the user fully logs out the directory MUST be removed. If the user logs in more than once he should get pointed to the same directory, and it is mandatory that the directory continues to exist from his first login to his last logout on the system, and not removed in between. Files in the directory MUST not survive reboot or a full logout/login cycle.

So it is expected behaviour for /run/user/0 to be cleared when the last login session for root closes, as happens when the ssh session exits. If you are using the directory outside of a login session then all bets are off.

I suspect what you’re really after is “a way for an Ubuntu Core device to start a user login session on boot”. Together with the currently experimental user daemons feature, a snap would then be able to run code within that session (such as a display server and kiosk web browser). You’d get a predictably managed $XDG_RUNTIME_DIR and session bus as a bonus, rather than having to work around these missing session features.

Saviq · May 30, 2022, 9:17am

Oh sure. But it is snapd that sets XDG_RUNTIME_DIR to that location…

That is certainly a future I’m looking forward to.

jamesh · May 30, 2022, 9:45am

It really shouldn’t be doing that unconditionally: if the variable is not set, then it means the process is outside of a login session and consequently not bound to the lifetime of the directory.

I can also see that we’ve been doing this since November 2016 though:

github.com/canonical/snapd

dirs,interfaces,overlord,snap,snapenv,test: export per-snap XDG_RUNTIME_DIR per user (LP: #1620442)

canonical:master ← jdstrand:per-snap-xdg-runtime-dir

opened 11:09PM - 15 Nov 16 UTC

jdstrand

+64 -0

- interfaces: allow access to snap-specific XDG_RUNTIME_DIR
- snapenv: export X…DG_RUNTIME_DIR as part of snapenv
- dirs,overlord,snap: cleanup XdgRuntimeDir for all users on removal

In support of Ubuntu Personal and xdg-aware snaps, this PR exports XDG_RUNTIME_DIR as a snap and user-specific directory. AppArmor rules are added to allow creating and using the directory.

Note: the location of this directory follows the current convention of /run/user/$UID except that it creates a snap-specific subdirectory under it which follows the typical snappy file naming conventions. Eg, for a snap named 'foo', XDG_RUNTIME_DIR=/run/user/$UID/snap.foo. $UID is set to os.Geteuid() to ensure the proper uid is used when invoked with sudo.

A separate PR for snap-confine will create this directory on behalf of the user. These PR can land independently of each other.

So changing it to only update the variable if it is already set would have compatibility problems of its own (your kiosk snaps, as a prime example). That said, the bug report you mention is evidence that the current behaviour doesn’t guarantee reliable behaviour.

Saviq · May 30, 2022, 6:45pm

All of this would be m00t if Wayland was sensible and actually supported socket locations outside of $XDG_RUNTIME_DIR…

jamesh · May 31, 2022, 8:42am

If you want to get this working without a user session, one option would be something like this:

In the display server snap, set XDG_RUNTIME_DIR=$SNAP_COMMON/run for your system daemon, so it creates the socket file as /var/snap/$snap_name/common/run/wayland-0.
Add a content interface slot to the display server snap that shares $SNAP_COMMON/run.
In your kiosk app snap, add a matching content interface plug that causes that directory to be mounted to its $SNAP_COMMON/run directory.
In the system daemon for the kiosk app, set XDG_RUNTIME_DIR=$SNAP_COMMON/run.
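
On the daemon side, the first and last steps could be handled by a small wrapper around the daemon command. This is a sketch only: the function name is hypothetical, and the content interface slot and plug from the middle two steps still have to be declared in each snap's metadata.

```shell
# Point XDG_RUNTIME_DIR at the shared directory under $SNAP_COMMON and make
# sure it exists before the daemon starts.
use_shared_runtime_dir() {
    XDG_RUNTIME_DIR=$SNAP_COMMON/run
    export XDG_RUNTIME_DIR
    # Harmless if the directory already exists, e.g. as a content mount target.
    mkdir -p -m 700 "$XDG_RUNTIME_DIR"
}

# Example wrapper body (not run here):
# use_shared_runtime_dir && exec "$@"
```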

That should isolate the system daemons from anything happening in /run/user. With that said I suspect you’re likely to run into more problems like this as you adapt desktop technologies to run on IoT devices, and it will eventually be easier to spin up a real user session.

I’m not even suggesting you necessarily switch everything to a non-privileged user: this could be a user session for root. If you have a user session, you’re more likely to be able to take advantage of any improvements we make for running desktop snaps on Classic distros or Ubuntu Core Desktop (when we get that to a point where people can use it).

alan_g · May 31, 2022, 9:48am

I agree that a “real user session” is what is needed. The question I have is: how is that best achieved?