Wayland, dconf and XDG_RUNTIME_DIR

It was previously agreed that snappy should set XDG_RUNTIME_DIR to /run/user/`id -u`/snap.$SNAP_NAME and create the directory[1][2]. Today, snappy sets XDG_RUNTIME_DIR but does not yet create the directory (the directory seems to be created by gtk3 apps at least).

Wayland puts its socket into XDG_RUNTIME_DIR, but the server uses XDG_RUNTIME_DIR=/run/user/`id -u`, so the socket is at /run/user/`id -u`/wayland-0. Clients start and are unable to connect because /run/user/`id -u`/snap.$SNAP_NAME/wayland-0 doesn’t exist, so they fall back to mir and X.

In retrospect, upstream GNOME feels that the current situation is not correct and that snapd should “use mount namespaces to set up a clean copy of the xdg_runtime_dir at the same path, with the subdir bindmounted through”. Eg:

  • create /run/user/`id -u` as a clean new tmpfs inside the snap and create a snap specific subdirectory (ie, snap.$SNAP_NAME)
  • share /run/user/`id -u`/snap.$SNAP_NAME, such that it is at the same location inside and outside the snap
  • bind mount /run/user/`id -u`/wayland-0 from the outside into the snap

This is possible, but it means we have to allocate RAM for each tmpfs, and the launcher needs to find wayland-0. That adds some complexity when the socket doesn’t exist yet, when WAYLAND_DISPLAY[4] is set to something other than ‘wayland-0’, or when there is more than one ‘wayland-N’ in the dir. The upstream gtk hackfest discussed the possibility of wayland adding per-client sockets, which is something that is also being looked at with dbus.
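To make the lookup problem concrete, here is a minimal sketch of how a launcher might work out which socket to bind mount. The helper name resolve_wayland_socket is made up for illustration and is not part of snapd. It assumes the usual client behaviour: WAYLAND_DISPLAY defaults to wayland-0 and is taken relative to the runtime dir, while an absolute path (accepted by newer libwayland releases) is used as-is.

```shell
# Hypothetical helper, not part of snapd or libwayland.
# $1: the real runtime dir (e.g. /run/user/1000)
# $2: the value of WAYLAND_DISPLAY (may be empty)
resolve_wayland_socket() {
    runtime_dir=$1
    display=${2:-wayland-0}             # empty or unset means wayland-0
    case $display in
        /*) printf '%s\n' "$display" ;;                # absolute path: use as-is
        *)  printf '%s\n' "$runtime_dir/$display" ;;   # relative to the runtime dir
    esac
}

# Print the socket path for the current session.
resolve_wayland_socket "/run/user/$(id -u)" "${WAYLAND_DISPLAY:-}"
```

This only computes a path. It does not cover the harder cases mentioned above, where the socket does not exist yet or several ‘wayland-N’ sockets are present.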

While the details are being worked out, the gtk3 desktop part will do something along these lines:
cd $XDG_RUNTIME_DIR && test -S $XDG_RUNTIME_DIR/…/wayland-0 && ln -s $XDG_RUNTIME_DIR/…/wayland-0 .

  1. https://bugs.launchpad.net/snap-confine/+bug/1620442/comments/2
  2. https://bugs.launchpad.net/snap-confine/+bug/1620442/comments/3
  3. https://github.com/flatpak/flatpak/blob/master/common/flatpak-run.c#L1988
  4. https://wayland.freedesktop.org/docs/html/ch04.html

If looking into this:

  1. sudo snap install ghex-udt
  2. update /var/lib/snapd/apparmor/profiles/snap.ghex-udt.ghex to comment out the ‘#include <abstractions/X>’ line (not strictly required, but it ensures X cannot be used as a fallback), then reload the profile with sudo apparmor_parser -r /var/lib/snapd/apparmor/profiles/snap.ghex-udt.ghex
  3. snap run --shell ghex-udt.ghex
  4. cd $XDG_RUNTIME_DIR && test -S $XDG_RUNTIME_DIR/../wayland-0 && ln -s $XDG_RUNTIME_DIR/../wayland-0
  5. $SNAP/command-ghex.wrapper
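The symlink in step 4 can also be wrapped in a small function that creates the snap-specific directory first, since snapd does not create it. This is only a sketch: link_wayland_socket is a hypothetical name, and it assumes the snap-specific runtime dir sits directly below the real one, as described above.

```shell
# Hypothetical helper: link the compositor socket into the snap-specific
# runtime dir. Returns non-zero if there is no socket to link.
# $1: snap-specific runtime dir (e.g. /run/user/1000/snap.ghex-udt)
# $2: socket name, defaults to wayland-0
link_wayland_socket() {
    snap_runtime_dir=$1
    name=${2:-wayland-0}
    # The directory must exist before "../" can be resolved through it.
    mkdir -p "$snap_runtime_dir" || return 1
    # Only link when the parent dir really holds a socket of that name.
    [ -S "$snap_runtime_dir/../$name" ] || return 1
    # Relative target, so the link stays valid inside and outside the snap.
    ln -sf "../$name" "$snap_runtime_dir/$name"
}
```

Inside snap run --shell it would be called as link_wayland_socket "$XDG_RUNTIME_DIR" before starting the wrapper.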

Is there a reason for not bind-mounting the entire /run/user/$(id -u) into the snap? Excluding the snap-specific subdir, that directory is already protected by apparmor, right?

FYI, I looked into this a bit more and with the symlink trick we can make the client find the wayland server (and also the systemd user session bus at /run/user/<uid>/bus), but part of the protocol is for the client to create another socket for wayland to use. Because XDG_RUNTIME_DIR is set to a snap-specific directory, that socket is put in /run/user/<uid>/snap.$SNAP_NAME when it should be alongside the wayland server socket. The filename for this socket is unpredictable (eg, weston-shared-XXXXXX) so the symlink trick won’t work.

In short, there seem to be 3 things that need to be in the same XDG_RUNTIME_DIR:

  • the wayland server socket (eg, /run/user/1000/wayland-0)
  • the systemd user session bus (ie, /run/user/1000/bus)
  • the client socket for wayland (eg, /run/user/1000/weston-shared-XXXXXX)

We run into problems if some are in the normal XDG_RUNTIME_DIR and some are in the snap-specific XDG_RUNTIME_DIR/snap.$SNAP_NAME.

@niemeyer - I’d like to better understand your comment. If we bind mount the entire /run/user/$(id -u) into the snap and rely on apparmor for the access control, that isn’t really different (from a security perspective) from just using apparmor with no bind mounts and not setting XDG_RUNTIME_DIR to something snap specific. There are limitations to this approach: we want snap-specific dconf databases and we’d really like the client sockets (weston-shared-XXXXXX) to be isolated (the apparmor rule atm must be weston-shared-*).

The approach outlined by upstream dconf works for these cases, but it also means we’ll want to bind mount a bunch of other stuff (eg, pulse?, perhaps gnome-shell?, etc). What else needs to be bind mounted needs investigation (perhaps the Desktop team can help?).

I should mention that we could probably get away with the apparmor glob because the weston-shared-XXXXXX sockets are not persistent (they are opened then unlinked).

With that in mind, the outlier then is per-snap dconf. If dconf could be made to use a snap-specific location in some manner, then we could side step much of this. This is somewhat discussed here:

Perhaps we simply treat dconf special:

  • XDG_RUNTIME_DIR is set to /run/user/$(id -u) (ie, unchanged by snapd)
  • security policy mediates accesses to XDG_RUNTIME_DIR like always
  • a per-snap dconf directory is created and bind mounted onto XDG_RUNTIME_DIR/dconf (ie, mimic what flatpak is doing)

The dir that we bind mount for dconf could maybe even be in the snap-specific temporary directory. We already have code for that, so simply mkdir in there then add a bind mount call to mount that on XDG_RUNTIME_DIR/dconf. In this manner, commands within the same snap can share this dir.
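As a rough illustration of the mkdir-then-bind-mount idea, the sketch below prints the commands rather than running them, because the real mount needs root and has to happen inside the snap’s mount namespace. The names run and setup_snap_dconf are hypothetical, and the /tmp/snap.example path is a placeholder, not the location snapd actually uses.

```shell
# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper.
# $1: snap-specific temporary directory (placeholder path, an assumption)
# $2: the unchanged XDG_RUNTIME_DIR (e.g. /run/user/1000)
setup_snap_dconf() {
    private_tmp=$1
    runtime_dir=$2
    # Create the per-snap dconf dir and the mount point.
    run mkdir -p "$private_tmp/dconf" "$runtime_dir/dconf"
    # Hide the shared dconf dir behind the per-snap one.
    run mount --bind "$private_tmp/dconf" "$runtime_dir/dconf"
}

setup_snap_dconf /tmp/snap.example "/run/user/$(id -u)"
```

Because the source directory is per snap rather than per command, every command in the same snap would end up with the same dconf directory.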

Note that this would be an isolation gap because while wayland couldn’t be attacked, coordinating snaps could share data via paths that matched this glob. We’d have to decide if that was acceptable in the short term (and message this in the interface/docs). Considering that upstream GNOME/wayland is considering changing how they do things, perhaps it is fine to have this transitional (but safe from attack) policy until it can be better mediated.

(Note: we can make the transitional policy rule sufficiently unpleasant to use over the content interface so there is no practical benefit for coordinating snaps to abuse the weston-shared-* rule over just using the content interface).

I don’t think I understand the problem. It looks like we already have /run/user/<uid> available inside the snap, and we already allow snap.<name> to be used inside that place. If it’s just a matter of allowing wayland to see its own socket inside it as part of its interface, why aren’t we doing that?

The problem is wayland hardcodes /run/user/<uid>/ as the dir. snapd sets XDG_RUNTIME_DIR to /run/user/<uid>/snap.<name>/. Libraries honoring XDG_RUNTIME_DIR therefore look in /run/user/<uid>/snap.<name>/ for wayland stuff when it is not there. We need to rethink setting XDG_RUNTIME_DIR-- we were only ever setting it for dconf/gsettings because they requested it as part of the LSM (eg, apparmor)/gsettings fine-grained mediation work, but they have since decided they want fine-grained mediation handled differently, so we should work with them on that and revert snapd’s setting of XDG_RUNTIME_DIR.

This revert and the wayland interface is planned and prioritized, but behind commercial engineering work atm. It will be picked up soon.

FYI, the aforementioned commercial engineering work is completed and I’m picking this up again now. For now, I’m going to set XDG_RUNTIME_DIR to /run/user/uid (like it was before) so wayland can work (dconf will continue to work via the gsettings interface). We’ll continue to allow snaps to use /run/user/uid/snap.name. I’ll propose a PR and we can get into details there.

I took another look at this and am no longer seeing an issue with the wayland client socket if the wayland socket is a symlink from /run/user/<uid>/snap.foo/wayland-0 to /run/user/<uid>/wayland-0 or if /run/user/<uid>/wayland-0 is bind mounted into /run/user/<uid>/snap.foo/wayland-0. I have seen that 16.04 and 17.04 got refreshes of the GNOME stack so perhaps there were bug fixes which is why I’m no longer seeing the issues from before.

For the moment, I am going to create a very simple wayland interface that simply allows access to /run/user/<uid>/wayland-[0-9]*. I’ll also send up something so desktop-launch can set the necessary variables (though, the desktop team might want to do something different). I have tested this with:

  • an updated gnome-sudoku with gnome-3-24 platform snap that uses a symlink from $XDG_RUNTIME_DIR/wayland-0 to $XDG_RUNTIME_DIR/../wayland-0 on 17.10
  • an updated gnome-sudoku with gnome-3-24 platform snap that uses a symlink from $XDG_RUNTIME_DIR/wayland-0 to $XDG_RUNTIME_DIR/../wayland-0 on 17.04
  • an updated gnome-logs-udt snap that uses a symlink from $XDG_RUNTIME_DIR/wayland-0 to $XDG_RUNTIME_DIR/../wayland-0 on 16.04

In all cases, if the wayland socket was detected, I’d set:

export WAYLAND_DEBUG=1
export GDK_BACKEND="wayland"
export CLUTTER_BACKEND="wayland"

to force the use of wayland and would observe that wayland was used with no wayland server or client socket denials.
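The “if the wayland socket was detected” condition could be written roughly as follows. This is a sketch of the idea only, not what desktop-launch actually does, and use_wayland_if_available is a made-up name. WAYLAND_DEBUG is left out since it is only useful while testing.

```shell
# Hypothetical helper: force the wayland backends only when the given
# path is a socket; otherwise leave the toolkit defaults alone.
use_wayland_if_available() {
    socket=$1
    [ -S "$socket" ] || return 0    # no socket: change nothing
    export GDK_BACKEND=wayland
    export CLUTTER_BACKEND=wayland
}

# Inside the snap, XDG_RUNTIME_DIR points at the snap-specific dir, so this
# checks the symlink (test -S follows symlinks to the real socket).
use_wayland_if_available "${XDG_RUNTIME_DIR:-/run/user/$(id -u)}/${WAYLAND_DISPLAY:-wayland-0}"
```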

@jamesh, for now, I am going to pursue this simple interface to unblock people from playing with wayland since the interface accesses will be compatible with anything we do in the future with this interface. In the future I think we may want to consider moving wayland, dconf, pulseaudio and /run/user/<uid>/bus (the session bus) into a per-user mount namespace on top of the snap namespace like you are looking at for the portals work since that is similar to what upstream portals, wayland and dconf have been designed for. By doing that I suspect we’ll have fewer issues and can more closely follow upstream changes.

Okay. I think I’ve worked out a sequence of syscalls that should let me get the per-user mount namespaces working in a way that keeps @zyga-snapd happy (it essentially boils down to unshare mount ns, recursively set mounts to slave mode, add user mounts). It should be fairly easy to extend that to bind mount the wayland socket if that’s what we decide is best.
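For anyone who wants to try that sequence by hand, a shell approximation using unshare(1) and mount(8) from util-linux might look like the sketch below. It is not the snapd implementation. with_user_mounts is a hypothetical name, the snap.example path is a placeholder, and the function prints the command unless DRY_RUN=0 is set because the real thing needs root.

```shell
# Hypothetical helper approximating: unshare mount ns, recursively set
# mounts to slave mode, add user mounts.
# $1: path outside (e.g. /run/user/1000/wayland-0)
# $2: path the snap should see it at
with_user_mounts() {
    src=$1
    dst=$2
    # Runs inside the new namespace: mark every mount as slave so host
    # changes still propagate in, then add the per-user bind mount.
    script='mount --make-rslave / && mount --bind "$1" "$2"'
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "unshare --mount --propagation unchanged sh -c '$script' sh $src $dst"
    else
        # New mount namespace; propagation is changed by the script itself.
        unshare --mount --propagation unchanged sh -c "$script" sh "$src" "$dst"
    fi
}

with_user_mounts "/run/user/$(id -u)/wayland-0" "/run/user/$(id -u)/snap.example/wayland-0"
```

With DRY_RUN=0 the namespace disappears as soon as the inner shell exits, so a real launcher would go on to exec the application inside it.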

I wonder if the right solution in the long run will be to leave $XDG_RUNTIME_DIR with its usual /run/user/NNNN value, but set up a private version at that location by doing the equivalent of the following in the process’s private mount namespace:

mkdir /run/user/NNNN/snap.$snap_name
mount --bind /run/user/NNNN/bus /run/user/NNNN/snap.$snap_name/bus
mount --bind /run/user/NNNN/wayland-0 /run/user/NNNN/snap.$snap_name/wayland-0
# and the same for other special sockets (e.g. pulse audio)
mount --rbind /run/user/NNNN/snap.$snap_name /run/user/NNNN

This would let the app create arbitrary files under $XDG_RUNTIME_DIR, while not interfering with other apps. These steps would also ensure that two instances of the app would see the same content.

That is precisely what I was thinking (and what I tested over the weekend as working) and it aligns with how upstreams (wayland, dconf, portals, etc) are treating the directory and why I think we might save ourselves some maintenance burden by going there. Interface connections end up manipulating the user part of the fstab file-- eg, if you connect wayland you get wayland socket bind mounted, if not, you don’t.

I’m seeing XDG_RUNTIME_DIR being set as /run/user/NNNN/snap.$snap_name, but the directory does not exist. Known issue?

Yes, your snap needs to create it itself or you can use the desktop part, which will do it for you as well as setup compatibility symlinks. See https://github.com/ubuntu/snapcraft-desktop-helpers/blob/master/common/desktop-exports#L95
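If you are not using the desktop part, a minimal sketch of creating the directory yourself in a wrapper script could look like this. ensure_runtime_dir is a made-up name, and mode 700 follows the XDG Base Directory requirement that the runtime dir be accessible only to its owner.

```shell
# Hypothetical helper: create the runtime dir if it is missing.
# Fails if the path is empty, succeeds if the dir exists or can be created.
ensure_runtime_dir() {
    dir=$1
    [ -n "$dir" ] || return 1
    [ -d "$dir" ] || mkdir -p -m 700 "$dir"
}
```

A wrapper would call ensure_runtime_dir "$XDG_RUNTIME_DIR" before starting the application.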

Once session user mounts are supported, we can hopefully set XDG_RUNTIME_DIR to /run/user/uid as mentioned in Wayland, dconf and XDG_RUNTIME_DIR and then not have to worry about the mkdir or symlinks.

We just had a call to sync up on the progress of this work. Here are the details:

Just checking in on this old bug. The previous comment describes the intention to fix, but does not say if the eventual fix landed.

tl;dr Do applications distributed by Snap still need to assume that $XDG_RUNTIME_DIR does not exist and create the directory if needed? i.e. can I safely remove this Snap-specific workaround: https://github.com/twpayne/chezmoi/blob/v2.23.0/pkg/cmd/config.go#L1795-L1804

How many years are left for this?