Can interfaces define per-user bind mounts?

jamesh · May 8, 2017, 9:27am

For an interface I’m writing, I need to create a bind mount under $XDG_RUNTIME_DIR. At present, I don’t see a way to implement this, since the Name and Dir members of the mount.Entry seem to be treated as plain strings with no chance to substitute any per-user data.

For my particular use case, I have the following constraints:

The source and destination of the bind mount are in user-specific directories under $XDG_RUNTIME_DIR.
The source directory will not persist past the end of the user’s session(s), so the bind mounts shouldn’t either.

While this could still be handled by a single mount namespace shared by instances of the app run by different users, it would probably be non-trivial to manage: when the second user runs the program it’d need to add a new bind mount to the existing namespace, and when users exit the program the bind mount would need to be removed.
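
To make the request concrete, here is a minimal sketch of what per-user substitution in a mount entry could look like. It is written in Python purely for illustration and is not snapd code: `MountEntry` and `expand_for_user` are hypothetical names, the `source-dir` and `target-dir` paths are placeholders, and the /run/user/<uid> layout for the runtime directory is an assumption.

```python
from dataclasses import dataclass, replace
from string import Template


@dataclass(frozen=True)
class MountEntry:
    """Hypothetical stand-in for a mount entry: source (name) and target (dir)."""
    name: str
    dir: str
    options: tuple = ("bind",)


def expand_for_user(entry: MountEntry, uid: int) -> MountEntry:
    """Substitute $XDG_RUNTIME_DIR in both paths for the given uid.

    Assumes the conventional /run/user/<uid> layout.
    """
    variables = {"XDG_RUNTIME_DIR": f"/run/user/{uid}"}
    return replace(
        entry,
        name=Template(entry.name).substitute(variables),
        dir=Template(entry.dir).substitute(variables),
    )


# Placeholder paths; one template yields a distinct entry per user.
template = MountEntry(
    name="$XDG_RUNTIME_DIR/source-dir",
    dir="$XDG_RUNTIME_DIR/target-dir",
)
for uid in (1000, 1001):
    entry = expand_for_user(template, uid)
    print(entry.name, "->", entry.dir)
```

The point of the sketch is only that a single interface-level template would need to be resolved once per user, which plain string fields cannot express.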

There also seem to be some other roadblocks here. Since I couldn’t rely on any variable expansion, I tried hard-coding the path /run/user/1000, which should work for the first user on the system. When I tried to run an app using the interface, I got the following error:

$ snap run file-roller
cannot perform operation: mount --bind -o ro,nosuid,nodev /run/user/1000/doc/by-app/snap.pkg.file-roller /run/user/1000/doc: Permission denied

This source path definitely exists outside the sandbox, so I’m not sure why there is a permission denied error.

zyga-snapd · May 8, 2017, 9:50am

At present interfaces cannot define per-user mounts.

zyga-snapd · May 8, 2017, 9:51am

Because snap-confine is not allowed to perform such bind mounts. snap-confine is running under special confinement itself so that it cannot be easily used as an attack vector (since it is setuid-root).

Can you tell me more about why you need the bind mount?

jamesh · May 8, 2017, 10:01am

I want to make the “documents portal” from xdg-desktop-portal available to a confined application.

The documents portal is a FUSE file system that is usually mounted at $XDG_RUNTIME_DIR/doc. It exposes a sub-directory by-app/$application_id that presents a filtered view of only the registered documents that the application has been granted access to.

The idea is that the by-app directory is bind mounted to $XDG_RUNTIME_DIR/doc inside the sandbox so that the same document paths are valid inside and outside of confinement, provided the application has access to a document.
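
The path mapping described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the document ID `abc123` and the `<doc-id>/<filename>` layout below the mount point are assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple


def portal_bind_mount(runtime_dir: str, app_id: str) -> Tuple[str, str]:
    """Return (source, target) of the bind mount for one application."""
    doc_root = PurePosixPath(runtime_dir) / "doc"
    return str(doc_root / "by-app" / app_id), str(doc_root)


def sandbox_view(host_path: str, runtime_dir: str, app_id: str) -> Optional[str]:
    """Map a host path under by-app/<app_id> to the path seen in the sandbox.

    Returns None when the path lies outside the application's filtered view.
    """
    source, target = portal_bind_mount(runtime_dir, app_id)
    try:
        relative = PurePosixPath(host_path).relative_to(source)
    except ValueError:
        return None
    return str(PurePosixPath(target) / relative)


runtime_dir = "/run/user/1000"
app_id = "snap.pkg.file-roller"
print(portal_bind_mount(runtime_dir, app_id))
print(sandbox_view(
    "/run/user/1000/doc/by-app/snap.pkg.file-roller/abc123/report.pdf",
    runtime_dir,
    app_id,
))
```

Under these assumptions the sandboxed path comes out as /run/user/1000/doc/abc123/report.pdf, while a path under another application's by-app directory maps to nothing.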

This is used as a building block for things like a trusted file picker: the application asks to open a file, and the out-of-process picker grants the application access to the file the user chooses and returns a document portal path.

zyga-snapd · May 8, 2017, 10:03am

But since /run is the same as it is on the host distribution this should already work OK. The only problem, I suspect, is the value of $XDG_RUNTIME_DIR inside the snap.

As long as the portal can do its own bind mounts and other things, things will just work (we may also need an interface to allow well-known portal locations).

EDIT: There’s another thread on the forum that explores portals, you may want to see that.

jamesh · May 8, 2017, 10:11am

For the document portal’s security to work, I really do need the bind mount. Exposing the root of the document portal file system to the sandboxed application would give it full access to every registered document. The bind mount is there to enforce the permissions.

zyga-snapd · May 8, 2017, 10:41am

There is some ongoing work to implement a per-session snapd that could perhaps just implement the portal logic.

morphis · May 8, 2017, 11:48am

That is what I would prefer too. See https://github.com/snapcore/snapd/pull/3260 and the thread “Integrate snapd-xdg-open into snapd repository” for a few more details.

jamesh · May 8, 2017, 12:03pm

While re-implementing the URL opener portion of the portal inside snapd is not too difficult, I don’t think that is a viable option for things like the document portal and file picker. Getting the security control and other semantics correct is likely to be a lot more difficult than you might first think.

Also, a partial reimplementation of the portal is probably going to cause more problems than it solves. If you tell GTK to use the portal (either by patching GTK, or just setting the GTK_USE_PORTAL environment variable), it won’t just change how it tries to open URLs: it will also change the file choosers, network monitoring code, printing support, etc.

I think the most sensible way forward is to teach xdg-desktop-portal how to tell when it is talking to snap confined apps and just let it do its thing.

morphis · May 8, 2017, 12:18pm

I’ll leave that for the relevant people to comment on, but I think the crucial point here will be to marry those two different security systems.

zyga-snapd · May 8, 2017, 12:24pm

I’m pretty sure we want user session snapd to assist in this. GTK itself should just do what it does internally, but the provider of that thing, which must be a stable protocol, might be either snapd or the adapted portal code. I’m sure we can figure out a way to make it work nicely and reliably over time.

jdstrand · May 8, 2017, 1:17pm

Just as a quick note, currently XDG_RUNTIME_DIR is set to /run/user/<uid>/snap.$SNAP_NAME. This is going to have to change to accommodate Wayland (see the thread “Wayland, dconf and XDG_RUNTIME_DIR”).
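
A small sketch of why this matters for the portal: with the value described above, $XDG_RUNTIME_DIR/doc resolves to different locations inside and outside the snap. The helper names are hypothetical, and uid 1000 and the snap name `file-roller` are just example values.

```python
def host_runtime_dir(uid: int) -> str:
    """Conventional per-user runtime directory on the host (assumed layout)."""
    return f"/run/user/{uid}"


def snap_runtime_dir(uid: int, snap_name: str) -> str:
    """Value of XDG_RUNTIME_DIR inside the snap, as described above."""
    return f"{host_runtime_dir(uid)}/snap.{snap_name}"


uid, snap_name = 1000, "file-roller"
# The same "$XDG_RUNTIME_DIR/doc" expression yields two different paths.
print(f"{host_runtime_dir(uid)}/doc")
print(f"{snap_runtime_dir(uid, snap_name)}/doc")
```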

jamesh · May 11, 2017, 10:50am

I had a shot at updating the snap-confine AppArmor policy, adding the following rules:

# support for xdg-desktop-portal documents portal
/run/user/[0-9]*/** r,
mount options=(rw bind) /run/user/[0-9]*/doc/by-app/* -> /run/user/[0-9]*/doc,

But I’m still having trouble getting this to work. The weird thing is that I’m not seeing any denial error in syslog. Looking at the strace output, I just get a permission denied error from the syscall:

mount("/run/user/1000/doc/by-app/snap.pkg.file-roller", "/run/user/1000/doc", NULL, MS_NOSUID|MS_NODEV|MS_BIND, NULL) = -1 EACCES (Permission denied)

Is there anything obvious that I’m doing wrong here?

zyga-snapd · May 11, 2017, 10:52am

Are you really not seeing any apparmor denials? Other than that I don’t know what could be blocking this.

jamesh · May 11, 2017, 10:57am

I’ll give it a try on a second machine just to make sure there isn’t something weird in my build environment.

jamesh · May 11, 2017, 11:39am

So I’m seeing the same behaviour on the second system. They’re both running Zesty, if that makes a difference. Putting the policy into complain mode doesn’t make a difference either.

I wonder if the EACCES failure is being triggered before the AppArmor checks get a chance to look at the syscall?

jdstrand · May 11, 2017, 1:53pm

FYI, DAC is checked before LSM, so that is certainly possible. The fact that it doesn’t work in complain mode is a strong indicator that a DAC check (uid/gid, capability, traditional permissions, ACLs, etc.) is the issue.
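
The reasoning can be illustrated with a toy model of the check ordering. This is not kernel code: `ToyKernel` and its log format are invented for the example, and the only behaviour it encodes is the ordering described above, where a DAC failure returns before the LSM is consulted and therefore leaves no LSM log entry even in complain mode.

```python
import errno
from dataclasses import dataclass, field


@dataclass
class ToyKernel:
    """Toy model of DAC-then-LSM ordering; not real kernel logic."""
    lsm_complain_mode: bool = False
    audit_log: list = field(default_factory=list)

    def check(self, operation: str, dac_allows: bool, lsm_allows: bool) -> int:
        # DAC runs first. On failure we return immediately, so the LSM
        # never sees the request and logs nothing.
        if not dac_allows:
            return -errno.EACCES
        if not lsm_allows:
            if self.lsm_complain_mode:
                # Complain mode: log the violation but let it through.
                self.audit_log.append(("ALLOWED", operation))
                return 0
            self.audit_log.append(("DENIED", operation))
            return -errno.EACCES
        return 0


kernel = ToyKernel(lsm_complain_mode=True)
result = kernel.check("mount", dac_allows=False, lsm_allows=False)
print(result == -errno.EACCES, kernel.audit_log)
```

In the model, complain mode only changes the outcome for requests that get past DAC, which matches the symptom of an EACCES with nothing in the log.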

jamesh · May 12, 2017, 7:01am

Okay, I’ve tracked down what is going on here: the path I’m trying to bind mount is within a FUSE file system. By default, FUSE restricts access to a file system to the user that mounted it, unless either the allow_other or the allow_root option is used. Neither of those options is specified by xdg-document-portal, so I get the permission error when snap-confine tries to set up the mount namespace as root.
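
As a rough sketch of the access rule just described (a toy model, not FUSE code): the function name is hypothetical, and the model deliberately ignores where each option is actually enforced.

```python
from typing import Optional


def fuse_access_allowed(caller_uid: int, mount_owner_uid: int,
                        option: Optional[str] = None) -> bool:
    """Toy model of who may access a FUSE file system.

    option is None (the default), "allow_other" or "allow_root".
    """
    if caller_uid == mount_owner_uid:
        return True  # the user who mounted it always has access
    if option == "allow_other":
        return True  # any user may access
    if option == "allow_root":
        return caller_uid == 0  # only root in addition to the owner
    return False  # default: everyone else is refused, root included


# The portal is mounted by uid 1000 without either option, so root is refused.
print(fuse_access_allowed(caller_uid=0, mount_owner_uid=1000))
print(fuse_access_allowed(caller_uid=1000, mount_owner_uid=1000))
```

With no option set, the call made as root is refused while the same call made as the mounting user is allowed.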

This is working with Flatpak, so I guess I’ll check what they’re doing differently.

jamesh · May 12, 2017, 10:22am

Okay, so I think I know what is going on now: bubblewrap (flatpak’s equivalent to snap-confine) switches back to the user’s uid while preserving some root capabilities:

https://github.com/projectatomic/bubblewrap/blob/master/bubblewrap.c#L580-L600

So this allows them to call mount() as a normal user, which in turn allows that mount call to reference paths within the FUSE file system. For added security, it implements privilege separation by forking and dropping the capabilities in one of the processes.

This is looking a lot more complicated than a simple afternoon hack

jdstrand · May 12, 2017, 12:01pm

I’ve not thought it all through, but allow_root may be acceptable here. For snaps, we could have ‘owner’ match for these paths if we wanted (we already have owner /run/user/[0-9]*/snap.@{SNAP_NAME}/...). For non-snaps, not having ‘allow_root’ as a security mechanism on the surface seems specious since if you are root you can replace libfuse and make it not care about allow_root. From the link you gave:

libfuse-specific mount options:
       These following options are not actually passed to the kernel but
       interpreted by libfuse. They can be specified for all filesystems
       that use libfuse:

       allow_root
              This option is similar to allow_other but file access is
              limited to the filesystem owner and root.  This option and
              allow_other are mutually exclusive.

(note, ‘allow_root’ is not passed to the kernel and is only enforced by libfuse, therefore it can easily be subverted).