Injecting snapd tools into base snaps and keeping them up-to-date

Problem statement

snapd requires some tools to be available inside the snap mount namespace. Those tools come from snapd itself. In the days of the core snap, they were always a part of the base filesystem, along with the rest of snapd. Since core18 and other base snaps were introduced, the situation has become somewhat more complicated. When a program starts, the mount namespace contains a view of snapd tools from one of: the base OS package (classic package), the core snap (snap package) or the snapd snap (snap package), depending on various factors.

The problem is that as core and snapd snaps are refreshed, the view of the tools becomes stale. For example, when a snap application program invokes snapctl it will keep invoking the version that was available when the program was first started, even though snapd may have been updated and restarted since.

Solution A: explicit content connection to provider

The snapd tools provider (the snapd snap, the classic world, or the core snap) provides a content slot with /usr/lib/snapd. All snaps have a mandatory plug that is connected to it. The purpose of the plug is to retain correct semantics when core or snapd are updated.

The classic aspect is somewhat more tricky, but I believe we could ignore it if we always require snapd or core to be present; then the special case no longer exists. If we cannot ignore it because of a re-exec policy issue and we must use distribution assets, then the snapd snap could provide a special variant of the slot that, when connected, uses the distribution directory instead.

Advantages

  • part of the system

Disadvantages

  • complexity on core/snapd refresh
  • bad UX (mandatory 1-to-every connection)
  • broken system when disconnected

Solution B: implicit content connection to provider

The same as solution A but without exposing it to the user in any way. The connection could be somehow faked internally so that existing APIs would not need to learn to special-case it.

Advantages

  • nicer UX (fewer connections)

Disadvantages

  • UX polish at cost of extra complexity
  • still more complexity on core refresh

Solution C: export mechanism from provider + static symlink farm

This is a bit of a longer idea but one I like the most. The central concept is that snaps can, via the interface system or otherwise, export content to the host. The location of the exported content is /var/lib/snapd/exports/snapd/.

The files inside depend on the nature of what is being exported. In our case that would be a directory holding a symlink farm, with one link for each of the objects in /usr/lib/snapd/, so snapctl, snap and a few others. All snaps would get an automatic mount entry that puts /var/lib/snapd/exports/{core,snapd} in /usr/lib/snapd – effectively giving all snaps a static view of a directory whose symlinks can be changed without touching the mount namespace.

I like this idea the most because it feels least abusive of the interface system and opens up a possibility for many interesting ideas:

  • exposing man pages to a structure in /var/lib/snapd/exports/man
  • exposing nvidia binary driver .so files

The key idea is that the classic world must be adapted to reach into this location, so merely exporting content there does not break anything. We just gain a mechanism that can be used, step by step, to provide some content from inside a snap to the outside, in the structure where it is expected.

Advantages

  • decoupled 1-to-N setup
  • export mechanism useful for other content (e.g. man pages)

Disadvantages

  • export mechanism prerequisite

This seems fragile

This also seems fragile and unintuitive inside snapd because now you would have connections that are sometimes hidden from the user. Perhaps this already exists in other places but it seems too “magic” to me, and if it’s going to be hidden from the user anyways, why implement it as a connection in the interfaces and not just go with the automatic mount entry as described in Solution C?

I’m a little lost on where the symlinks point such that it could be updated without a mount namespace change. Would /usr/lib/snapd/snapctl (which is really a view into /var/lib/snapd/exports/core/snapctl in the initial host mnt ns) point to somewhere in the hostfs like /var/lib/snapd/hostfs/var/lib/snapd/exports/core/snapctl ?

Also it seems like solution C could be implemented just for the core/snapd snaps without exposing the functionality via the interface system which would be easier and allow for the more general work for exposing other kinds of files from other snaps to happen later.

On the host if re-exec is used and tools come from snap:

  • /var/lib/snapd/exports/snapd/snapctl -> /snap/{core,snapd}/123/usr/lib/snapd/snapctl
  • … (similar for other files there)

On the host if re-exec is not used or no provider snaps are installed:

  • /var/lib/snapd/exports/snapd/snapctl -> /usr/{libexec,lib,lib64,...}/snapd/snapctl
  • … (similar for other files there)

In the per-snap mount namespace:

  • bind mount /var/lib/snapd/exports/snapd over /usr/lib/snapd

We would only have to allow snaps to access /snap/{snapd,core}/*/usr/lib/snapd/* or /var/lib/snapd/hostfs/usr/{libexec,lib,lib64}/snapd/*

Can you expand on how that would work? I’m somewhat worried a solution involving a new C program that is similar to snap-update-ns would be NACKed.

When I said solution C I simply meant your Solution C here:

Not necessarily a new program written in C…

What I think this would look like is that initially we could decide on where the /var/lib/snapd/export directory lives and how it’s structured, then have snapd upon startup (maybe another service not sure) mount the files from the core/snapd snap that we want exported into that directory. Then at run-time of a particular snap, snap-confine would currently just always need to perform that mount from /var/lib/snapd/export onto /usr/lib/snapd in the snap’s mount namespace.

This would allow us to design what the host interface to accessing these snap exported files looks like without necessarily implementing the design for snaps other than core/snapd to actually export anything there yet.

Does that make sense?

Oh… :joy:

Yeah, I think this is close to what I had on my mind. I would only move this entirely out of snap-confine and into snap-update-ns, where we at least model what we did to the mount namespace.

Sure I don’t see a specific reason it needs to happen in snap-confine rather than snap-update-ns. I just referred to the former because it’s the one I’ve been studying more, but I will be moving onto snap-update-ns shortly :slight_smile:


I’ve implemented a quick draft of the “C” proposal in https://github.com/snapcore/snapd/pull/8843

I’ve been working on a race free way to handle the C idea and I have the following proposal. I’m working on the draft implementation and making good progress.

Overview of the new idea

The goal is to provide up-to-date tools accessible inside the per-snap mount
namespace. This set contains snap-exec, snapctl, snap-update-ns,
snap-discard-ns and snap-confine.

Currently, on startup, snap-confine mounts the tools from the current
revision of either the host, the core snap or the snapd snap into
/usr/lib/snapd in the per-snap mount namespace. As the snap revisions change,
the mounted tools become stale and out-of-date.

The new idea relies on these components:

  • manipulation of a directory mounted into all mount namespaces
  • a symbolic link pointing to the current provider of all the tools
  • a set of symlinks pointing from tool name, through the current
    provider symlink, to the actual binary
  • renameat2(2) with RENAME_EXCHANGE as an atomic way to swap the current
    symlink - supported since Linux 3.15 and back-ported to 3.10 in
    CentOS 7 - our oldest supported kernel.

Example interaction: snapctl

Inside a per-snap mount namespace, /usr/bin/snapctl is a symbolic link to
../lib/snapd/snapctl, as defined in each of the snaps core, core18 and
core20. It resolves to /usr/lib/snapd/snapctl, which is a part of
the fixture mounted by snap-confine. The leaf component is another
symbolic link, ../../../var/lib/snapd/export/snapd/current/snapctl (three
levels up from /usr/lib/snapd reaches the root). Resolving
it again produces /var/lib/snapd/export/snapd/current/snapctl, in which
current is the last symbolic link. Atomic manipulation of that symbolic
link allows atomic selection of the tool provider.
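
The chain can be reproduced in a scratch directory to check the path arithmetic. This is only a sketch: a temporary directory stands in for the root of the per-snap mount namespace, snapd-100 is a made-up provider revision, a plain file stands in for the real binary, and the helper names build_fixture and resolve_snapctl are hypothetical.

```python
import os
import tempfile


def build_fixture(root):
    """Recreate the snapctl symlink chain under root (a stand-in for /)."""
    export = os.path.join(root, "var/lib/snapd/export/snapd")
    provider = os.path.join(export, "snapd-100")  # hypothetical revision
    os.makedirs(provider)
    os.makedirs(os.path.join(root, "usr/bin"))
    os.makedirs(os.path.join(root, "usr/lib/snapd"))

    # Plain file standing in for the tool shipped by the provider.
    with open(os.path.join(provider, "snapctl"), "w") as f:
        f.write("#!/bin/sh\n")

    # current -> snapd-100: the only link that changes on refresh.
    os.symlink("snapd-100", os.path.join(export, "current"))
    # Fixture link: three levels up from usr/lib/snapd reaches the root.
    os.symlink("../../../var/lib/snapd/export/snapd/current/snapctl",
               os.path.join(root, "usr/lib/snapd/snapctl"))
    # Link shipped by the base snap.
    os.symlink("../lib/snapd/snapctl", os.path.join(root, "usr/bin/snapctl"))


def resolve_snapctl(root):
    """Follow every link in the chain and return the final path."""
    return os.path.realpath(os.path.join(root, "usr/bin/snapctl"))


if __name__ == "__main__":
    scratch = os.path.realpath(tempfile.mkdtemp())
    build_fixture(scratch)
    print(resolve_snapctl(scratch))
```

Re-pointing current at another provider directory changes the result of resolve_snapctl without touching either of the other two links.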

Example interaction: snap-confine set consistency

snap-confine opens several programs and executes them through the
file descriptor and fexecve(3). By resolving the current symlink explicitly,
snap-confine can open a consistent set of tools, where snap-update-ns,
snap-discard-ns and snap-exec all come from the exact same revision, even
amid concurrent modifications performed by snapd. There is a chance that a
symbolic link becomes dangling, but this can be detected and snap-confine
can re-evaluate current again, to obtain the new provider and try again.

AppArmor permissions

Existing permissions need to model content visible through the symbolic links. In practice permissions will allow access to:

/var/lib/snapd/export/snapd/**

There will be specific permissions for distinct tools as those participate in
some transitions. Those are not any different from the current
implementation, except for the new prefix that points to the export
directory.

Changes to classic packaging

Classic packaging statically ships the symbolic links that define the host
tools export. This design allows snapd not to care about the particular packaging
policy of the host distribution, expecting only the host directory to
contain all of the tool symlinks. The symlinks need to be valid from the
perspective of the per-snap mount namespace.

/var/lib/snapd/export/snapd/host/snapctl -> /var/lib/snapd/hostfs/usr/lib{,exec}/snapd/snapctl

Changes to snapd

Snapd gains a skeleton of the export system, allowing snaps to export content
to the export directory /var/lib/snapd/export. In the skeleton
implementation only core and snapd snaps implicitly do this - there is no
new syntax to design.

Snapd maintains a set of files and directories in /var/lib/snapd/export
which contain symlinks to snapd tools, such as snap-update-ns and snap-exec.
The general structure looks like this:

  • /var/lib/snapd/export
    • snapd
      • core-NNN - tools provided by core snap revision NNN
        • snap-exec - symlink to /snap/core/NNN/usr/lib/snapd/snap-exec
          Symlink target is valid inside snap mount namespace and may be
          dangling on hosts that do not use the /snap directory.
        • …
      • snapd-NNN - tools provided by snapd snap revision NNN
        • snap-exec - symlink to /snap/snapd/NNN/usr/lib/snapd/snap-exec
        • …
      • host - tools provided by the host
        • snap-exec - symlink to ../hostfs/usr/lib{,exec}/snapd/snap-exec
        • …
      • current -> symbolic link to any peer directory here.

This structure is persistent and undergoes basic operations as core and snapd
snaps are installed or removed. None of those require specific
synchronization with other parts of the stack.
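
A minimal sketch of those basic operations, assuming the tool list from the overview and the /snap/<name>/<revision> layout shown above. The function names export_provider and unexport_provider are hypothetical, and the export directory is passed in so the sketch can be tried in a scratch location rather than /var/lib/snapd/export.

```python
import os

# Tool set taken from the overview above.
TOOLS = ["snap-confine", "snap-discard-ns", "snap-exec",
         "snap-update-ns", "snapctl"]


def provider_dir(export_dir, snap_name, revision):
    """Return <export_dir>/snapd/<snap>-<revision>."""
    return os.path.join(export_dir, "snapd", f"{snap_name}-{revision}")


def export_provider(export_dir, snap_name, revision, tools=TOOLS):
    """Create the per-revision directory with one symlink per tool."""
    directory = provider_dir(export_dir, snap_name, revision)
    os.makedirs(directory, exist_ok=True)
    for tool in tools:
        link = os.path.join(directory, tool)
        # Target is valid inside the snap mount namespace; it may dangle
        # on hosts that do not use the /snap directory.
        target = f"/snap/{snap_name}/{revision}/usr/lib/snapd/{tool}"
        if not os.path.islink(link):
            os.symlink(target, link)
    return directory


def unexport_provider(export_dir, snap_name, revision):
    """Remove the per-revision directory and the symlinks inside it."""
    directory = provider_dir(export_dir, snap_name, revision)
    for name in os.listdir(directory):
        os.unlink(os.path.join(directory, name))
    os.rmdir(directory)
```

Neither function touches the current symlink, which is why these operations need no synchronization with running snaps.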

Tool selection mechanism and update process

Snapd maintains the current symlink by choosing the tool provider. Before
either the snapd or core snap is installed, current points to host. After
core is installed, current points to the core-NNN directory matching
the active revision of core. After snapd is installed, current points to
the active revision of snapd. This election mechanism relies on the state
to know which effective value to provide.
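
The election rule can be written down as a small function. This is a sketch under the assumption that the snapd snap takes precedence over core when both are installed; the function name elect_provider is hypothetical, and the active revisions are passed in as arguments here rather than read from snapd's state.

```python
def elect_provider(snapd_revision=None, core_revision=None):
    """Return the directory name that the current symlink should point to.

    A revision of None means the corresponding snap is not installed.
    """
    if snapd_revision is not None:
        return f"snapd-{snapd_revision}"
    if core_revision is not None:
        return f"core-{core_revision}"
    return "host"
```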

The process of switching tools is as follows:

  • create a symbolic link temp pointing to the name of the
    next provider and store it in the export/snapd directory
  • exchange the current and temp symbolic links with renameat2(2) and
    RENAME_EXCHANGE
  • remove the symbolic link temp, which now points to the previous provider

At no point during this process are the tools unavailable.
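
The three steps can be sketched as follows. This is not the snapd implementation: it calls renameat2(2) through ctypes, assuming a libc that exports that wrapper (glibc does since 2.28), and it falls back to a plain rename(2), which atomically replaces current instead of exchanging the two links, when the exchange is unavailable or current does not exist yet. The names switch_provider and _exchange are hypothetical.

```python
import ctypes
import os

AT_FDCWD = -100            # from <fcntl.h> on Linux
RENAME_EXCHANGE = 1 << 1   # from <linux/fs.h>


def _exchange(path_a, path_b):
    """Atomically swap two paths; return False if that is not possible."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        renameat2 = libc.renameat2
    except (OSError, AttributeError):
        return False  # no libc wrapper available on this system
    renameat2.argtypes = [ctypes.c_int, ctypes.c_char_p,
                          ctypes.c_int, ctypes.c_char_p, ctypes.c_uint]
    renameat2.restype = ctypes.c_int
    rc = renameat2(AT_FDCWD, os.fsencode(path_a),
                   AT_FDCWD, os.fsencode(path_b), RENAME_EXCHANGE)
    return rc == 0


def switch_provider(snapd_export_dir, next_provider):
    """Point <snapd_export_dir>/current at next_provider atomically."""
    current = os.path.join(snapd_export_dir, "current")
    temp = os.path.join(snapd_export_dir, "temp")

    # Step 1: create temp pointing to the next provider.
    if os.path.lexists(temp):
        os.unlink(temp)  # left over from an interrupted switch
    os.symlink(next_provider, temp)

    # Step 2: exchange current and temp.
    if os.path.lexists(current) and _exchange(temp, current):
        # Step 3: temp now holds the previous provider link; remove it.
        os.unlink(temp)
    else:
        # Fallback (assumption, not part of the proposal): rename(2)
        # replaces current in one step, so there is nothing to remove.
        os.replace(temp, current)
```

In both paths a reader resolving current sees either the old or the new provider, never a missing link.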

Changes to snap-confine

When starting a new process, snap-confine no longer mounts the current
revision of core or snapd tools under /usr/lib/snapd. Instead it mounts a
fixture that was prepared by snapd. This fixture is opaque to snap-confine,
significantly reducing moving parts in the more tricky and security sensitive
code.

The code that opens additional tools is amended as follows:

  • tools are always picked from /var/lib/snapd/export/snapd/current
  • tools are picked relative to the directory denoted by current
  • opening tools can fail with ENOENT, indicating that a link became
    dangling because the corresponding revision of the provider was unmounted.
    This now re-starts the tool opening process by re-evaluating current
    again. Note that only the leaf ENOENT is handled this way, specifically
    that which opens the actual tool.
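
The retry logic above might look roughly like this. snap-confine itself is written in C, so this Python sketch only illustrates the control flow; the name open_tool_set and the max_attempts bound are assumptions. Unlike the description above, it retries on any ENOENT raised while opening a tool and does not separate the leaf component from intermediate ones.

```python
import errno
import os


def open_tool_set(snapd_export_dir, tools, max_attempts=5):
    """Open all tools from one provider; return (provider, {tool: fd})."""
    current = os.path.join(snapd_export_dir, "current")
    for _ in range(max_attempts):
        # Resolve current explicitly and only once per attempt, so every
        # tool is opened relative to the same provider directory.
        provider = os.readlink(current)
        directory = os.path.join(snapd_export_dir, provider)
        fds = {}
        try:
            for tool in tools:
                fds[tool] = os.open(os.path.join(directory, tool),
                                    os.O_RDONLY | os.O_CLOEXEC)
            return provider, fds
        except OSError as err:
            # Drop the partial set so revisions are never mixed.
            for fd in fds.values():
                os.close(fd)
            if err.errno != errno.ENOENT:
                raise
            # A link dangled: re-evaluate current and try again.
    raise FileNotFoundError(errno.ENOENT, "no usable tool provider", current)
```

The returned descriptors are what would then be handed to fexecve(3).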

I’ve implemented version C on https://github.com/snapcore/snapd/pull/9384