Snapcraft Summit 2019 (Montreal) - a snapd perspective


#1

Snapcraft Summit 2019 (Montreal)

Observations from the perspective of a snapd developer.

Migration from /var/lib/snapd/snap to /snap

Manjaro considered migrating from the alternate snap mount directory to the
primary one. The motivation was to shrink the delta and lessen the friction of
adopting new upstream features. During the event we prototyped a tool that can
migrate people in place, that is, without rebooting.

The tool stops snapd along with all snap systemd services, timers, sockets and
mount units. The /snap symlink, if one is present, is removed. The entire tree
at /var/lib/snapd/snap is moved to /snap. Various files are rewritten to take
account of the new location: all systemd unit files (replacing both plain paths
and systemd-escaped paths), generated desktop files and the snapd state file.
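
The rewriting step can be sketched roughly as below. This is not the prototype tool itself, only a minimal illustration: the function names are made up for this example, and the escaping routine is a simplified stand-in for what `systemd-escape --path` does. Note the care needed to leave /var/lib/snapd/snaps (where the .snap files live) untouched.

```python
import re

OLD_DIR = "/var/lib/snapd/snap"
NEW_DIR = "/snap"


def systemd_escape_path(path: str) -> str:
    """Simplified equivalent of `systemd-escape --path` (illustrative only)."""
    trimmed = re.sub(r"/+", "/", path).strip("/")
    if not trimmed:
        return "-"
    out = []
    for i, ch in enumerate(trimmed):
        if ch == "/":
            out.append("-")
        elif ch.isascii() and (ch.isalnum() or ch in ":_" or (ch == "." and i > 0)):
            out.append(ch)
        else:
            # Everything else, including "-", becomes \xXX per byte.
            out.extend("\\x%02x" % b for b in ch.encode("utf-8"))
    return "".join(out)


# Match the old directory only when it is a whole path component, so that
# /var/lib/snapd/snaps/... is left alone.
_PLAIN = re.compile(re.escape(OLD_DIR) + r"(?![\w-])")
_ESCAPED = re.compile(r"(?<![\w-])" + re.escape(systemd_escape_path(OLD_DIR) + "-"))


def rewrite_plain_paths(text: str) -> str:
    """Rewrite ordinary paths, e.g. Where= lines or desktop file Exec= lines."""
    return _PLAIN.sub(NEW_DIR, text)


def rewrite_escaped_paths(text: str) -> str:
    """Rewrite systemd-escaped paths, e.g. mount unit names."""
    return _ESCAPED.sub(systemd_escape_path(NEW_DIR) + "-", text)
```

A real migration would also have to rename the unit files themselves and handle the snapd state file, which this sketch does not attempt.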

A compatibility symlink is placed in /var/lib/snapd/snap so that existing
entries on PATH work fine. The system key is removed to ensure new security
setup is performed, systemd is reloaded and then all unit files are carefully
started, beginning with all mount units and continuing with all other unit types.

The tool worked but we considered it too much of a hack to deliver. It would be
best if snapd would learn how to perform this type of migration by itself,
properly taking into account everything that needs to be adjusted. Ideally this
tool would run at early boot and would consume a distribution policy file that
indicates if the snap directory is in the primary or alternate location. This
would even allow a distribution to move back and forth between those.

Documentation of squashfs file format

The Godot engine developer was interested in providing an integrated “one
click” publishing to the snap format. We did use the existing documentation
about the meta/snap.yaml file but then hit a small roadblock, having realized
that mksquashfs is licensed under GPLv2. Godot has strict requirements on
tooling it can ship and link to, specifically all of Godot is linked into a
single executable under a permissive license.

To solve the issue the developer looked at the public documentation for
squashfs, as well as at the kernel data types that describe each structure used
by the format and proceeded to re-implement the compressor from scratch, under
a permissive license, as a library.

One positive outcome of this endeavour is the availability of another
implementation as well as, most importantly, documentation of the file format.
We could consider adopting that documentation for cases where anyone needs to
emit a valid squashfs and cannot use the reference implementation.
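
To give a flavour of what such documentation has to pin down, here is a small sketch that decodes the first 32 bytes of a squashfs 4.x superblock. The field order follows my reading of the kernel's superblock structure; the Python names are made up for this example, and this is not the Godot developer's library.

```python
import struct
from typing import NamedTuple, Tuple

SQUASHFS_MAGIC = 0x73717368  # stored little-endian, so the file starts with b"hsqs"

# magic, inode count, mkfs time, block size, fragment count (u32 each), then
# compressor id, block log, flags, id count, major, minor (u16 each).
_HEAD = struct.Struct("<IIIIIHHHHHH")

COMPRESSORS = {1: "gzip", 2: "lzma", 3: "lzo", 4: "xz", 5: "lz4", 6: "zstd"}


class SuperblockHead(NamedTuple):
    inode_count: int
    mkfs_time: int
    block_size: int
    fragment_count: int
    compressor: str
    version: Tuple[int, int]


def parse_superblock_head(data: bytes) -> SuperblockHead:
    """Decode the fixed-size start of a squashfs image (illustrative only)."""
    if len(data) < _HEAD.size:
        raise ValueError("need at least %d bytes" % _HEAD.size)
    (magic, inodes, mkfs_time, block_size, fragments,
     compression, block_log, _flags, _id_count, major, minor) = _HEAD.unpack_from(data)
    if magic != SQUASHFS_MAGIC:
        raise ValueError("not a little-endian squashfs image")
    if block_size != 1 << block_log:
        raise ValueError("block_size and block_log disagree")
    return SuperblockHead(
        inode_count=inodes,
        mkfs_time=mkfs_time,
        block_size=block_size,
        fragment_count=fragments,
        compressor=COMPRESSORS.get(compression, "unknown (%d)" % compression),
        version=(major, minor),
    )
```

Usage would be along the lines of `parse_superblock_head(open(path, "rb").read(32))`.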

Confinement trouble

A number of people used a non-Ubuntu kernel at the event. Those machines ranged
from distribution-specific kernels (e.g. 3.10 in CentOS) to mainline kernel
builds on Ubuntu. There are a number of people that use a mainline kernel on
Ubuntu or Debian simply because it gives them access to better hardware support
on their particular device.

In all such cases apparmor is disabled, rendering a good fragment of the
sandbox ineffective. None of the developers realized this and ended up creating
snaps that lacked interface definitions that would make them operate on a
system with strict confinement.

I think that we should enhance the visibility of partial confinement in the
following manner. We’d start by adding a new command that gives a plain text
explanation of the consequences of the absence of a certain type of confinement
technology on a given system. In practice I foresee around three modes: 1)
strict 2) partial due to certain patches missing 3) partial due to apparmor
entirely missing. We can relatively easily detect and explain both 2 and 3 in a
way that would be useful for both users and snap developers. The command could
be “snap debug confinement --explain” or something appropriate.
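
As a rough sketch of the three modes, assuming the detection has already happened elsewhere: the function below only maps the detected state to a mode and a plain text explanation. The function name, the wording and the feature names passed in are all hypothetical; nothing here reflects an existing snapd command.

```python
from typing import Sequence, Tuple

STRICT = "strict"
PARTIAL_PATCHES = "partial (some AppArmor features missing)"
PARTIAL_NO_APPARMOR = "partial (AppArmor unavailable)"


def explain_confinement(apparmor_enabled: bool,
                        missing_features: Sequence[str]) -> Tuple[str, str]:
    """Return (mode, explanation) for the three modes described above."""
    if not apparmor_enabled:
        # Mode 3: apparmor entirely missing.
        return (PARTIAL_NO_APPARMOR,
                "AppArmor is not enabled on this system. Snaps run without "
                "most of the sandbox, so a snap that works here may still "
                "fail on a system with strict confinement.")
    if missing_features:
        # Mode 2: apparmor present, but some patches are missing.
        return (PARTIAL_PATCHES,
                "AppArmor is enabled but the kernel lacks some features: "
                + ", ".join(sorted(missing_features)) + ".")
    # Mode 1: everything needed is present.
    return (STRICT, "All confinement features are available.")
```

For example, `explain_confinement(True, ["mount", "dbus"])` yields the second mode with both features listed in the explanation.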

The second part would require a user to acknowledge the presence of partial
confinement before installing the first snap. The wording should be crafted to
point to the new command, which explains the consequences, as well as to not be
entirely off-putting. In the case of 2) we may offer advice to use a 5.3 based
kernel. In the case of 3) we could link to a document that allows one to set up
LSM stacking once enough of the patches are merged and the tooling trickles
downstream.

Base snap for Nix

Nix is a special programming language, a package manager and a distribution,
also known as NixOS. The design of Nix seems to be influenced by functional
programming, in the sense that it allows creating a self-contained binary (or
set of binaries) that doesn’t link to or depend on things outside of the set. This
goes down to the level of setting the right dynamic linker, to using
content-based addressing and configuring everything perfectly at build time.

Nix is interesting in how it approached using snaps. Any set of Nix source
packages can be used to create a snap that exposes said packages. The packages
are built to live in a sub-directory of /nix, typically
/nix/some-name/some-hash/… (this has nice similarity with the snap file
system hierarchy). Since all of the binaries under /nix/some-name/some-hash/
are self sufficient and don’t require any facilities from the base snap we have
decided to create a new base snap for all nix application snaps. The base snap
ships the standard set of empty mount points for integrating with Linux and
snapd. This covers things like /proc and /var/lib/snapd. We also created a new,
custom, top level directory: /nix. This idea allows us to take the entire
application snap and use layouts to put $SNAP at /nix. The same base snap can
now be used to build any nix package as a snap, all with correct linking, data
access and everything else.

This, ironically, was easier and more robust for shipping a quick demo of
Firefox as a snap than our prior attempts. In my eyes this is mainly because
NixOS got “relocation” done first, at package build time. This is unlike how
snapcraft builds application snaps, where some components come from pre-built
packages that expect to find data spread all across the filesystem.

The base snap is also arch: all, since it ships no executable code; in fact it
ships no files other than /meta/snap.yaml. Apart from a small bug in snapd that
forced the Godot base snap to ship /bin/bash, the same is true for the other
base snap.

If you squint your eyes a little this simply shows that some snaps seem more
like standalone containers than what we were building traditionally (where
applications were an extension to a base). Perhaps it would be more natural to
allow some snaps to be self-based, where they indeed ship the root filesystem
skeleton and only they can use it?

EDIT: in the last paragraph I was hinting at a solution where the entire application snap is the root filesystem, so that there is no extra complexity in snapping random software. In this idea base: self or something like this would indicate that the root filesystem is the snap itself, not some separate base snap.

Base snap for Godot

Godot is a game development environment and a game executable engine. The game
engine runs on lots of diverse environments, while the editor works on Windows,
macOS and Linux.

A relatively unique aspect of games is that, more so than other types of
software, there is a well defined moment when a game is “done” and won’t be
developed further. Games made in the last three decades are still played. While
some of them have had re-makes and re-editions that make them easier to play on
modern operating systems and hardware, for the most part we rely on binary
compatibility with the current operating system (Windows) or a special
environment provider (DOSBox) to run unmodified binaries built all those years
ago.

This presents a unique challenge for Godot: when a game is exported today,
what is the base snap it should work against? Will that base snap be around in
10 years? Will it support the hardware that people have in 10 years? To ensure
that the lifecycle of a game is not attached to the lifecycle of a given Ubuntu
release Godot chose to create a new base snap. The base snap is, again,
virtually empty, apart from the mount points and (due to a bug in snapd)
/bin/bash.

The base snap is then combined with a given application snap that ships two
files: the game engine (either stock or customised) that the game was developed
with and tested against and a single compressed game assets file. This is
almost everything, except for hardware access support. Godot is highly
optimized and provides a trimmed down version of pulseaudio, mesa and various
input libraries necessary for the game engine to operate. All those files are
bundled into a Godot “runtime” snap and connected to the game using a versioned
content interface. The runtime has an appropriate snap declaration assertion so
that game publishers can connect to it across the publisher boundary.

The long term support and evolution of the runtime snap is the responsibility
of the Godot developers but this is something they already do anyway. Their
stack provides graphics, sound, input and networking in one tested package.
This model fits very cleanly into a snap so this choice was natural. It also
decouples all games from the Ubuntu support cycle so that people can enjoy them
for years to come.

We discussed an option of using the upcoming hardware support snaps along with
the Godot runtime snap but decided to cross that bridge once we get there.

Populating the cgroup on app startup

While working on the new base snaps we hit a small bump where unless the base
snap shipped /bin/bash, an application snap using the opengl interface was
unable to start. This is caused by the fact that the program “snap-device-helper”
is a shell script that is used in two separate ways: 1) udev uses it to add
devices to, or remove them from, existing cgroups as devices are added to or
removed from the system. 2) snap-confine forks and execs it for each device udev
has associated with the snap being started.

That second use is extremely silly since all that the shell script does is
write a single line to a file. We could easily do this from snap-confine since
we have all the data anyway. This would allow using the device cgroup from
“bare”-like base snaps and would also speed up the startup of all snap
programs. This code path is always executed for every snap application.
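
To show how little work is involved, here is a sketch of that single write for the cgroup v1 device controller. It is written in Python purely for illustration (snap-confine is C), the function names are made up, and the cgroup name passed in is an assumption about how the group is named. The line format is the usual `type major:minor access` accepted by devices.allow.

```python
import os
import stat
from pathlib import Path

# Conventional mount point of the cgroup v1 device controller.
CGROUP_V1_DEVICES = Path("/sys/fs/cgroup/devices")


def device_allow_line(mode: int, rdev: int) -> str:
    """Format one devices.allow entry, e.g. "c 1:3 rwm" for /dev/null."""
    kind = "b" if stat.S_ISBLK(mode) else "c"
    return "%s %d:%d rwm" % (kind, os.major(rdev), os.minor(rdev))


def allow_device(cgroup_name: str, dev_path: str,
                 root: Path = CGROUP_V1_DEVICES) -> str:
    """Grant an existing device cgroup access to one device node."""
    st = os.stat(dev_path)
    if not (stat.S_ISCHR(st.st_mode) or stat.S_ISBLK(st.st_mode)):
        raise ValueError("%s is not a device node" % dev_path)
    line = device_allow_line(st.st_mode, st.st_rdev)
    # This single write is the whole job; no shell is needed for it.
    with open(root / cgroup_name / "devices.allow", "w") as f:
        f.write(line + "\n")
    return line
```

Doing this in-process removes the dependency on /bin/bash being present in the base snap, as well as one fork and exec per device.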

We could simply write to the cgroup directory ourselves; this will also be
necessary in the cgroup v2 model where that component will work in a totally
different way. We will then need to modify an in-kernel BPF map, so the shell
script will need to be rewritten in C or C-flavoured Go.

Assertions? Accounts? Help!

The assertion system in snapd is probably the least understood component of the
stack. There are no hits in the documentation index or even the full text
search that give a good quick overview of the assertion system or the list of
assertion types along with their purpose. I think we should strive towards
something similar to the snap interface documentation.

To a somewhat lesser degree we should have documentation on how to set up an
account key. Before the conference the Advocacy team crafted a “bootstrap”
document that helps to get someone to the point where they can try to create
and publish snaps. I think that document should be refined and integrated into
the main documentation flow.

EDIT: I forgot about three more pieces of information I wanted to share. Those are added below.

Bash functions

Snaps are somewhat challenging to use with software that aims to provide bash integration in the form of arbitrary bash functions. The example here was Anaconda, crafting “workspaces” where additional software is installed and made available. Think about this like “pip on steroids”, where you can install a set of assorted packages that are necessary for a given project. The main challenge is how to think about confinement in such systems. At some point it is essentially an unconfined program running in the user’s shell, offering access to another set of programs that it procured via some other mechanism. We discussed this for some time and, after considering possible alternatives, decided to go ahead with a snap using classic confinement.

Locale support

Getting locale to work is still challenging because the base snaps don’t provide any, forcing application developers to bundle and integrate a lot of boilerplate to get basic C-level locale and i18n features to work. I started work on a prototype core18-locales snap which might simplify this and offload the bulk of the data. The smoke test is to ship GNU Hello with working translations, without having to be a libc engineer.

OpenGL support

While working on Godot snaps we exchanged ideas and experiences. There are two bits of information that are relevant to share: Godot is a good source of knowledge about how switchable graphics can be detected and how it works. The effort to snap nvidia and mesa drivers will surely benefit from this. The other bit is that AMD graphics, while working very well in snaps, needs a particular text file in /usr/share/, or it will bail out and ignore the hardware it has access to. This is also something to keep in mind while working on better generic mesa support. The file can be seen in the Godot engine snap as well as in strace logs when executed on AMD hardware.


#2

This is where the idea of base: none comes into play, does it not? We discussed this a few months ago, and I documented it in http://blog.sergiusens.org/posts/broader-use-of-bases-for-snaps/ … this of course requires work in snapd, but will also enable us to build snaps (kernel, gadget) that may rely on no base on a given environment through snapcraft.


#3

I wasn’t aware that base: none implied a snap is its own base. I simply assumed that it would not needlessly pull a base snap it doesn’t use anyway, because it contains no executable code. Did we agree that base: none is a self-hosted snap that can still have apps and hooks?


#4

What about session support in Snap?

This is needed for applications such as KDE Plasma, Egmde.


#5

I believe this is being implemented by @jamesh - you can find more information on the forum: A "user session agent" for snapd


#6

Wait, base: none was supposed to be like base: core18, but dynamically generated in snapd… a snap of type: app using base: none can declare apps and such and would of course rely on everything inside the snap as you would do with any snap if you didn’t want to depend on anything from the base.

As far as I know, @pedronis or @mvo mentioned that this feature of base: none was a few refactors away.


#7

I had understood base: none to be stripped out by snapcraft, reaching snapd as a no-base snap (i.e. base: core, except core isn’t a base).


#8

No, I mean desktop environment sessions:


#9

@chipaca if we strip out the base: none that is essentially relying on core.
I feel confident that I at least asked @mvo to confirm my notes reflected the conversation (which he did).


#10

#11

This topic was not brought up. I think it is interesting but presents a separate unique set of challenges and it is unlikely to be available in the next few months. If there is interest in shipping GNOME or KDE “as a snap” where the entire shell is running from the snap I think this should be discussed in a separate topic, simply because of how large and complex that topic is.


#12

We actually have implemented the basics of it, which means we don’t pull in anything. So it’s useful sort of for kernels and for bases themselves. We haven’t implemented having it mean that the snap is its own base for “app” snaps. ATM a snap that declares base: none should have no apps or hooks.


#13

One thing that would be useful for base: none type snaps (although maybe not that base as it seems that doesn’t support having apps, but some type of mechanism to specify that a snap is its own base) which provide their own root filesystem would be the ability to have snapd chroot (maybe pivot_root but that leads to AppArmor problems I think) with some basic mounting support into a root filesystem that lives inside of the snap.

This would make it extremely trivial to port a docker container application to a snap, by simply shipping the root filesystem of the docker container inside the snap, and having snapd (probably snap-confine? I don’t recall which actual binary sets up the mount namespace and such) enter into that filesystem such that none of the AppArmor rules restricting access to files from the base snap affect accessing files from the snap.

Of course it does get complicated then how one runs apps and such from this type of specially created root filesystem.


#14

IIRC we had a PR from @jamesh that replaced the shell script with proper Go code. Maybe we could revisit that?

Hopefully the amount of C code will be close to nil.


#15

I submitted a PR converting the xdg-open shell script to Go code in the past. I haven’t ever touched the udev code paths.


#16

We have discussed that once the bug is taken care of, we should have an official virtually empty base (maybe “bare” itself) that is officially supported and can take care of all these similar cases of completely self-contained snaps.


#17

The PR was from me and prompted for different reasons. @zyga and I discussed this at the summit and the problem today isn’t so much that it is a shell script as it is that snap-confine is calling it rather than doing the work itself (the separate executable is still needed for hotplug of course). The future cgroup v2 work will require a rewrite to support hotplug with the equivalent of a device cgroup, and I suspect that this rewrite will be a small, simple C program that shares nearly all of its code with snap-confine (since snap-confine would also need to do the same thing for app startup).


#18

For godot we should be able to make it work. For Nix we need the special base snap as we cannot use layouts to create new top level directories that don’t exist.


#19

Meanwhile a simple move of a function call to a few lines above where it currently is will fix it, by moving the shell script calls from after the pivot to before the pivot.