[RFC] Tailored presentation of snap data

zyga-snapd · April 28, 2020, 5:45pm

A while ago we discussed an idea about presenting snap data in a way better than today.

We identified some shortcomings of the current model:

The folder ~/snap/$SNAP_NAME/ presents data for various revisions as well as the English words current and common. Revisions are technical and users will typically not understand their significance. The two English words cannot be localized in any sensible way.
The folder ~/snap/$SNAP_NAME/current/ is a miniature home directory, with hidden dot-folders and dot-files. In a typical case, following the path of least resistance, actual snap application data is inside one of the standard dot-directories like .config, .local/share or .cache. A non-trivial number of applications also use custom dot-files that don’t follow the scheme outlined above.
Browsing using typical file managers is more challenging as they do not show hidden files or folders by default.

Those factors have largely contributed to the idea that ~/snap is not very user friendly, as finding the right data requires overcoming some usability challenges.

I’d like to propose changing how we present snap data in a dramatic way. There are two parts to this idea. The first part is that snap developers can now have a say in how their application data is presented to the user, which is distinct from how the actual data is stored in the filesystem. The second part is technical and outlines various aspects of the proposed implementation.

Conceptually the snap packaging format leaves a lot of flexibility and decision making in the hands of the packager. The central interaction point is the meta/snap.yaml file, which describes how various applications and content from the snap are to be presented to the system. This proposal extends that idea to user data. Using a new section in the snap.yaml file, packagers would be able to express meaningful presentation, or mapping, of the internal filesystem hierarchy in terms that users find comprehensible.

Let’s work through an example game. As a classic package the game would store its content in ~/.gamename, with most of the interesting files being in ~/.gamename/saves and ~/.gamename/maps. Packaged as a snap this becomes ~/snap/gamename/current/.gamename/saves and a corresponding path for maps.

There may be additional directories for caches, internal data files and other things that players typically do not interact with.

The snap packager could expose the relevant information to the user with the following, tentative, YAML declaration in the snap.yaml file:

name: gamename
...
user-data-presentation:
  saves:
    path: $SNAP_USER_DATA/.gamename/saves
  maps:
    path: $SNAP_USER_COMMON/.gamename/maps

The syntax is a mapping of top-level names that map to real path names from the point of view of the software. In the language above I’ve used two data sources - one being specific to the current revision and one that is common across revisions. From the user point of view there is no difference but from the packager point of view it may be beneficial to store some of the data in the common directory, for instance, because it is particularly large and the format is stable across revisions.
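To make the mapping concrete, here is a minimal Python sketch of how a consumer might turn the tentative declaration into real paths. Nothing here is existing snapd code: the function name `resolve_presentation`, the dictionary form of the parsed YAML, and the rule that a path must stay under its data root are all assumptions made for illustration.

```python
import posixpath

# Hypothetical parsed form of the tentative `user-data-presentation` section.
PRESENTATION = {
    "saves": {"path": "$SNAP_USER_DATA/.gamename/saves"},
    "maps": {"path": "$SNAP_USER_COMMON/.gamename/maps"},
}


def resolve_presentation(presentation, snap_user_data, snap_user_common):
    """Map each presented name to the real path it refers to.

    Only the two data roots used in the example are understood; anything
    else, or a path that escapes its root, is rejected (an assumed rule).
    """
    roots = {
        "$SNAP_USER_DATA": posixpath.normpath(snap_user_data),
        "$SNAP_USER_COMMON": posixpath.normpath(snap_user_common),
    }
    resolved = {}
    for name, entry in presentation.items():
        var, _, rest = entry["path"].partition("/")
        if var not in roots:
            raise ValueError(f"{name}: unsupported data root {var!r}")
        root = roots[var]
        real = posixpath.normpath(posixpath.join(root, rest))
        # Refuse entries such as "$SNAP_USER_DATA/../other-snap".
        if real != root and not real.startswith(root + "/"):
            raise ValueError(f"{name}: path escapes {var}")
        resolved[name] = real
    return resolved
```

Called with the per-revision and common directories of the example game, this yields one real path per presented name, which is all the user-facing view would need to know.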

There are many more ideas we could explore here. We could expose both folders and regular files. We could offer localized names, so that a player from Poland might see the names sejwy and mapy respectively. We could allow the snap packager to offer a special icon that is attached to the generic folder icon and use that icon when displaying the folder ~/snap/$SNAP_NAME. This part is explicitly open-ended as there are many ideas we could explore.

Before jumping into the technical side we could also consider what happens to packages that do not contain the new YAML language. We could present them as we do currently. We could choose not to present them at all, encouraging adoption and, at the same time, giving packagers a way to say that a particular snap does not have any user data to present. This is something to discuss later.

The remainder of this post is a technical overview of the idea.

The central component of the idea is the new FUSE filesystem, snapdatafs, which runs in userspace in the session of a particular user. With the state of the desktop system interaction we could meaningfully support this in 18.04+, perhaps also in 16.04 but I did not evaluate it enough to be sure.

The filesystem is mounted on demand in the user session. It could also not be mounted at all, if the user strongly prefers that option. The actual name of the mount point could also be distinct from ~/snap. It could be as simple as ~/Snap (note the upper case) or anything more sophisticated.

The filesystem would consume a collection of meta/snap.yaml files, most likely aggregated by snapd into a trusted location that is available regardless of the state of the actual snaps. In my prototype I used /var/lib/snapd/snapdatafs/v1.yaml. The application loads the file and establishes notification of file changes. Monitoring a single file is an advantage because Linux filesystem notification APIs are rather unreliable and scale poorly; handling one aggregate file is much easier to do correctly.
The aggregate file is prepared by snapd and can be also used as a validation/post-processing point, providing a way to control the evolution of the format and compatibility with the file system.
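As a rough illustration of the "one aggregate file" idea, the sketch below reloads a single file whenever it appears to have changed. It is not the prototype's code: it polls `os.stat` instead of using a kernel notification API, and the class name `AggregateFile` and the caller-supplied `loader` (for example a YAML parser) are assumptions.

```python
import os


class AggregateFile:
    """Reload one aggregate file when its identity or content stamp changes."""

    def __init__(self, path, loader):
        self.path = path      # e.g. the prototype's /var/lib/snapd/snapdatafs/v1.yaml
        self.loader = loader  # callable turning file text into parsed data
        self.data = None
        self._stamp = None

    def _current_stamp(self):
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return None
        # The inode changes when the writer replaces the file atomically.
        return (st.st_ino, st.st_mtime_ns, st.st_size)

    def refresh(self):
        """Reload if needed; return True when self.data was updated."""
        stamp = self._current_stamp()
        if stamp == self._stamp:
            return False
        self._stamp = stamp
        if stamp is None:
            self.data = None
        else:
            with open(self.path, encoding="utf-8") as f:
                self.data = self.loader(f.read())
        return True
```

A filesystem process could call `refresh()` from a timer or from a notification callback; either way only one path needs to be tracked, regardless of how many snaps are installed.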

The FUSE filesystem would present synthetic structure comprising the contents of today’s ~/snap/. Running with the game idea mentioned earlier it would create this synthetic structure:

gamename/ - directory
gamename/saves - directory
gamename/saves/* - bridge to actual backing store
gamename/maps - directory
gamename/maps/* - bridge to actual backing store

Initially the implementation could simply do the bare minimum required to be a FUSE filesystem: providing read/write access to bridged files and refusing modifications of the fixed scaffolding.
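The synthetic structure and the "refuse changes to scaffolding" rule can be modeled without any FUSE bindings. The following sketch only classifies a path relative to the mount point; the names `classify` and `may_modify`, the `"scaffold"`/`"bridge"` labels, and the rejection of `.` and `..` components are assumptions for illustration, and the backing paths in the usage note are made up.

```python
import posixpath


def classify(path, snaps):
    """Classify a path relative to the mount point.

    `snaps` maps snap name -> {presented name -> backing directory}.
    Returns ("scaffold", backing_or_None) for the fixed synthetic
    directories, ("bridge", real_path) for anything below a presented
    directory, or None when the path does not exist in the view.
    """
    parts = [p for p in path.split("/") if p]
    if any(p in (".", "..") for p in parts):
        return None
    if not parts:
        return ("scaffold", None)       # the mount point itself
    presented = snaps.get(parts[0])
    if presented is None:
        return None
    if len(parts) == 1:
        return ("scaffold", None)       # gamename/
    backing = presented.get(parts[1])
    if backing is None:
        return None
    if len(parts) == 2:
        return ("scaffold", backing)    # gamename/saves, listed from backing
    return ("bridge", posixpath.join(backing, *parts[2:]))


def may_modify(classification):
    """Only bridged entries may be created, renamed, written or removed."""
    return classification is not None and classification[0] == "bridge"
```

For example, with `snaps = {"gamename": {"saves": "/backing/saves"}}`, the path `gamename/saves/slot1.sav` classifies as a bridge to `/backing/saves/slot1.sav`, while `gamename/saves` itself stays part of the fixed scaffolding.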

As the implementation matures and we gain some practical experience we could optimize the I/O by using file descriptor offloading. FUSE could open the actual file and hand off the file descriptor to the kernel, thereby providing native performance to the non-synthetic sections.

There are some open questions I did not have time to get answers for:

How do we handle refreshes when the user is browsing, say, the maps folder and the new revision removes that presentation? We would certainly have to handle corner cases where the snap is no longer presenting something but we have a folder open to the old view.

In some sense the new user-space filesystem could participate in refresh-app-awareness, contributing information about the busyness of a particular location. Definite answers would require a more advanced prototype where that could be determined.

sergiusens · April 28, 2020, 6:39pm

Looks interesting, though this does seem to make the developing side a bit more complicated (I won’t see the same thing on the outside as on the inside, unless I snap run). A few questions come up:

are there performance implications with this fuse mount for systems with many, let’s say ~100, snaps installed?
how will this work with classic confinement?

pedronis · April 28, 2020, 6:42pm

to be clear this is an RFC at this point, and nothing is set about it

jdstrand · April 28, 2020, 7:52pm

@zyga-snapd - IIUC, the basic idea is that the location of what is currently ~/snap becomes an implementation detail that the application and (hopefully) the user will not care about (so long as the application uses the SNAP_USER_DATA and SNAP_USER_COMMON env vars). In this manner, we could move ~/snap to ~/.snap-data (or whatever) and it is then FUSE in the user’s session that will provide the view of ~/.snap-data (as defined by the snap, or TBD default behaviors) at ~/snap (or anywhere else the user chooses). Is that an accurate summary?

Assuming it is, it is easy enough to adjust our default apparmor template, et al to move ~/snap to ~/.snap-data, but:

how do you envision interactions with the home interface?
- Will snaps launched from within the user’s session see this FUSE view? (they should not-- the policy is currently designed to not allow reading other snap’s data)
- Are you thinking that the per-snap mount namespace will omit this from the snap’s view of home? (preferred)
- If not omitting, is the idea that the FUSE mountpoint for the view is configured via the ‘snap’ command so that the system knows how to dynamically adjust the policy on the fly? (possible, but a bit messy and there might be some (surmountable) implementation difficulties with dynamically generating the AARE rules for carving out a path with UTF-8 characters)
will the fact that users have two views of snap data potentially cause confusion since the data can be accessed from two places (the real location and the FUSE view that can change)? Eg, what is the developer’s documentation experience when communicating to their users?

elcste · April 28, 2020, 9:39pm

As a snap user, and one who sometimes helps other snap users, I think it could get more confusing for users to access data that the snap creator doesn’t consider “relevant”. Now there are two places to look: the FUSE location for (using your example) Saves and Maps, but the backing store location for any other data, which may be important to the user despite the snap creator thinking otherwise. Maybe there should be an “etc” (or whatever) link that would point towards the other files, excepting the linked-up “relevant” files/dirs?

ijohnson · April 28, 2020, 10:38pm

I agree with this point, if there are users who are picky and don’t want “~/snap” in their home directory, then it’s also likely there will be other users (possibly same group of users) that won’t want to partake in the specific layout of the files that snap XYZ chooses to expose in their home directory and will want to re-arrange the files that snap XYZ puts in their home directory, etc.

I think what would be helpful here (unfortunately) is again a design that can take input from users on how users want to lay out their files. Something like interfaces, or aliases that have default configuration but that can be reconfigured as a user sees fit. A strawman proposal would be that a snap exposes file “plug-like things” that by default will connect to “slot-like things” generated from whatever’s in the snap.yaml (i.e. your snap author example has “$HOME/gamedata/saves” and “$HOME/gamedata/maps”), but that could be reconfigured by a user to live somewhere else, so a user could do something like

snap move-files mygame gamedata/saves BestGameEver/saves

which then in the FUSE would reconfigure it to put the “saves” folder in $HOME/BestGameEver/saves instead of the default $HOME/gamedata/saves.
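A minimal sketch of how such per-user overrides might be layered over the snap's defaults is shown below. The `snap move-files` command is only the strawman above, and `effective_layout`, its two dictionaries, and the collision check are likewise hypothetical.

```python
def effective_layout(defaults, overrides):
    """Combine snap-declared locations with per-user relocations.

    `defaults` maps a presented location (relative to $HOME) to its
    backing path; `overrides` maps a default location to the location
    the user chose instead. Returns presented location -> backing path.
    """
    layout = {}
    for location, backing in defaults.items():
        target = overrides.get(location, location)
        if target in layout:
            # Two entries cannot be presented at the same place.
            raise ValueError(f"location {target!r} is used twice")
        layout[target] = backing
    return layout
```

With the defaults `gamedata/saves` and `gamedata/maps` and a single override moving `gamedata/saves` to `BestGameEver/saves`, only the saves entry changes its presented location; the backing paths are untouched.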

zyga-snapd · April 29, 2020, 6:36am

This filesystem would not be used by snaps. Snaps would always use the real filesystem. In my prototype it was moved to ~/.snapdata.

This is just for humans

zyga-snapd · April 29, 2020, 9:21am

jdstrand:

how do you envision interactions with the home interface?

Will snaps launched from within the user’s session see this FUSE view? (they should not-- the policy is currently designed to not allow reading other snap’s data)

Are you thinking that the per-snap mount namespace will omit this from the snap’s view of home? (preferred)

If not omitting, is the idea that the FUSE mountpoint for the view is configured via the ‘snap’ command so that the system knows how to dynamically adjust the policy on the fly? (possible, but a bit messy and there might be some (surmountable) implementation difficulties with dynamically generating the AARE rules for carving out a path with UTF-8 characters)

will the fact that users have two views of snap data potentially cause confusion since the data can be accessed from two places (the real location and the FUSE view that can change)? Eg, what is the developer’s documentation experience when communicating to their users?

Snaps will not interact with the FUSE directory explicitly. They may implicitly do so if it is mounted and the user picks a file from that directory. When that happens they would currently get a denial. If the confinement profile knew about the path of the mount point we could allow it similarly to how ~/snap is allowed today, namely by opening ~/snap/$SNAP_INSTANCE_NAME/*.

The per-snap mount namespace could easily unmount it but would not be shielded from the mount point re-appearing (unless we make all of home private so that no future mounts show up).

Using a dynamically adjusted snap name is indeed desired and that would be my preference. There are some things to consider (e.g. this is a per-user preference, which complicates things).

One other thing we could do is to provide a FUSE filesystem that is also mounted in the per-snap mount namespace and that only shows the data of this snap (therefore making `~/[^.]/**` safe to access).

As for the last question, I’m not sure. My idea is that developers will only present the kind of data that is meaningful to interact with from a file manager and that this is still used in conjunction with the home interface, so that commonly used data is actually stored in places like ~/Documents or ~/Downloads. Whatever we come up with will be a new experience that developers and users will have to adjust to.

sergiusens · May 6, 2020, 9:46pm

Does this consider the scenario where, because this is the presented data, it won’t go away on snap removal? Or will backups be split between user-generated data and internal snap data?

zyga-snapd · May 7, 2020, 9:50pm

In my mind this was always a method to present existing data. I did not consider the scenario where a snap is removed and part of its data stays behind because it is presented.