Bug with cleaning snap data in home dirs + proposed solution


#1

The problem
The cleanup of $HOME/snap/ directories on snap removal is currently very simplistic: it’s based on a “/home/*/snap/…” glob, which means the actual home directory as set in passwd is not honored, and there are leftovers in the filesystem when a snap gets removed (e.g. we don’t remove data from the /root user’s directory).

Proposed solution
I’ve discussed this issue with @mvo and @chipaca at last week’s sprint. In the discussion it was agreed that a simple iteration over system users using the system API (getpwent and the like) is a bad idea, because it becomes very problematic performance-wise if e.g. LDAP with a large user base is involved.
Therefore the following solution was proposed:

  • We will create a new directory /var/lib/snapd/user/ and set the sticky bit on it.
  • Whenever snap run is executed and creates snap directories in the user’s home, it will drop a single empty /var/lib/snapd/user/<UID> file for that user. Only one file will be created per user.
  • The `/var/lib/snapd/user/*` files will serve as an efficient list of all active users; when cleaning up data on snap removal we will only need to look up those UIDs in the system’s user database.

Comments, opinions? @niemeyer, @jdstrand, does that sound fine? If so then I’m happy to prepare a PR for this.


Snap remove doesn't remove data from /root/snap/$SNAP_NAME
#2

I’m not sure exactly how we’d handle this for the SELinux policy.

Currently, we’re leveraging the special mechanism with file context policies to globally label $HOME/snap as the snap home directory, so that snapd has read/write access to it without the ability to mutate the rest of the home directory, while all user applications retain free access to the content (as they are supposed to).

The /var/lib/snapd directory is marked with the snappy_var_lib_t context, which allows snapd read/write access. I’m not sure how labeling a separate path /var/lib/snapd/user/<UID>/ as snappy_home_t would work without conflicting with the original directive.

It also sounds like you want to give snapd too much awareness of the user state. In my opinion, user data is just that, user data, and should be handled by the user. If snapd needs intelligence for this, then I would expect it to grow some kind of command that can run to examine enumerated snap home dirs and erase the right stuff on uninstall.

In addition to that, your proposal makes user data private even from the user, which I don’t think is the right intent here.


#3

I didn’t follow you there; could you expand on it?


#4

There is no guarantee that the user can access content in /var. In many cases, the user cannot access that region at all.

It also occurs to me that this adds complexity for supporting certain types of common use cases, and at least one use case might be completely broken by this: Consider the case where the user’s home directory is a shared NFS mount. If you move the user’s content out of the home directory (that is shared) to the OS disk (which is not), it is effectively machine-private. That is very broken…


#5

Hi Neal, thanks for your feedback!

Yes, I understand your point, but I’m not sure we want another command the user needs to be aware of. Just to be clear, snapd already accesses user data (hence the problem); it’s just very naive about it…

I’m not sure I get the point of this example. Is this about a case where you have two machines with a network-shared home, both having the same snap foo installed; then we remove the snap on one of them, and this removes data from the user’s home even though it’s still relevant for the second machine? If so, then yes, this model breaks, and the only solution is to have a dedicated command for the user to execute, as you suggested above.


#6

The issue is that in the mass-provisioned case (which is quite common for computer labs and whatnot), not having user data in the networked home directory of the user means that user preferences and data do not persist across multiple machines. Breaking this use-case is unacceptable in my eyes.


#7

I got you, that’s a valid concern. I’ll raise that if we discuss this matter outside of the forum topic. Thanks.


#8

Ok, after discussing this at the standup, the decision is not to introduce this kind of change, and to fix it with the introduction of snapshots in the near future.


#9

Re-reading this topic, I’m puzzled by a couple of things.

First, @Conan_Kudo, why would us having a per-uid entry in a database imply the user of that uid should read that database? Sure, the database is a directory and the entry is an (empty) file, so that it’s easier for things other than ourselves (think: post-removal scripts) to operate on, but it’s a database. I don’t see the need for the user to access that information; it’s a note to us “this user used the system”, nothing more (or less). What am I missing?

Second, @pstolowski, do you remember how we said snapshots would address this issue? (I can’t connect these things now, looking back).


#10

@Chipaca I tried to recall the details, but no luck. I think (AFAIR) we talked about snapshots mostly as a way of avoiding nasty surprises for the user (“oh, my data is gone!”). Not sure how this helps with networked homes though.


#11

Guys, this is kind of a big deal. Some things (e.g. ROS) log to $HOME, which is $SNAP_USER_DATA. It must be this way because it’s the only directory that is writable regardless of whether the application is a service (running as root) or an app (running as a user).

@cratliff just showed me a huge directory of ROS logs in /root/snap. There are like a hundred revisions in there; that adds up quickly.

Another related issue: these are not migrated to the next revision.


#12

They should all be empty dirs, though; each core update moves the content forward, so all you lose there are inodes for the directory entries (this is definitely ugly and awful, but surely not fatal). If you actually find leftover content in subdirs of already-removed snaps, that’s a serious bug…


#13

Yeah that does not seem to be the case. There’s content there.


#14

Revisiting this, after a long hiatus, I’m going to summarise the goals, non-goals, and the proposal as it stands:

What we want, for cleanup for snap data on snap removal and other things, is an easy and efficient way of enumerating users of snaps on a system that does not involve enumerating system users. Currently we do this via /home/*/snap globbing, which gives both false negatives and positives, and is terribly inefficient with large networked homes (especially if automount is involved).

We want to use it for:

  1. user data operations (currently done via /home/*/snap glob: snapshots, release copying, cleanup on removal, etc)
  2. user hooks (not currently done)
  3. … anything else that needs to be done “for every user that has used snaps”

We do not want to use it for:

  1. negative user data cleanup

    This would involve running a difference between a system user enumeration and the snap user enumeration, which falls down in a number of cases involving networked users and homes. If a stamp file isn’t there, a user has not used snaps on this system and so doesn’t exist as a snap user; end of story.

Proposal

  1. Have snap run drop a stamp file in a sticky directory, as the user. Like touch /var/lib/snapd/users/$USER (or maybe echo $HOME > /var/lib/snapd/users/$USER) but writ in Go.
  2. Change everything that today looks at dirs.SnapDataHomeGlob (and (*snap.Info).DataHomeDir and .UserDataDir, as well as their Common variants) to instead iterate over /var/lib/snapd/users/*.
  3. Have a minor patch that creates /var/lib/snapd/users/ and populates it from /home/*/snap, to transition.

Some handwaving there, but hopefully it’s concrete enough to advance (and settle!) the discussion again.