System user lookup

We need to think about system user lookup for a bit.

Currently we’re doing it two ways: for some operations we scan /home, and for others we use os/user’s Lookup function (or Current, which calls it). Having two ways to answer essentially the same question can, in some situations, let things get out of sync (think: old users’ homes left over in /home, or some user homes being in /home2, or local users but NFS homes, or network users with local homes, or…).

Both of these implementations have the problem that “list all users” is not a question we can ask, as answering it could lock up the system for a long time (I’ve worked on systems where a full user listing can take hours; although pathological, it points at the problem).

os/user uses cgo, which we try to avoid, and there’s a PR up which vendors the non-cgo implementation from 1.9 (which I expect will get tweaked to support extrausers, otherwise it breaks on core).

If just looking up users (and groups) in files is enough for snapd, we should own it and switch everything to that (and we get to ask for all users! some UX gets a lot nicer). If not, we should own it and fix things.

What do we want to do?

Basically the question is: do we care about, and support, the weird and wonderful cases a system can get into via NSS and networked homes? If yes, we cannot use non-cgo implementations of these lookups, we cannot use reading /home/*/snap to get a listing of snap user dirs, and we can’t ask for all the users in a system.

If we neither care about nor support generic NSS and only support files, we can use a non-cgo implementation of the lookup funcs, and as long as we don’t list /home we shouldn’t need to worry about it being networked.

If we don’t care about NSS, we can still kinda-support its users by supporting something like extrausers but for network users. Hackish but workable. Sysadmin needs to update /var/lib/snapd/networkusers or something to keep it in sync with local, like yp of old.

Thanks for raising the issue.

I think we definitely care about the fact a class of users will be using networked setups, and we cannot have a system that prevents those cases from working altogether for any software carried as a snap.

At the same time, the text above suggests that as long as that’s true the only way out is using cgo, which isn’t true. At the very least, we can also leverage system tools to access that functionality, and we can explore what’s the best way to do that.
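As one possible shape for the “system tools” route, here is a hedged sketch that shells out to getent(1), which goes through NSS on our behalf without linking cgo into the binary. The helper names lookupHome and homeFromPasswdLine are hypothetical, and treating exit status 2 as “no such user” follows the getent man page as I understand it.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// errUnknownUser is returned when getent reports that the key was not found.
var errUnknownUser = errors.New("unknown user")

// homeFromPasswdLine extracts the home directory from one passwd(5) line
// (name:password:UID:GID:GECOS:home:shell).
func homeFromPasswdLine(line string) (string, error) {
	fields := strings.Split(strings.TrimSpace(line), ":")
	if len(fields) != 7 {
		return "", fmt.Errorf("unexpected passwd entry %q", line)
	}
	return fields[5], nil
}

// lookupHome asks getent for a single user, so NSS (files, LDAP, ...) is
// consulted by the system tool rather than by our own process.
func lookupHome(name string) (string, error) {
	// Refuse names that getent could mistake for an option.
	if name == "" || strings.HasPrefix(name, "-") {
		return "", fmt.Errorf("refusing to look up %q", name)
	}
	out, err := exec.Command("getent", "passwd", name).Output()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) && exitErr.ExitCode() == 2 {
		return "", errUnknownUser // assumption: status 2 means key not found
	}
	if err != nil {
		return "", err
	}
	return homeFromPasswdLine(string(out))
}

func main() {
	name := "root"
	if len(os.Args) > 1 {
		name = os.Args[1]
	}
	home, err := lookupHome(name)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Printf("%s lives in %s\n", name, home)
}
```

This only ever asks about one named user, so it sidesteps the enumeration problem rather than solving it.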

To make a good decision on this, it would be useful to break down the actual use cases we have. That is, in which cases do we look users up, and what’s the danger in not having precise information for it.

As a side note, this topic seems to have been mistakenly marked as unlisted.

| in order to… | we need to… | currently done via |
|---|---|---|
| create SNAP_USER_DATA etc | look up a user’s home | user.Current |
| copy a snap’s data on upgrade | iterate over all users of the snap | /home/*/snap |
| snapshot a snap’s data | iterate over all users of the snap | |
| ensure a given local user exists | look up a user, create it if missing | user.Lookup; adduser(8) |
| manage a user’s credentials | look up a user’s home | user.Lookup |

there’s probably more :wink:

WRT using system tools, yes, and that’s a third approach that also doesn’t let you iterate over all users.

Very nice analysis, thanks!