Snapd refresh to 2.67 broken/did not create seccomp profile

On an Ubuntu 22.04.5 system, /usr/bin/lxc lost the ability to talk to the LXD server. When I restarted the LXD server, I found that the seccomp profile bin2 files were missing. It turned out that the snapd snap had refreshed to 2.67 (revision 23545). I reverted to 2.66.1 (revision 23258), which generated the profiles fine. Refreshing to 2.67 deleted the profiles without generating new ones.

Looks a lot like

Should I open a bug report somewhere?

How can I debug this/can I prevent this from happening again?

Please collect the journal logs of snapd.service and snap.lxd.daemon.service covering the time span of the refresh/startup. journalctl -u snapd.service --no-pager will do.
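If it helps, both units can be pulled in one go and limited to a time window. This is only a sketch: the function name collect_refresh_logs is made up for illustration, and the year in the example timestamps is an assumption, since the pasted journal lines only show "Jan 10".

```shell
# Print the snapd and LXD daemon journals for a given time window.
# Usage: collect_refresh_logs "<since>" "<until>"
# (collect_refresh_logs is a hypothetical helper, not part of snapd.)
collect_refresh_logs() {
    journalctl -u snapd.service -u snap.lxd.daemon.service \
        --since "$1" --until "$2" --no-pager
}

# Example (year assumed, adjust to the actual refresh time):
# collect_refresh_logs "2025-01-10 15:25" "2025-01-10 15:40" > refresh.log
```

Passing -u twice makes journalctl interleave the entries of both units by time, which makes it easier to see what snapd did right before the LXD daemon started.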

hth, gotta go now

Jan 10 15:30:55 lxc systemd[1]: /lib/systemd/system/snapd.service:23: Unknown key name 'RestartMode' in section 'Service', ignoring.
Jan 10 15:31:01 lxc snapd[1507]: daemon.go:548: gracefully waiting for running hooks
Jan 10 15:31:01 lxc snapd[1507]: daemon.go:550: done waiting for running hooks
Jan 10 15:31:04 lxc snapd[1507]: overlord.go:518: Released state lock file
Jan 10 15:31:04 lxc systemd[1]: snapd.service: Deactivated successfully.
Jan 10 15:31:04 lxc systemd[1]: snapd.service: Consumed 3min 27.017s CPU time.
Jan 10 15:31:04 lxc systemd[1]: snapd.service: Scheduled restart job, restart counter is at 1.
Jan 10 15:31:04 lxc systemd[1]: Stopped Snap Daemon.
Jan 10 15:31:04 lxc systemd[1]: snapd.service: Consumed 3min 27.017s CPU time.
Jan 10 15:31:04 lxc systemd[1]: Starting Snap Daemon...
Jan 10 15:31:05 lxc snapd[2417788]: overlord.go:274: Acquiring state lock file
Jan 10 15:31:05 lxc snapd[2417788]: overlord.go:279: Acquired state lock file
Jan 10 15:31:06 lxc snapd[2417788]: patch.go:64: Patching system state level 6 to sublevel 1...
Jan 10 15:31:06 lxc snapd[2417788]: patch.go:64: Patching system state level 6 to sublevel 2...
Jan 10 15:31:06 lxc snapd[2417788]: patch.go:64: Patching system state level 6 to sublevel 3...
Jan 10 15:31:06 lxc snapd[2417788]: daemon.go:250: started snapd/2.67 (series 16; classic) ubuntu/22.04 (amd64) linux/5.15.0-126-generic.
Jan 10 15:31:06 lxc snapd[2417788]: daemon.go:353: adjusting startup timeout by 45s (pessimistic estimate of 30s plus 5s per snap)
Jan 10 15:31:06 lxc snapd[2417788]: backends.go:58: AppArmor status: apparmor is enabled and all features are available (using snapd provided apparmor_parser)
Jan 10 15:31:08 lxc snapd[2417788]: helpers.go:235: cannot regenerate seccomp profiles
Jan 10 15:31:08 lxc snapd[2417788]: helpers.go:237: cannot compile /var/lib/snapd/seccomp/bpf/snap.lxd.hook.install.src: error: cannot parse line: cannot parse token "g:root" (line "chown - u:root g:root"): group: unknown group root
Jan 10 15:31:10 lxc systemd[1]: Started Snap Daemon.

Jan 10 15:31:08 lxc snapd[2417788]: helpers.go:237: cannot compile 
/var/lib/snapd/seccomp/bpf/snap.lxd.hook.install.src: error:
cannot parse line: cannot parse token "g:root"
(line "chown - u:root g:root"): group: unknown group root

Can you run grep root: /etc/group and then getent group root and paste the outputs?

thanks; i found the groups file is botched

root@lxd ~ # getent group root
error writing group entry: Invalid argument
root@lxd ~ # groups
root
root@lxd ~ # grep root /etc/group
root:x:0:daemon:x:1:

Looks like a missing newline. This should be:

root:x:0:
daemon:x:1:
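To spot this kind of damage elsewhere in the file, a rough check along these lines might help. It assumes the plain group(5) layout of four colon-separated fields (name:password:GID:members) and does not handle NIS-style "+" entries; the function name check_group_file is made up for this example.

```shell
# Print every line of a group(5)-style file that does not have exactly
# four colon-separated fields or whose GID field is not numeric.
# Returns non-zero if at least one such line was found.
# (check_group_file is a hypothetical helper name.)
check_group_file() {
    awk -F: '
        BEGIN { bad = 0 }
        NF != 4 || $3 !~ /^[0-9]+$/ {
            printf "%s:%d: %s\n", FILENAME, NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Example:
# check_group_file /etc/group
```

The merged line root:x:0:daemon:x:1: splits into seven fields, so it gets reported. On most distributions grpck -r (from the shadow utilities) performs a more thorough read-only check.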

It's a bit weird: the group file has been botched since at least 2023, and the getent binary did not change (all of them, from the snaps and the host, are from May 2024). The working .src file is mostly the same as the one on a Debian system with snapd 2.67. So I assumed the seccomp compiler changed, but couldn't verify that thesis.

On both the Ubuntu system with 2.66 and a Debian system with 2.67, the chown line in the install hook src is chown - -1 g:root. It's the first line with - -1 g:root; before that there are several lines with - u:root g:root. I did not find relevant changes in the snapd/cmd/snap-seccomp diff 2.66…2.67.

I assume the error message is bogus? Or did the Debian and the Ubuntu systems generate different src files for snapd 2.67?

edit2: I failed to compile the current snap.lxd.hook.install.src (Ubuntu, snapd 2.66.1+22.04) with both /snap/snapd/23258/usr/lib/snapd/snap-seccomp and /snap/snapd/23545/usr/lib/snapd/snap-seccomp while the broken /etc/group file was in place. So this seems to have been a problem before; I will need to check next week whether this resulted in a failure in the end or not.
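For anyone who wants to repeat that comparison, here is a sketch of how it could be scripted. It assumes snap-seccomp is invoked as "snap-seccomp compile <src> <out>"; the function name try_compile and the SNAP_MOUNT_DIR override are made up for this example and are not something snapd itself reads.

```shell
# Try to compile one seccomp source file with the snap-seccomp binary
# shipped in each of the given snapd snap revisions.
# Usage: try_compile <src-file> <revision>...
# (try_compile and SNAP_MOUNT_DIR are hypothetical names for this sketch.)
try_compile() {
    src=$1
    shift
    for rev in "$@"; do
        bin="${SNAP_MOUNT_DIR:-/snap}/snapd/$rev/usr/lib/snapd/snap-seccomp"
        tmp_out=$(mktemp)
        # Write the compiled profile to a scratch file so the real
        # profiles are never touched.
        if "$bin" compile "$src" "$tmp_out"; then
            echo "rev $rev: ok"
        else
            echo "rev $rev: failed"
        fi
        rm -f "$tmp_out"
    done
}

# Example, using the file and revisions mentioned in this thread:
# try_compile /var/lib/snapd/seccomp/bpf/snap.lxd.hook.install.src 23258 23545
```

Any error text from snap-seccomp itself still goes to the terminal, so the "unknown group" message would show up next to the failing revision.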

I will probably try to move to 2.67 again next week and report success/failure and have a look at the resulting src file

Since 2.67, snapd relies on getent for information on users and groups. Previously it used a built-in Go standard library feature which interacted with glibc on the host (and had a fallback to a custom parser in Go). However, this is ultimately unreliable if the host uses an elaborate NSS configuration.

In any case, if the content you pasted above is what you had in /etc/group, then it is clearly invalid and getent correctly complains about that. The correct format is described in https://www.man7.org/linux/man-pages/man5/group.5.html and it only worked before by accident.
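To check up front whether a profile source would trip over this, one could look up every g:<name> token with getent by hand. The following is only a sketch of that idea, not how snapd implements the lookup; both function names are made up for this example.

```shell
# List the distinct group names referenced as g:<name> tokens in a
# seccomp profile source file.
# (list_profile_groups and check_profile_groups are hypothetical helpers.)
list_profile_groups() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^g:/) print substr($i, 3) }' "$1" | sort -u
}

# Report every referenced group that getent cannot resolve.
# Returns non-zero if at least one lookup failed.
check_profile_groups() {
    status=0
    for group in $(list_profile_groups "$1"); do
        if ! getent group "$group" >/dev/null 2>&1; then
            echo "unresolvable group: $group"
            status=1
        fi
    done
    return "$status"
}

# Example:
# check_profile_groups /var/lib/snapd/seccomp/bpf/snap.lxd.hook.install.src
```

With the broken group file from this thread, getent group root fails, so root would be reported as unresolvable.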


Seems that already happened in 2.66 (I can see related changes in the git log, and the error was already happening on 2024-11-27 in the snapd.service journal). I guess either 2.67 deletes the old files (and 2.66.1 did not) or the service had not been restarted since then.

no argument about that.

and thanks a lot for your help, it's very much appreciated

Ah yes, sorry, I was thinking about a different chunk of this work. The bulk of this landed for 2.66. We landed a fix in 2.67 which ensures that the snap command can locate getent even if PATH is unset.