Cannot start some applications after upgrade to snapd-2.53.1 on OpenSuse Tumbleweed

Hi,

I am new here, so please excuse if I am missing important information on this request.

I am using snapd on OpenSuse/Tumbleweed on my Dell notebook. Mainly to run Teams-For-Linux and Postman. Until two days ago everything was great. Then during a “zypper dup” snapd was upgraded to version 2.53.1:

mknoblau@c3b4m33lx:~> snap version
snap                 2.53-1.3
snapd                2.53-1.3
series               16
opensuse-tumbleweed  20211011
kernel               5.13.7-1-default

Since then both teams-for-linux and postman do not start anymore. The small Hello-World snap works fine. Symptom of the failing snaps:

mknoblau@c3b4m33lx:~> /snap/bin/hello-world
Hello World!
mknoblau@c3b4m33lx:~> /snap/bin/teams-for-linux 
cannot load program:

: Invalid argument
mknoblau@c3b4m33lx:~> 

I am not sure whether this is a snapd issue, or something else that happened during the “zypper dup”, so I am a bit at a loss here. Folks at Tumbleweed are silent about it.

Anyone seen this recently? Any hints? Any advice on how to debug this?

Thanks a lot in advance
Martin

PS: and of course I did that upgrade on both my working setups … :frowning: :face_with_symbols_over_mouth:

After some looking on the forum, I found some advised debug commands. So here is the output from them:

mknoblau@c3b4m33lx:~> snap debug confinement
partial

mknoblau@c3b4m33lx:~> snap debug sandbox-features
apparmor:             kernel:caps kernel:domain kernel:file kernel:mount kernel:namespaces kernel:network_v8 kernel:policy kernel:ptrace kernel:query kernel:rlimit kernel:signal parser:qipcrtr-socket parser:unsafe policy:default support-level:partial
confinement-options:  classic devmode
dbus:                 mediated-bus-access
kmod:                 mediated-modprobe
mount:                layouts mount-namespace per-snap-persistency per-snap-profiles per-snap-updates per-snap-user-profiles stale-base-invalidation
seccomp:              bpf-actlog bpf-argument-filtering kernel:allow kernel:errno kernel:kill_process kernel:kill_thread kernel:log kernel:trace kernel:trap kernel:user_notif
udev:                 tagging

mknoblau@c3b4m33lx:~> SNAPD_DEBUG=1 SNAP_CONFINE_DEBUG=1 snap run postman
2021/10/14 11:41:05.341540 tool_linux.go:68: DEBUG: re-exec not supported on distro "opensuse-tumbleweed" yet
2021/10/14 11:41:05.350544 cmd_run.go:425: DEBUG: SELinux not enabled
2021/10/14 11:41:05.351194 tracking.go:45: DEBUG: creating transient scope snap.postman.postman
2021/10/14 11:41:05.352266 tracking.go:185: DEBUG: using session bus
2021/10/14 11:41:05.354628 tracking.go:317: DEBUG: created transient scope as object: /org/freedesktop/systemd1/job/211
2021/10/14 11:41:05.357021 tracking.go:145: DEBUG: waited 2.351181ms for tracking
DEBUG: umask reset, old umask was  022
DEBUG: security tag: snap.postman.postman
DEBUG: executable:   /usr/lib/snapd/snap-exec
DEBUG: confinement:  non-classic
DEBUG: base snap:    core18
DEBUG: ruid: 15833, euid: 0, suid: 0
DEBUG: rgid: 111, egid: 111, sgid: 111
DEBUG: apparmor label on snap-confine is: /usr/libexec/snapd/snap-confine
DEBUG: apparmor mode is: enforce
DEBUG: creating lock directory /run/snapd/lock (if missing)
DEBUG: set_effective_identity uid:0 (change: no), gid:0 (change: yes)
DEBUG: opening lock directory /run/snapd/lock
DEBUG: set_effective_identity uid:0 (change: no), gid:111 (change: yes)
DEBUG: opening lock file: /run/snapd/lock/.lock
DEBUG: set_effective_identity uid:0 (change: no), gid:0 (change: yes)
DEBUG: set_effective_identity uid:0 (change: no), gid:111 (change: yes)
DEBUG: sanity timeout initialized and set for 30 seconds
DEBUG: acquiring exclusive lock (scope (global), uid 0)
DEBUG: sanity timeout reset and disabled
DEBUG: ensuring that snap mount directory is shared
DEBUG: unsharing snap namespace directory
DEBUG: set_effective_identity uid:0 (change: no), gid:0 (change: yes)
DEBUG: set_effective_identity uid:0 (change: no), gid:111 (change: yes)
DEBUG: releasing lock 5
DEBUG: opened snap-update-ns executable as file descriptor 5
DEBUG: opened snap-discard-ns executable as file descriptor 6
DEBUG: creating lock directory /run/snapd/lock (if missing)
DEBUG: set_effective_identity uid:0 (change: no), gid:0 (change: yes)
DEBUG: opening lock directory /run/snapd/lock
DEBUG: set_effective_identity uid:0 (change: no), gid:111 (change: yes)
DEBUG: opening lock file: /run/snapd/lock/postman.lock
DEBUG: set_effective_identity uid:0 (change: no), gid:0 (change: yes)
DEBUG: set_effective_identity uid:0 (change: no), gid:111 (change: yes)
DEBUG: sanity timeout initialized and set for 30 seconds
DEBUG: acquiring exclusive lock (scope postman, uid 0)
DEBUG: sanity timeout reset and disabled
DEBUG: initializing mount namespace: postman
DEBUG: setting up device cgroup
DEBUG: libudev has current tags support
DEBUG: adjusting memlock limit to 524288
DEBUG: set_effective_identity uid:0 (change: no), gid:0 (change: yes)
DEBUG: set_effective_identity uid:0 (change: no), gid:111 (change: yes)
DEBUG: cgroup /sys/fs/cgroup//user.slice/user-15833.slice/user@15833.service/app.slice/snap.postman.postman.f4baafb1-d268-4871-84d8-70742fdbbc46.scope opened at 8
DEBUG: set_effective_identity uid:0 (change: no), gid:0 (change: yes)
DEBUG: get bpf object at path /sys/fs/bpf/snap/snap_postman_postman
DEBUG: set_effective_identity uid:0 (change: no), gid:111 (change: yes)
DEBUG: found existing device map
DEBUG: get next key for map 9
DEBUG: found 0 existing entries in devices map
DEBUG: set_effective_identity uid:0 (change: no), gid:0 (change: yes)
DEBUG: load program of type 0xf, 33 instructions
cannot load program:

: Invalid argument
mknoblau@c3b4m33lx:~>

looks like a problem with the cgroupsv2 work recently added, one for @mborzecki when he is back I think

There’s something off about the kernel you’re running:

Whereas I have:

snap                 2.53-1.4
snapd                2.53-1.4
series               16
opensuse-tumbleweed  20211012
kernel               5.14.9-1-default

Pretty sure the previous snapshot had 5.14.9 kernel too.

What arch is that? My openSUSE laptop is x86_64 and postman, ohmygiraffe, spotify all work fine. The CI testing is also done on the same architecture and none report problems. I’ve also tried to reproduce the conditions you have, where there are no entries in the device map, and the apps start as usual.

I think I’d start by running zypper dup and making sure that the kernel is updated. I only know of one distribution that uses 5.13 where cgroups v2 are enabled and devices are filtered using BPF, and it’s Ubuntu 21.10. I’m also assuming that the kernel was patched significantly. My other box, which runs Arch, has 5.14.12, and Debian Sid in our testing is on 5.14.9.
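For anyone comparing setups, here is a minimal sketch for reporting whether the unified cgroup v2 hierarchy is mounted, together with the running kernel. It assumes GNU `stat` is available; the function name `cgroup_mode` is made up for this example and is not part of snapd.

```shell
#!/bin/sh
# Map the filesystem type mounted at /sys/fs/cgroup to a cgroup mode.
# cgroup2fs means the unified (v2) hierarchy; tmpfs usually means v1 or hybrid.
cgroup_mode() {
    case $1 in
        cgroup2fs) echo "v2 (unified)" ;;
        tmpfs)     echo "v1 or hybrid" ;;
        *)         echo "unknown" ;;
    esac
}

# Ask GNU stat for the filesystem type; fall back to "none" if that fails.
fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo none)
printf 'cgroup mode: %s\n' "$(cgroup_mode "$fstype")"
printf 'kernel:      %s\n' "$(uname -r)"
```

If this reports v2, snap-confine is expected to take the BPF-based device filtering path discussed here.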

The next thing to check is apparmor. On openSUSE the audit messages are actually consumed by auditd, so please attach the output of sudo ausearch -m AVC.

Although the error message suggests the problem may be elsewhere, and the error log obtained from the kernel is empty :confused:

Hi Maciej,

thanks for coming back to me. The kernel on that laptop is self built. And actually I am now on 5.14.12 as well. The laptop is a Dell Precision 7540 (x86_64) - nothing exotic. The installation is up to date with the exception of the kernel.

mknoblau@c3b4m33lx:~> snap version
snap                 2.53-1.4
snapd                2.53-1.4
series               16
opensuse-tumbleweed  20211012
kernel               5.14.12-1-default
mknoblau@c3b4m33lx:~> uname -a
Linux c3b4m33lx 5.14.12-1-default #1 SMP PREEMPT Thu Oct 14 09:21:30 CEST 2021 x86_64 x86_64 x86_64 GNU/Linux

So personally I doubt the kernel is at fault, but as a last resort I certainly can switch back to the distro kernel.

But then there is apparmor:

----
time->Fri Oct 15 08:58:29 2021
type=AVC msg=audit(1634281109.652:310): apparmor="DENIED" operation="capable" profile="/usr/libexec/snapd/snap-confine" pid=18052 comm="snap-confine" capability=39  capname="bpf"
----
time->Fri Oct 15 08:58:29 2021
type=AVC msg=audit(1634281109.653:311): apparmor="DENIED" operation="capable" profile="/usr/libexec/snapd/snap-confine" pid=18052 comm="snap-confine" capability=12  capname="net_admin"

Seems something is missing here. Let me investigate

Cheers Martin

Let’s see if this may be a problem. Can you edit /etc/apparmor.d/usr.libexec.snapd.snap-confine, and add capability bpf in there? You can add it after capability sys_admin, so that the content is like:

    ....
    capability sys_admin,
    capability bpf,
    capability dac_read_search,
    capability dac_override,
    ...

Then run systemctl restart apparmor and try to start the snap again.
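Before restarting, it may help to confirm that the edit actually landed in the profile. A rough sketch, assuming the profile path mentioned above and the one-capability-per-rule form shown in the snippet; `profile_has_capability` is a hypothetical helper, not an AppArmor tool.

```shell
#!/bin/sh
# Return success if the profile file contains a rule of the form
# "capability <name>," on its own line. Rules that list several
# capabilities in one statement are not recognised by this sketch.
profile_has_capability() {
    grep -Eq "^[[:space:]]*capability[[:space:]]+$2[[:space:]]*," "$1" 2>/dev/null
}

profile=/etc/apparmor.d/usr.libexec.snapd.snap-confine
for cap in bpf net_admin; do
    if profile_has_capability "$profile" "$cap"; then
        echo "capability $cap: present"
    else
        echo "capability $cap: not found"
    fi
done
```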

OK, some new effects … So I added “bpf” and restarted apparmor. That made the denied for “bpf” go away, but no change in problem. Same after adding “net_admin”. No more denied messages, but postman/teams-for-linux still fail to start.

So I finally decided to install the current distro kernel from tumbleweed (5.14.9):

mknoblau@c3b4m33lx:~> snap version
snap                 2.53-1.4
snapd                2.53-1.4
series               16
opensuse-tumbleweed  20211012
kernel               5.14.9-1-default

That made the difference. “postman” now comes up. So, something in my kernel configuration seems to be missing/odd. Is there anything that has to be enabled/defined for snap to run? Or should not be enabled/defined? If you have cycles, I am happy to send you my configuration.

Unfortunately this is not the end of the story, because “teams-for-linux” still fails (never failed before the system upgrade four days ago) with the following symptom:

mknoblau@c3b4m33lx:~> teams-for-linux 
/snap/teams-for-linux/182/teams-for-linux: error while loading shared libraries: libxshmfence.so.1: cannot open shared object file: No such file or directory

That library is installed in /usr/lib64. And the weird thing: if I start the executable directly, it comes up !!!

So, one final last guess: remove all snaps and reinstall. Now it works !!!

Check if CONFIG_CGROUP_BPF=y is in your configuration, and also the more general CONFIG_BPF=y and CONFIG_BPF_SYSCALL=y, although those are likely enabled. I suggest taking a look at https://build.opensuse.org/package/show/Kernel:stable/kernel-source, specifically the config file, and looking for BPF and cgroups.
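A small sketch for checking those options on the running kernel. It assumes the config is exposed either as /proc/config.gz (which requires CONFIG_IKCONFIG_PROC) or as /boot/config-$(uname -r); a self-built kernel may provide neither, in which case point `kconfig_value` at the .config in the build tree. Both function names are invented for this example.

```shell
#!/bin/sh
# Print the value of a kernel config option from a config file,
# or "unset" when the option is absent or commented out.
kconfig_value() {
    opt=$1
    file=$2
    case $file in
        *.gz) reader="gzip -dc" ;;
        *)    reader="cat" ;;
    esac
    $reader "$file" | awk -v o="$opt" '
        index($0, o "=") == 1 { v = substr($0, length(o) + 2) }
        END { print (v == "" ? "unset" : v) }'
}

# Print the first readable config source, or fail if none is found.
find_kconfig() {
    for f in /proc/config.gz "/boot/config-$(uname -r)"; do
        if [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

if cfg=$(find_kconfig); then
    for opt in CONFIG_BPF CONFIG_BPF_SYSCALL CONFIG_CGROUP_BPF; do
        printf '%s=%s\n' "$opt" "$(kconfig_value "$opt" "$cfg")"
    done
else
    echo "no kernel config found in the usual places" >&2
fi
```

Any line printed as `unset` is a candidate for the missing piece.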

Those are different things. First, a library in /usr/lib64 on the host is not the same as that library being inside the snap. A snap runs inside its own mount namespace (think chroot on steroids). Secondly, AFAICT teams-for-linux is a strictly confined snap, so the fact that you ran the binary directly and it worked is just an accident, as it’s not guaranteed to work this way (that’s what snaps are for), and it’s unconfined at this point. IIRC my kids use teams snap which is published directly by Microsoft Teams team.
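To illustrate the first point, a quick sketch that looks for the library inside the snap's own tree rather than on the host. The `current` symlink is the usual snap layout; treating core18 as the base of teams-for-linux is an assumption here (the debug log earlier only shows it for postman), and `find_lib` is a hypothetical helper.

```shell
#!/bin/sh
# List files whose names start with the given library name under a tree.
# The trailing slash makes find descend through a symlink such as "current".
find_lib() {
    find "$1/" -name "$2*" 2>/dev/null || true
}

# Trees to search: the snap itself and its assumed base snap.
for tree in /snap/teams-for-linux/current /snap/core18/current; do
    hits=$(find_lib "$tree" libxshmfence.so)
    if [ -n "$hits" ]; then
        printf '%s:\n%s\n' "$tree" "$hits"
    else
        printf '%s: libxshmfence.so not found\n' "$tree"
    fi
done
```

A copy in /usr/lib64 on the host will not show up here, which is why the confined app cannot load it.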

Hi Maciej,

so in the end it was CONFIG_CGROUP_BPF=y missing from my configuration. Adding it and rebuilding my kernel makes “teams-for-linux” and “postman” come up OK again. So in the end it was my own fault :face_with_symbols_over_mouth: Thanks a lot for your support.

Why I had to reinstall the snaps is not clear to me. I repeated the kernel exercise on my second system and there it was not necessary to reinstall the snaps. But I am not complaining.

So, just to conclude this, on the second system I checked the apparmor messages. There were some “DENIED”, but they do not seem to hurt:

----
time->Fri Oct 15 12:19:33 2021
type=AVC msg=audit(1634293173.968:283): apparmor="DENIED" operation="capable" profile="/usr/libexec/snapd/snap-confine" pid=14803 comm="snap-confine" capability=39  capname="bpf"
----
time->Fri Oct 15 12:19:33 2021
type=AVC msg=audit(1634293173.968:284): apparmor="DENIED" operation="capable" profile="/usr/libexec/snapd/snap-confine" pid=14803 comm="snap-confine" capability=12  capname="net_admin"
----
time->Fri Oct 15 12:19:33 2021
type=AVC msg=audit(1634293173.968:285): apparmor="DENIED" operation="capable" profile="/usr/libexec/snapd/snap-confine" pid=14803 comm="snap-confine" capability=38  capname="perfmon"
----
time->Fri Oct 15 12:19:34 2021
type=AVC msg=audit(1634293174.106:287): apparmor="DENIED" operation="capable" profile="/usr/libexec/snapd/snap-confine" pid=14803 comm="snap-confine" capability=4  capname="fsetid"
----

The complaint for “fsetid” only came once. The other three come every time I start “teams-for-linux”. But as I said, they do not seem to hurt.
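In case it is useful to others, a rough sketch for counting such denials per capability. It only assumes the record format shown above; `summarize_denials` is a name made up for this example.

```shell
#!/bin/sh
# Count AppArmor capability denials per capability name.
# Reads audit records on stdin and prints "<count> <capname>" lines.
summarize_denials() {
    sed -n '/apparmor="DENIED"/s/.*capname="\([^"]*\)".*/\1/p' \
        | sort | uniq -c | awk '{ print $1, $2 }'
}

# Sample run on two records copied from the output above.
# On a live system (assuming auditd is running) one could instead use:
#   sudo ausearch -m AVC | summarize_denials
summarize_denials <<'EOF'
type=AVC msg=audit(1634293173.968:283): apparmor="DENIED" operation="capable" profile="/usr/libexec/snapd/snap-confine" pid=14803 comm="snap-confine" capability=39  capname="bpf"
type=AVC msg=audit(1634293173.968:284): apparmor="DENIED" operation="capable" profile="/usr/libexec/snapd/snap-confine" pid=14803 comm="snap-confine" capability=12  capname="net_admin"
EOF
```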

Have a good weekend Martin
