Ubuntu Core 22 snapd fails to start

Dear all,

Suddenly my Snap Daemon stopped starting, and I don’t know how to debug. The logs repeatedly look like this:

Feb 15 15:16:53 ubuntu systemd[1]: Starting Snap Daemon...
Feb 15 15:16:53 ubuntu snapd[2900]: overlord.go:271: Acquiring state lock file
Feb 15 15:16:53 ubuntu snapd[2900]: overlord.go:276: Acquired state lock file
Feb 15 15:18:23 ubuntu systemd[1]: snapd.service: start operation timed out. Terminating.
Feb 15 15:19:53 ubuntu systemd[1]: snapd.service: State 'stop-sigterm' timed out. Killing.
Feb 15 15:19:53 ubuntu systemd[1]: snapd.service: Killing process 2900 (snapd) with signal SIGKILL.
Feb 15 15:19:53 ubuntu systemd[1]: snapd.service: Main process exited, code=killed, status=9/KILL
Feb 15 15:19:53 ubuntu systemd[1]: snapd.service: Failed with result 'timeout'.
Feb 15 15:19:53 ubuntu systemd[1]: Failed to start Snap Daemon.
Feb 15 15:19:53 ubuntu systemd[1]: snapd.service: Scheduled restart job, restart counter is at 3.
Feb 15 15:19:53 ubuntu systemd[1]: Stopped Snap Daemon.

I don’t know how to debug this.

The output of systemctl status is:

● ubuntu
    State: degraded
     Jobs: 1 queued
   Failed: 4 units
    Since: Thu 2024-02-15 15:10:47 UTC; 1h 5min ago
   CGroup: /
           ├─user.slice 
           │ └─user-1000.slice 
           │   ├─user@1000.service …
           │   │ └─init.scope 
           │   │   ├─5086 /lib/systemd/systemd --user
           │   │   └─5087 (sd-pam)
           │   └─session-1.scope 
           │     ├─5082 sshd: myuser [priv]
           │     ├─5101 sshd: myuser@pts/0
           │     ├─5103 -bash
           │     ├─8803 sudo systemctl status
           │     ├─8804 sudo cat
           │     ├─8805 sudo cat
           │     ├─8806 sudo systemctl status
           │     ├─8807 cat
           │     └─8808 systemctl status
           ├─init.scope 
           │ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 30
           └─system.slice 
             ├─systemd-udevd.service 
             │ └─563 /lib/systemd/systemd-udevd
             ├─snap.azure-iot-identity.identityd.service 
             │ ├─8665 /bin/sh /snap/azure-iot-identity/21/snap/command-chain/launch-wrapper.sh /snap/azure-iot-identity/21/libexec/aziot-identityd
             │ └─8683 snapctl get log-level
             ├─snap.alsa-utils.alsa-restore.service 
             │ ├─1510 /bin/bash /snap/alsa-utils/68/bin/alsa-stated
             │ └─1800 sleep infinity
             ├─snap.pulseaudio.pulseaudio.service 
             │ └─1515 /snap/pulseaudio/65/usr/bin/pulseaudio --exit-idle-time=-1 --disallow-exit=yes --system -F /snap/pulseaudio/65/etc/pulse/default.pa -p /snap/pulseaudio/65/usr/lib/pulse-8.0/modules -n
             ├─snap.ubuntu-frame.daemon.service 
             │ ├─8742 /bin/bash /snap/ubuntu-frame/8499/bin/run-daemon /snap/ubuntu-frame/8499/bin/run-frame /snap/ubuntu-frame/8499/bin/graphics-core22-wrapper /snap/ubuntu-frame/8499/usr/local/bin/frame
             │ └─8761 snapctl get display
             ├─wpa_supplicant.service 
             │ └─1343 /sbin/wpa_supplicant -u -s -O /run/wpa_supplicant
             ├─systemd-journald.service 
             │ └─546 /lib/systemd/systemd-journald
             ├─ssh.service 
             │ └─1523 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
             ├─system-serial\x2dconsole\x2dconf.slice 
             │ └─serial-console-conf@ttyS0.service 
             │   └─1807 /bin/bash /usr/share/subiquity/console-conf-wrapper --serial
             ├─snapd.service 
             │ └─8568 /snap/snapd/20671/usr/lib/snapd/snapd
             ├─snap.azure-iot-edge.docker-proxy.service 
             │ ├─1512 /bin/sh /snap/azure-iot-edge/26/bin/socat.sh
             │ └─1729 /snap/azure-iot-edge/26/usr/bin/socat UNIX-LISTEN:/var/snap/azure-iot-edge/common/docker-proxy.sock,reuseaddr,fork,user=snap_aziotedge,group=snap_aziotedge UNIX-CONNECT:/var/run/docker.sock
             ├─system-console\x2dconf.slice 
             │ └─console-conf@tty1.service 
             │   └─1808 /bin/bash /usr/share/subiquity/console-conf-wrapper
             ├─systemd-resolved.service 
             │ └─676 /lib/systemd/systemd-resolved
             ├─snap.azure-iot-edge.aziot-edged.service 
             │ ├─1511 /bin/bash /snap/azure-iot-edge/26/snap/command-chain/handle-exit-status-153.sh /snap/azure-iot-edge/26/snap/command-chain/make-socket-directory.sh /snap/azure-iot-edge/26/snap/command-chain/drop-privileges.sh /snap/azure-iot-edge/26/usr/libexec/aziot/aziot-edged
             │ ├─1731 /bin/sh /snap/azure-iot-edge/26/snap/command-chain/drop-privileges.sh /snap/azure-iot-edge/26/usr/libexec/aziot/aziot-edged
             │ └─1745 /snap/azure-iot-edge/26/usr/libexec/aziot/aziot-edged
             ├─dbus.service 
             │ ├─1331 @dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
             │ └─4350 /usr/libexec/netplan/netplan-dbus
             ├─systemd-timesyncd.service 
             │ └─677 /lib/systemd/systemd-timesyncd
             ├─snap.wpe-webkit-mir-kiosk.restart-watcher.service 
             │ ├─1518 /bin/sh -e /snap/wpe-webkit-mir-kiosk/100/bin/watcher
             │ └─8802 sleep 1
             ├─snap.wpe-webkit-mir-kiosk.daemon.service 
             │ ├─1517 /bin/sh /snap/wpe-webkit-mir-kiosk/100/bin/wayland-launch /snap/wpe-webkit-mir-kiosk/100/bin/set-arch-triplet /snap/wpe-webkit-mir-kiosk/100/bin/gio-updater /snap/wpe-webkit-mir-kiosk/100/bin/launch-wpe
             │ └─3129 inotifywait --event create /run/user/0
             ├─systemd-logind.service 
             │ └─1340 /lib/systemd/systemd-logind
             └─snap.network-manager.networkmanager.service 
               └─1514 /snap/network-manager/873/usr/sbin/NetworkManager --config-dir=/var/snap/network-manager/873/conf.d/ --config=/var/snap/network-manager/873/NetworkManager.conf --log-level=INFO --no-daemon

Any ideas of what to look at or to try?

From that output, snapd seems to be working?

I captured the systemctl status output while it was trying to start snapd. It keeps cycling through start attempts, but it never reaches a running state (see logs).

I see, yes, that’s possible. Can you try running lsof /var/lib/snapd/state.lock? It should report which processes are holding the lock. It would be very unexpected if something other than snapd were holding that lock. Once you have the process IDs, please run cat /proc/<pid>/cgroup for each and paste all the outputs (including the lsof output).

lsof is not found in Ubuntu Core and I cannot install anything without snap xD
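Without lsof, roughly the same information can be read from /proc, since each process lists its open files as symlinks under /proc/<pid>/fd. The following is a minimal sketch, not a tested snapd procedure: the function name lock_holders is made up for illustration, and it assumes a root shell (otherwise the file descriptors of other users' processes are not readable) and a readlink that supports -f.

```shell
#!/bin/sh
# Hypothetical helper: print the PIDs of all processes that have the
# given file open, by scanning the fd symlinks under /proc.
lock_holders() {
    # Resolve the path so it matches what the kernel reports in /proc.
    target=$(readlink -f "$1") || return 1
    [ -n "$target" ] || return 1
    for fd in /proc/[0-9]*/fd/*; do
        # Each entry is a symlink to the file that descriptor refers to.
        if [ "$(readlink "$fd" 2>/dev/null)" = "$target" ]; then
            pid=${fd#/proc/}
            echo "${pid%%/*}"
        fi
    done | sort -un
}

# Example use on the device (as root):
#   for pid in $(lock_holders /var/lib/snapd/state.lock); do
#       echo "PID $pid"; cat "/proc/$pid/cgroup"
#   done
```

If this prints nothing while snapd is in its start phase, that would suggest the scan is not running with enough privileges, since the log shows snapd itself acquiring the lock.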

It is possible that I rebooted the device while it was in the middle of doing something. Maybe that caused it to lock?

Would it be possible to recover from such a lock?

I suspect there may already be an instance of snapd running. Can you run systemctl stop snapd and then ps -ef|grep [s]napd? Does snapd show up in the output?

Hey!

I wanted to try your suggestion today, but it seems like the device has failed completely… So maybe it was just a hardware failure.

Gonna try to set it up again and see what happens, thanks!