Snapd issue - auto update? restart failure? watchdog? daily apt download

ish4dowfiend · August 9, 2022, 7:27am

Firstly, I’d like to apologise for a few things. I tried to add the full logs, but apparently they contain links and I can’t post more than 2 as I’m a new user. Apologies too for any formatting issues and for the vague title; I’m quite unsure what my issue actually is. This happened the other night, and again this morning, at around the same time.

I’m running a low-resource Oracle E2 instance (1 OCPU and 1 GB RAM). It’s enough for what I need, but sometimes the following occurs: my running node processes stop and I can no longer SSH into the instance. The only remedy is to reboot the instance via the Oracle console.

My first thought was that it’s happening during the apt daily update, as it always follows shortly after that, but the unattended-upgrades logs show no upgrades on either occurrence, so I’m unsure.

Once it reaches “Failed to start Snap Daemon” and the scheduled restart, it loops through this indefinitely until I reboot my server remotely.

I’m really confused as to what’s happening and why. Snapd starts no problem on reboot, and manually. If anyone has the slightest idea and can point me in the right direction that would be greatly appreciated.

I can’t provide full logs, but I’ll provide what seems important to me. If further info would help, please let me know:

Aug  9 02:08:48 instance-20220802-1400 systemd[1]: Starting Daily apt download activities...
Aug  9 02:14:38 instance-20220802-1400 systemd[1]: snapd.service: Watchdog timeout (limit 5min)!

Aug  9 02:15:11 instance-20220802-1400 systemd[1]: snapd.service: Killing process 880 (snapd) with signal SIGABRT.
Aug  9 02:15:13 instance-20220802-1400 snapd[880]: SIGABRT: abort
Aug  9 02:15:16 instance-20220802-1400 snapd[880]: PC=0x55f2d601e35d m=2 sigcode=0
Aug  9 02:15:19 instance-20220802-1400 snapd[880]: goroutine 0 [idle]:

This is followed by a load of goroutine dumps.

In between those we get this:

Aug 09 02:15:11 instance-20220802-1400 systemd[1]: snapd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT

Then, after a couple more goroutines, this:

Aug 09 02:15:11 instance-20220802-1400 systemd[1]: snapd.service: Failed with result 'watchdog'.

The goroutines then continue

Followed by:

Aug  9 02:18:38 instance-20220802-1400 snapd[880]: rax    0xfffffffffffffffc
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: rbx    0x4e20 
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: rcx    0x55f2d601e35d
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: rdx    0x0
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: rdi    0x7fddcc6b4d38
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: rsi    0x0
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: rbp    0x7fddcc6b4d48
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: rsp    0x7fddcc6b4d38
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: r8     0xc00025fc80
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: r9     0x0
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: r10    0x7ffdd9570080
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: r11    0x212
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: r12    0x7fddcc6b5640
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: r13    0x16
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: r14    0x7fddce9be850
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: r15    0x7ffdd95127b0
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: rip    0x55f2d601e35d
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: rflags 0x212
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: cs     0x33
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: fs     0x0
Aug  9 02:18:38 instance-20220802-1400 snapd[880]: gs     0x0
Aug  9 02:18:38 instance-20220802-1400 systemd[1]: snapd.service: Consumed 1min 1.143s CPU time.
Aug  9 02:18:41 instance-20220802-1400 systemd[1]: Starting Snap Daemon...
Aug  9 02:18:44 instance-20220802-1400 systemd[1]: snapd.service: start operation timed out. Terminating.
Aug  9 02:18:47 instance-20220802-1400 systemd[1]: snapd.service: Failed with result 'timeout'.
Aug  9 02:18:50 instance-20220802-1400 systemd[1]: Failed to start Snap Daemon.
Aug  9 02:18:52 instance-20220802-1400 systemd[1]: snapd.service: Scheduled restart job, restart counter is at 2.
Aug  9 02:18:55 instance-20220802-1400 systemd[1]: Stopped Snap Daemon.
Aug  9 02:18:57 instance-20220802-1400 systemd[1]: Starting Snap Daemon...
Aug  9 02:19:00 instance-20220802-1400 systemd[1]: snapd.service: start operation timed out. Terminating.
Aug  9 02:19:03 instance-20220802-1400 systemd[1]: snapd.service: Failed with result 'timeout'.
Aug  9 02:19:04 instance-20220802-1400 systemd[1]: Failed to start Snap Daemon.
Aug  9 02:19:05 instance-20220802-1400 systemd[1]: snapd.service: Scheduled restart job, restart counter is at 3.
Aug  9 02:19:05 instance-20220802-1400 systemd[1]: Stopped Snap Daemon.
Aug  9 02:19:05 instance-20220802-1400 systemd[1]: Starting Snap Daemon...
Aug  9 02:19:34 instance-20220802-1400 snapd[11470]: AppArmor status: apparmor is enabled and all features are available
Aug  9 02:19:50 instance-20220802-1400 systemd[1]: snapd.service: start operation timed out. Terminating.
Aug  9 02:19:52 instance-20220802-1400 systemd[1]: snapd.service: Failed with result 'timeout'.
Aug  9 02:19:53 instance-20220802-1400 systemd[1]: Failed to start Snap Daemon.

This then keeps looping infinitely until I reboot the system.
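To make the loop easier to see at a glance, here is a minimal Python sketch that summarises the snapd events from log lines in the format pasted above. The function name `summarize` and the `year` parameter are made up for illustration (the log lines carry no year), and the matched strings are just the messages that appear in these excerpts, so treat it as a rough aid rather than a proper diagnostic tool.

```python
import re
from datetime import datetime

# Matches the syslog-style prefix seen in the excerpts, e.g.
# "Aug  9 02:14:38 instance-20220802-1400 systemd[1]: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ [^:]+: (?P<msg>.*)$"
)
RESTART_RE = re.compile(
    r"snapd\.service: Scheduled restart job, restart counter is at (\d+)"
)


def summarize(lines, year=2022):
    """Summarise snapd watchdog/restart events from syslog-style lines.

    `year` is an assumption supplied by the caller; the timestamps in the
    log do not include one. Month names are parsed with %b, which expects
    English abbreviations under the default C/English locale.
    """
    apt_start_at = None
    watchdog_at = None
    start_timeouts = 0
    max_restart_counter = 0

    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # skip blank lines, register dumps without a prefix, etc.
        msg = match["msg"]
        when = datetime.strptime(f"{year} {match['ts']}", "%Y %b %d %H:%M:%S")

        if "Starting Daily apt download activities" in msg:
            apt_start_at = when
        elif "snapd.service: Watchdog timeout" in msg and watchdog_at is None:
            watchdog_at = when
        elif "snapd.service: start operation timed out" in msg:
            start_timeouts += 1
        else:
            restart = RESTART_RE.search(msg)
            if restart:
                max_restart_counter = max(max_restart_counter, int(restart.group(1)))

    gap = None
    if apt_start_at is not None and watchdog_at is not None:
        gap = (watchdog_at - apt_start_at).total_seconds()

    return {
        "apt_to_watchdog_seconds": gap,
        "start_timeouts": start_timeouts,
        "max_restart_counter": max_restart_counter,
    }
```

Fed the lines from the excerpt above, this reports the gap between the apt download starting and the first watchdog timeout, how many start attempts timed out, and the highest restart counter seen. On the instance itself it could be pointed at the system log, assuming that is where these lines came from.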

Comparing this to the previous night, when I had no such issue, I can see that on nights where it didn’t occur, the apt daily download reports that it completed, whereas it didn’t on the nights of the errors.

On a normal night, that is later followed by other timed services, including one for a snap application:

Aug  8 02:32:12 instance-20220802-1400 systemd[1]: Starting Service for snap application certbot.renew...
Aug  8 02:32:22 instance-20220802-1400 systemd[1]: snap.certbot.renew.service: Deactivated successfully.
Aug  8 02:32:22 instance-20220802-1400 systemd[1]: Finished Service for snap application certbot.renew.

This obviously didn’t occur this morning when it hit the error. So my hunch remains that it is to do with the apt daily download, but I can’t be certain. Is this due to lack of resources? Again, any ideas on the cause, or how to fix it, would be greatly appreciated.
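One rough way to look into the resources question is to check how much memory is actually available around the time of the failures. The Python sketch below parses text in the `/proc/meminfo` format; the helper names `parse_meminfo` and `memory_pressure` are hypothetical, and the 100,000 kB threshold is an arbitrary assumption rather than a known limit for snapd.

```python
def parse_meminfo(text):
    """Return a dict of field name -> integer value from /proc/meminfo-style text.

    Most fields are reported in kB; a few (such as HugePages_Total) are
    plain counts with no unit.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_pressure(values, min_available_kb=100_000):
    """Flag low available memory and missing swap.

    min_available_kb is an illustrative threshold (an assumption), not a
    documented requirement of snapd or apt.
    """
    available = values.get("MemAvailable", 0)
    swap_total = values.get("SwapTotal", 0)
    return {
        "available_kb": available,
        "has_swap": swap_total > 0,
        "low": available < min_available_kb,
    }
```

On a Linux instance, the input would be the contents of `/proc/meminfo`, for example `memory_pressure(parse_meminfo(open("/proc/meminfo").read()))`. A result with `low` set and no swap would at least be consistent with the resource theory, though it would not prove it.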

scubadrew · March 15, 2023, 5:17pm

I’m having this issue on Ubuntu 20.04. Did you ever find a solution to this?

ish4dowfiend · March 15, 2023, 6:07pm

Hey, it was a while back, but if I remember correctly it was just an issue with resources. Fortunately it was a pretty fresh server, so I was able to move over to a different server with more RAM, and I didn’t come across the issue again.

ernestl · March 16, 2023, 9:01am

Potentially related to: Snapd killed by watchdog in VM with low specs