Firstly, I’d like to apologise for any issues with formatting and for a title lacking a proper description; I am quite unsure of my issue here. I tried to add the full logs, but apparently they contain links and I cannot post more than 2 as I’m a new user. This happened the other night, and again this morning, around the same time.
I’m running a low-resource (1 OCPU and 1 GB RAM) Oracle E2 instance. It’s enough for what I need, but sometimes the following occurs: my running Node processes stop, and I can no longer SSH into my instance. The only remedy is to reboot the instance via the Oracle console.
My first thought was that it’s happening during the apt daily update, as it always follows shortly after that, but the logs for unattended-upgrades show there were no upgrades on either occurrence, so I’m unsure.
Once it reaches the "Failed to start Snap Daemon" and "Scheduled restart job" messages, it loops through them indefinitely until I reboot my server remotely.
I’m really confused as to what’s happening and why. Snapd starts no problem on reboot, and manually. If anyone has the slightest idea and can point me in the right direction that would be greatly appreciated.
I can’t provide full logs, but I will provide what seems important to me. If further info would help, please let me know:
Aug 9 02:08:48 instance-20220802-1400 systemd[1]: Starting Daily apt download activities...
Aug 9 02:14:38 instance-20220802-1400 systemd[1]: snapd.service: Watchdog timeout (limit 5min)!
Aug 9 02:15:11 instance-20220802-1400 systemd[1]: snapd.service: Killing process 880 (snapd) with signal SIGABRT.
Aug 9 02:15:13 instance-20220802-1400 snapd[880]: SIGABRT: abort
Aug 9 02:15:16 instance-20220802-1400 snapd[880]: PC=0x55f2d601e35d m=2 sigcode=0
Aug 9 02:15:19 instance-20220802-1400 snapd[880]: goroutine 0 [idle]:
Followed by a load of goroutines
In between which we get this:
Aug 09 02:15:11 instance-20220802-1400 systemd[1]: snapd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Then, after a couple more goroutines, this:
Aug 09 02:15:11 instance-20220802-1400 systemd[1]: snapd.service: Failed with result 'watchdog'.
The goroutines then continue
Followed by:
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: rax 0xfffffffffffffffc
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: rbx 0x4e20
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: rcx 0x55f2d601e35d
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: rdx 0x0
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: rdi 0x7fddcc6b4d38
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: rsi 0x0
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: rbp 0x7fddcc6b4d48
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: rsp 0x7fddcc6b4d38
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: r8 0xc00025fc80
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: r9 0x0
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: r10 0x7ffdd9570080
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: r11 0x212
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: r12 0x7fddcc6b5640
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: r13 0x16
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: r14 0x7fddce9be850
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: r15 0x7ffdd95127b0
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: rip 0x55f2d601e35d
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: rflags 0x212
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: cs 0x33
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: fs 0x0
Aug 9 02:18:38 instance-20220802-1400 snapd[880]: gs 0x0
Aug 9 02:18:38 instance-20220802-1400 systemd[1]: snapd.service: Consumed 1min 1.143s CPU time.
Aug 9 02:18:41 instance-20220802-1400 systemd[1]: Starting Snap Daemon...
Aug 9 02:18:44 instance-20220802-1400 systemd[1]: snapd.service: start operation timed out. Terminating.
Aug 9 02:18:47 instance-20220802-1400 systemd[1]: snapd.service: Failed with result 'timeout'.
Aug 9 02:18:50 instance-20220802-1400 systemd[1]: Failed to start Snap Daemon.
Aug 9 02:18:52 instance-20220802-1400 systemd[1]: snapd.service: Scheduled restart job, restart counter is at 2.
Aug 9 02:18:55 instance-20220802-1400 systemd[1]: Stopped Snap Daemon.
Aug 9 02:18:57 instance-20220802-1400 systemd[1]: Starting Snap Daemon...
Aug 9 02:19:00 instance-20220802-1400 systemd[1]: snapd.service: start operation timed out. Terminating.
Aug 9 02:19:03 instance-20220802-1400 systemd[1]: snapd.service: Failed with result 'timeout'.
Aug 9 02:19:04 instance-20220802-1400 systemd[1]: Failed to start Snap Daemon.
Aug 9 02:19:05 instance-20220802-1400 systemd[1]: snapd.service: Scheduled restart job, restart counter is at 3.
Aug 9 02:19:05 instance-20220802-1400 systemd[1]: Stopped Snap Daemon.
Aug 9 02:19:05 instance-20220802-1400 systemd[1]: Starting Snap Daemon...
Aug 9 02:19:34 instance-20220802-1400 snapd[11470]: AppArmor status: apparmor is enabled and all features are available
Aug 9 02:19:50 instance-20220802-1400 systemd[1]: snapd.service: start operation timed out. Terminating.
Aug 9 02:19:52 instance-20220802-1400 systemd[1]: snapd.service: Failed with result 'timeout'.
Aug 9 02:19:53 instance-20220802-1400 systemd[1]: Failed to start Snap Daemon.
This then keeps looping infinitely until I reboot the system.
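To put a number on the loop rather than eyeballing it, the failures can be counted from the syslog lines. This is only a minimal sketch, assuming the classic syslog line layout shown above; the function name `summarise_snapd_failures` and the regexes are my own illustration, not part of any tool.

```python
import re
from collections import Counter

# Matches classic syslog lines such as:
# "Aug 9 02:18:47 host systemd[1]: snapd.service: Failed with result 'timeout'."
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<proc>[^:\[]+)(?:\[(?P<pid>\d+)\])?:\s+(?P<msg>.*)$"
)
RESULT_RE = re.compile(r"snapd\.service: Failed with result '(?P<result>\w+)'")


def summarise_snapd_failures(lines):
    """Count snapd.service failure results and note the first and last timestamp seen."""
    counts = Counter()
    first = last = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        # Only systemd reports the unit result; skip snapd's own output.
        if not m or m.group("proc") != "systemd":
            continue
        r = RESULT_RE.search(m.group("msg"))
        if not r:
            continue
        counts[r.group("result")] += 1
        # Normalise "Aug 09" and "Aug 9" to the same form.
        stamp = f"{m.group('month')} {int(m.group('day'))} {m.group('time')}"
        first = first or stamp
        last = stamp
    return counts, first, last
```

Feeding it the lines of the log file (for example `open("/var/log/syslog")`, assuming that is where the entries live) should give one `watchdog` failure followed by a growing number of `timeout` failures if the pattern matches what I pasted.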
Comparing this to the previous night, when I had no such issue, I can see that on nights where it didn’t occur, the apt daily download reports that it completed, whereas it didn’t on the nights of the errors.
On a normal night, that would later be followed by other timed services, including one for a snap application:
Aug 8 02:32:12 instance-20220802-1400 systemd[1]: Starting Service for snap application certbot.renew...
Aug 8 02:32:22 instance-20220802-1400 systemd[1]: snap.certbot.renew.service: Deactivated successfully.
Aug 8 02:32:22 instance-20220802-1400 systemd[1]: Finished Service for snap application certbot.renew.
That obviously didn’t happen this morning when it hit the error. So my hunch remains that it has to do with the apt daily download, but I can’t be certain. Is this due to a lack of resources? Again, any ideas on the cause, or how to fix it, would be greatly appreciated.
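To test the lack-of-resources theory, one option would be to snapshot memory shortly before the apt timer fires. The following is a rough sketch, assuming the usual Linux `/proc/meminfo` layout; the helper names and the 100 MB threshold are arbitrary choices of mine, not recommended values.

```python
def parse_meminfo(text):
    """Return a dict of /proc/meminfo fields in kB (values without a unit are kept as-is)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def memory_pressure(info, min_available_kb=100_000):
    """Flag low available memory and missing swap; the threshold is an assumption."""
    available = info.get("MemAvailable", 0)
    swap_total = info.get("SwapTotal", 0)
    return {
        "available_kb": available,
        "swap_total_kb": swap_total,
        "low_memory": available < min_available_kb,
        "no_swap": swap_total == 0,
    }
```

Calling `memory_pressure(parse_meminfo(open("/proc/meminfo").read()))` from a cron job a few minutes before 02:00 and logging the result would at least show whether available memory is already near zero when the apt download starts.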