Unable to refresh snap: Failed to get unit file state for snap.lxd.activate.service: No such file or directory

Reference from: https://github.com/lxc/lxd/issues/7964

This error seems to be related to snap.

Environment details:

dpkg -l | grep snap
ii  snap                                   2013-11-29-8                                    amd64        location of genes from DNA sequence with hidden markov model
ii  snapd                                  2.46.1+18.04                                    amd64        Daemon and tooling that enable snap packages
* Distribution: Ubuntu
* Distribution version: 18.04 LTS
* The output of "lxc info" or if that fails:
  * Kernel version: 4.15.0-118-generic
  * LXC version: 4.0.4
  * LXD version: 4.6
  * Storage backend in use: zfs

Issue description

LXD suddenly stopped working. The lxc command no longer launches, and snap refresh fails:

root@VM1:~# snap refresh lxd
error: cannot perform the following tasks:
* Stop snap "lxd" services ([--root / is-enabled snap.lxd.daemon.service] failed with exit status 1: Failed to get unit file state for snap.lxd.daemon.service: No such file or directory
)

Removing LXD also fails:

root@VM1:~# snap remove lxd
error: cannot perform the following tasks:

* Stop snap "lxd" services ([--root / is-enabled snap.lxd.daemon.service] failed with exit status 1: Failed to get unit file state for snap.lxd.daemon.service: No such file or directory
)

Steps to reproduce

  1. snap refresh lxd
  2. The error message above is shown

Can you show systemctl -a | grep snap.*lxd?

As well as systemctl status snap.lxd.daemon.service?

root@VM1:~# systemctl -a | grep snap.*lxd
  snap-lxd-17299.mount                                                                     loaded    active   mounted   Mount unit for lxd, revision 17299                                   
  snap-lxd-17320.mount                                                                     loaded    active   mounted   Mount unit for lxd, revision 17320                                   
● snap-lxd-17497.mount                                                                     not-found inactive dead      snap-lxd-17497.mount                                                 
● snap.lxd.daemon.service                                                                  not-found inactive dead      snap.lxd.daemon.service                                              
  snap.lxd.daemon.unix.socket                                                              loaded    inactive dead      Socket unix for snap application lxd.daemon
root@VM1:~# systemctl status snap.lxd.daemon.service
Unit snap.lxd.daemon.service could not be found.

Ok, so you’re supposed to have:

  • snap.lxd.activate.service
  • snap.lxd.daemon.service
  • snap.lxd.daemon.unix.socket

But in your case you seem to only have the unix socket, so this definitely feels like a snapd bug…
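For anyone wanting to check the same thing on their own machine, here is a minimal sketch that reports which of the three units listed above systemd can find. It assumes `systemctl cat` exits non-zero when no unit file exists; the function name `check_lxd_units` is made up for this example.

```shell
# Report which of the expected LXD snap units systemd can find.
# The unit names are the three listed above; check_lxd_units is a
# hypothetical helper name, not part of snapd or LXD.
check_lxd_units() {
    for unit in snap.lxd.activate.service \
                snap.lxd.daemon.service \
                snap.lxd.daemon.unix.socket; do
        # "systemctl cat" fails when there is no unit file to print.
        if systemctl cat "$unit" >/dev/null 2>&1; then
            echo "present: $unit"
        else
            echo "missing: $unit"
        fi
    done
}

check_lxd_units
```

On the machine from this report one would expect the two .service units to be reported as missing, in line with the systemctl output above.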

In the past when hitting similar situations, we’ve seen some amount of success in doing:

  • snap stop lxd
  • snap refresh lxd --candidate
  • snap refresh lxd --stable

But in your case it’s failing on stop so I don’t know how successful this approach would be…

Oh, and could you show the output of snap changes, and of snap change ID for the latest refresh?
This may help the snapd team a bit.
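A rough sketch of how to pull out that ID without reading the table by hand. It assumes the `snap changes` layout has the ID in the first column and the word "refresh" in the summary of refresh changes; `latest_refresh_id` is a hypothetical helper name.

```shell
# Print the ID of the last change whose row mentions a refresh.
# Reads "snap changes" output on stdin; the header row is skipped.
latest_refresh_id() {
    awk 'NR > 1 && /[Rr]efresh/ { id = $1 } END { if (id != "") print id }'
}

# Only run the snap commands where snap is actually installed.
if command -v snap >/dev/null 2>&1; then
    id=$(snap changes | latest_refresh_id)
    if [ -n "$id" ]; then
        snap change "$id"
    fi
fi
```

Because the helper only reads stdin, it can also be fed a saved copy of the `snap changes` output.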

@mvo this isn’t the first time we’re getting reports of systemd units going missing or being incorrectly disabled, though we have yet to get a stable reproducer for such cases.

snap stop fails:

root@VM1:~# snap stop lxd
error: cannot perform the following tasks:
- stop of [lxd.activate lxd.daemon] (# systemctl stop snap.lxd.activate.service snap.lxd.daemon.service
Failed to stop snap.lxd.activate.service: Unit snap.lxd.activate.service not loaded.
Failed to stop snap.lxd.daemon.service: Unit snap.lxd.daemon.service not loaded.
)
- stop of [lxd.activate lxd.daemon] (exit status 5)
root@VM1:~# snap changes
ID   Status  Spawn                   Ready                   Summary
194  Error   yesterday at 04:10 BST  yesterday at 08:41 BST  Auto-refresh snap "lxd"
195  Done    yesterday at 08:26 BST  yesterday at 08:26 BST  Refresh all snaps: no updates
196  Error   yesterday at 08:43 BST  yesterday at 08:43 BST  Auto-refresh snap "lxd"
197  Error   yesterday at 08:44 BST  yesterday at 08:44 BST  Refresh snap "lxd"
198  Error   yesterday at 08:44 BST  yesterday at 08:44 BST  Refresh "lxd" snap
199  Error   yesterday at 08:44 BST  yesterday at 08:45 BST  Refresh "lxd" snap
200  Error   yesterday at 08:46 BST  yesterday at 08:46 BST  Refresh "lxd" snap from "candidate" channel
201  Error   yesterday at 08:47 BST  yesterday at 08:47 BST  Refresh "lxd" snap
202  Error   yesterday at 08:51 BST  yesterday at 08:51 BST  Refresh "lxd" snap
203  Error   yesterday at 09:02 BST  yesterday at 09:02 BST  Refresh "lxd" snap
204  Error   yesterday at 09:03 BST  yesterday at 09:03 BST  Remove "lxd" snap
205  Error   yesterday at 09:04 BST  yesterday at 09:04 BST  Revert "lxd" snap
206  Error   yesterday at 09:04 BST  yesterday at 09:04 BST  Running service command
207  Done    yesterday at 09:09 BST  yesterday at 09:09 BST  Switch "lxd" snap to channel "stable"
208  Error   yesterday at 09:09 BST  yesterday at 09:09 BST  Refresh snap "lxd"
209  Done    yesterday at 09:09 BST  yesterday at 09:09 BST  Switch "lxd" snap to channel "edge"
210  Error   yesterday at 09:10 BST  yesterday at 09:10 BST  Refresh snap "lxd"
211  Done    yesterday at 09:10 BST  yesterday at 09:10 BST  Switch "lxd" snap to channel "stable"
212  Error   yesterday at 09:10 BST  yesterday at 09:10 BST  Refresh snap "lxd"
213  Error   yesterday at 09:11 BST  yesterday at 09:11 BST  Remove "lxd" snap
214  Error   yesterday at 09:15 BST  yesterday at 09:15 BST  Refresh snap "lxd"
215  Error   yesterday at 09:38 BST  yesterday at 09:38 BST  Refresh "lxd" snap
216  Error   yesterday at 09:39 BST  yesterday at 09:39 BST  Remove "lxd" snap
217  Error   yesterday at 14:00 BST  yesterday at 14:00 BST  Refresh "lxd" snap
218  Error   yesterday at 14:32 BST  yesterday at 14:32 BST  Refresh snap "lxd"
219  Error   yesterday at 17:30 BST  yesterday at 17:30 BST  Auto-refresh snap "lxd"
220  Error   yesterday at 18:45 BST  yesterday at 18:45 BST  Auto-refresh snap "lxd"
221  Error   today at 04:05 BST      today at 04:05 BST      Auto-refresh snap "lxd"
222  Error   today at 07:06 BST      today at 07:06 BST      Running service command
223  Error   today at 07:10 BST      today at 07:10 BST      Refresh "lxd" snap

root@VM1:~# snap change 223
Status  Spawn               Ready               Summary
Done    today at 07:10 BST  today at 07:10 BST  Ensure prerequisites for "lxd" are available
Undone  today at 07:10 BST  today at 07:10 BST  Download snap "lxd" (17497) from channel "latest/stable"
Done    today at 07:10 BST  today at 07:10 BST  Fetch and check assertions for snap "lxd" (17497)
Undone  today at 07:10 BST  today at 07:10 BST  Mount snap "lxd" (17497)
Undone  today at 07:10 BST  today at 07:10 BST  Run pre-refresh hook of "lxd" snap if present
Error   today at 07:10 BST  today at 07:10 BST  Stop snap "lxd" services
Hold    today at 07:10 BST  today at 07:10 BST  Remove aliases for snap "lxd"
Hold    today at 07:10 BST  today at 07:10 BST  Make current revision for snap "lxd" unavailable
Hold    today at 07:10 BST  today at 07:10 BST  Copy snap "lxd" data
Hold    today at 07:10 BST  today at 07:10 BST  Setup snap "lxd" (17497) security profiles
Hold    today at 07:10 BST  today at 07:10 BST  Make snap "lxd" (17497) available to the system
Hold    today at 07:10 BST  today at 07:10 BST  Automatically connect eligible plugs and slots of snap "lxd"
Hold    today at 07:10 BST  today at 07:10 BST  Set automatic aliases for snap "lxd"
Hold    today at 07:10 BST  today at 07:10 BST  Setup snap "lxd" aliases
Hold    today at 07:10 BST  today at 07:10 BST  Run post-refresh hook of "lxd" snap if present
Hold    today at 07:10 BST  today at 07:10 BST  Start snap "lxd" (17497) services
Hold    today at 07:10 BST  today at 07:10 BST  Remove data for snap "lxd" (17299)
Hold    today at 07:10 BST  today at 07:10 BST  Remove snap "lxd" (17299) from the system
Hold    today at 07:10 BST  today at 07:10 BST  Clean up "lxd" (17497) install
Hold    today at 07:10 BST  today at 07:10 BST  Run configure hook of "lxd" snap if present
Hold    today at 07:10 BST  today at 07:10 BST  Run health check of "lxd" snap
Done    today at 07:10 BST  today at 07:10 BST  Consider re-refresh of "lxd"

......................................................................
Stop snap "lxd" services

2020-10-02T07:10:33+01:00 ERROR [--root / is-enabled snap.lxd.daemon.service] failed with exit status 1: Failed to get unit file state for snap.lxd.daemon.service: No such file or directory

root@VM1:~#

What is the output of snap list?

root@VM1:~# snap list
Name    Version    Rev    Tracking       Publisher   Notes
core    16-2.46.1  9993   latest/stable  canonical✓  core
core18  20200724   1885   latest/stable  canonical✓  base
lxd     4.6        17320  latest/stable  canonical✓  -

Is this a bug in the current version of snap? Should I revert back to another version then?

I don’t think it has anything to do with the lxd snap itself; it seems to be a snapd issue, and maybe a problem with systemd’s in-memory state. By trial and error I found the following sequence of operations that reproduces it on focal with snapd 2.46.1:

snap install lxd
snap stop lxd
snap start lxd
snap refresh --candidate lxd
snap remove lxd --purge

The last one fails in a very similar way to the OP, although I’m getting a few extra errors:

+ snap remove lxd --purge
error: cannot perform the following tasks:
- Stop snap "lxd" services ([--root / enable snap.lxd.daemon.unix.socket] failed with exit status 1: Failed to enable unit, unit snap.lxd.daemon.unix.socket does not exist.
)
- Remove security profile for snap "lxd" (17544) (cannot find installed snap "lxd" at revision 17544: missing file /snap/lxd/17544/meta/snap.yaml)
- Remove data for snap "lxd" (17497) (remove /var/snap/lxd/common/ns/mntns: device or resource busy)
- Disconnect lxd:lxd-support from snapd:lxd-support (snap "lxd" has no "lxd-support" plug)
- Disconnect lxd:system-observe from snapd:system-observe (snap "lxd" has no "system-observe" plug)
- Disconnect lxd:network-bind from snapd:network-bind (snap "lxd" has no "network-bind" plug)
- Disconnect lxd:network from snapd:network (snap "lxd" has no "network" plug)

Similarly, afterwards, snap stop/start fails with:

$ sudo snap stop lxd
error: cannot perform the following tasks:
- stop of [lxd.activate lxd.daemon] (# systemctl stop snap.lxd.activate.service snap.lxd.daemon.service
Failed to stop snap.lxd.activate.service: Unit snap.lxd.activate.service not loaded.
Failed to stop snap.lxd.daemon.service: Unit snap.lxd.daemon.service not loaded.

I’ve reproduced it a bunch of times; every time restarting the system “fixes it” and “snap remove lxd” then succeeds. I’m investigating this further to find the root cause.

I proposed a reproducer (for at least the “213 Error yesterday at 09:11 BST yesterday at 09:11 BST Remove "lxd" snap” case from the above log, I think) here: https://github.com/snapcore/snapd/pull/9468. I’m still investigating.

Is there a temporary workaround available to fix this issue?

There were a couple of issues that contributed to the problem. I’m not 100% sure they match the exact scenario that you reported, but here is what I found:

  • The lxd snap had a bug in its remove hook and wouldn’t correctly unmount its namespaces. This was fixed in lxd in the edge channel ~2 days ago.
  • This led to failures during snap remove, because snapd couldn’t clean up the lxd directories when removing the lxd snap. There were 3 problems around that, which I explained in https://bugs.launchpad.net/snapd/+bug/1899614. They all have fixes (2 of them merged in snapd master, 1 in progress), but it will take a while until they become available in a snapd release.
  • Once lxd got into the “bad” state because of the above, no other operation related to that snap (such as refresh) would work.

To recover from this, try to:

  • manually unmount /var/snap/lxd/common/ns/shmounts, /var/snap/lxd/common/ns/mntns, and /var/snap/lxd/common/ns (if mounted), or restart your box.
  • try ‘snap enable lxd’
  • if that worked, then run ‘snap remove --purge lxd’
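The recovery steps above could be scripted roughly as follows. This is only a sketch: the paths come straight from the list above, but the helper names (`mounted_lxd_ns_paths`, `recover_lxd_snap`) are made up here, and reading /proc/self/mounts to decide what is mounted is my own assumption about how to do the "if mounted" check.

```shell
# Paths from the recovery steps above, in the order they are listed there.
LXD_NS_PATHS="/var/snap/lxd/common/ns/shmounts /var/snap/lxd/common/ns/mntns /var/snap/lxd/common/ns"

# Print each path from LXD_NS_PATHS that is a mount point according to
# the given mounts table (defaults to /proc/self/mounts).
mounted_lxd_ns_paths() {
    mounts_file="${1:-/proc/self/mounts}"
    for path in $LXD_NS_PATHS; do
        # Field 2 of each mounts-table line is the mount point.
        if awk -v p="$path" '$2 == p { found = 1 } END { exit !found }' "$mounts_file"; then
            echo "$path"
        fi
    done
}

# Unmount whatever is still mounted, then enable and purge the snap.
# Stops at the first command that fails. Must be run as root.
recover_lxd_snap() {
    for path in $(mounted_lxd_ns_paths); do
        umount "$path" || return 1
    done
    snap enable lxd && snap remove --purge lxd
}
```

Nothing destructive happens until `recover_lxd_snap` is called explicitly; running `mounted_lxd_ns_paths` on its own first shows what would be unmounted.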