LXC: snaps don't update

Yes, thanks for pinging. I will try to catch up with @stgraber tomorrow.

I’m happy to chat about it in person, but I suspect I’ll sound like a broken record, basically repeating what I said in https://discuss.linuxcontainers.org/t/snapd-cant-remove-old-revisions-when-running-inside-lxd/452

Sounded like @zyga-snapd had a branch which was getting snap-confine to attempt to fix this, though I’m not sure how exactly that would work given that systemd would still be mounting those snaps automatically on boot, quite possibly well before snap-confine itself is called by the first snap starting. Unless there’s some clever systemd dependency ordering going on there somehow?

I remember that my original suggestion for this was to have a snap.mount unit which would have systemd itself do the bind-mount and MS_SLAVE remount of /snap. Doing that would have systemd properly order its mount units, guaranteeing that snap.mount is processed before the mount unit of any directory underneath it.

I don’t remember if you can have the systemd unit declare both the bind-mount + MS_SLAVE remount in one go, but if not, this should be achievable by using a post-start action on the unit, to have it perform the remount.
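To make the two steps concrete, here is a minimal sketch of the bind-mount followed by the MS_SLAVE remount, done by hand with mount(8) rather than through a systemd unit. The `run` helper and the `DRY_RUN` and `SNAP_DIR` variables are made up for illustration and are not part of snapd or systemd. By default the script only prints what it would do; running it for real needs root.

```shell
#!/bin/sh
# Sketch only: not how snapd or systemd actually implement this.
# DRY_RUN, SNAP_DIR and run() are hypothetical names used for illustration.
DRY_RUN="${DRY_RUN:-1}"
SNAP_DIR="${SNAP_DIR:-/snap}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

make_snap_slave() {
    # Step 1: bind-mount the directory onto itself so that it becomes
    # a mount point of its own.
    run mount --bind "$SNAP_DIR" "$SNAP_DIR"
    # Step 2: change the propagation of that mount point to slave
    # (MS_SLAVE), as a separate operation after the bind-mount.
    run mount --make-slave "$SNAP_DIR"
}

make_snap_slave
```

In a unit-based setup the second command is the part that would have to run as the post-start action, once the bind-mount is in place.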

I made an attempt, but I ran into issues either with systemd itself or with security review when trying to work around deficiencies in systemd.

The crux of the limitation was indeed that a /snap mount unit is not enough, as there’s no way to apply MS_SLAVE this way. I will try your suggestion to have a post-start action that changes the sharing.

One annoying limitation is that FUSE mounts are not reliably represented in /proc/self/mountinfo, so we cannot unmount and remount them to fix something. We must ask systemd to do that, but this is too much power to wield from snap-confine (this is what my earlier branch attempted).
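For anyone who wants to check what the kernel reports for a given mount point, a small sketch along these lines reads /proc/self/mountinfo. It assumes the field layout documented in proc(5): the mount point is field 5, and the optional propagation tags (such as `shared:N` or `master:N`) sit between field 6 and the `-` separator. The function name `mount_propagation` is made up for this example.

```shell
#!/bin/sh
# Sketch only: mount_propagation is a hypothetical helper, not a snapd tool.
# Usage: mount_propagation MOUNTPOINT [MOUNTINFO_FILE]
# Prints the propagation tags of the last entry for MOUNTPOINT, or
# "private" if the entry has no tags. Returns non-zero if MOUNTPOINT
# is not listed at all.
mount_propagation() {
    awk -v mp="$1" '
        $5 == mp {
            found = 1
            tags = ""
            # Optional fields start at field 7 and end at the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++) {
                tags = (tags == "" ? $i : tags " " $i)
            }
            last = (tags == "" ? "private" : tags)
        }
        END {
            if (!found) exit 1
            print last
        }
    ' "${2:-/proc/self/mountinfo}"
}
```

For example, `mount_propagation /snap || echo "not listed"` shows whether /snap is a mount point and, if so, whether it carries a `master:` tag. A snap directory that is missing from the file fits the limitation described above.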

Did this approach work?

@kyrofa no, not really; we discussed this with @mvo today and there’s another attempt in https://github.com/snapcore/snapd/pull/4517

I’m afraid this may not be fixed, or perhaps there’s another problem. I’m using candidate in LXD:

$ snap version
snap    2.31
snapd   2.31
series  16
ubuntu  16.04
kernel  4.4.0-112-generic

Trying to remove a snap I get this:

$ sudo snap remove nextcloud
2018-02-17T18:08:35Z ERROR cannot remove snap file "nextcloud", will retry in 3 mins: [stop
snap-nextcloud-5132.mount] failed with exit status 1: Job for snap-nextcloud-5132.mount failed. See
"systemctl status snap-nextcloud-5132.mount" and "journalctl -xe" for details.

Remove snap "nextcloud" (5132) from the system                                                        .^C
ubuntu@nextcloud-proxy-test:~$ snap changes
ID   Status  Spawn                 Ready                 Summary
1    Done    2018-02-17T17:40:32Z  2018-02-17T17:40:32Z  Initialize system state
2    Done    2018-02-17T17:42:37Z  2018-02-17T17:43:01Z  Install "core" snap from "candidate" channel
3    Done    2018-02-17T17:42:37Z  2018-02-17T17:42:40Z  Initialize device
4    Done    2018-02-17T17:43:15Z  2018-02-17T17:44:00Z  Install "nextcloud" snap
5    Done    2018-02-17T17:47:04Z  2018-02-17T17:47:06Z  Change configuration of "nextcloud" snap
6    Doing   2018-02-17T18:07:57Z  -                     Remove "nextcloud" snap

ubuntu@nextcloud-proxy-test:~$ snap change 6
Status  Spawn                 Ready                 Summary
Done    2018-02-17T18:07:57Z  2018-02-17T18:08:33Z  Stop snap "nextcloud" services
Done    2018-02-17T18:07:57Z  2018-02-17T18:08:33Z  Run remove hook of "nextcloud" snap if present
Done    2018-02-17T18:07:57Z  2018-02-17T18:08:33Z  Remove aliases for snap "nextcloud"
Done    2018-02-17T18:07:57Z  2018-02-17T18:08:34Z  Make snap "nextcloud" unavailable to the system
Done    2018-02-17T18:07:57Z  2018-02-17T18:08:34Z  Remove security profile for snap "nextcloud" (5132)
Done    2018-02-17T18:07:57Z  2018-02-17T18:08:34Z  Remove data for snap "nextcloud" (5132)
Doing   2018-02-17T18:07:57Z  -                     Remove snap "nextcloud" (5132) from the system
Do      2018-02-17T18:07:57Z  -                     Discard interface connections for snap "nextcloud" (5132)

......................................................................
Remove snap "nextcloud" (5132) from the system

2018-02-17T18:08:35Z ERROR cannot remove snap file "nextcloud", will retry in 3 mins: [stop snap-nextcloud-5132.mount] failed with exit status 1: Job for snap-nextcloud-5132.mount failed. See "systemctl status snap-nextcloud-5132.mount" and "journalctl -xe" for details.


ubuntu@nextcloud-proxy-test:~$ systemctl status snap-nextcloud-5132.mount
● snap-nextcloud-5132.mount - Mount unit for nextcloud
   Loaded: loaded (/proc/self/mountinfo; enabled; vendor preset: enabled)
   Active: active (mounted) (Result: exit-code) since Sat 2018-02-17 18:08:35 UTC; 42s ago
    Where: /snap/nextcloud/5132
     What: squashfuse
  Process: 12833 ExecUnmount=/bin/umount /snap/nextcloud/5132 (code=exited, status=32)
    Tasks: 1
   Memory: 1.0M
      CPU: 11.157s
   CGroup: /system.slice/snap-nextcloud-5132.mount
           └─8363 squashfuse /var/lib/snapd/snaps/nextcloud_5132.snap /snap/nextcloud/5132 -o ro,nodev,

Feb 17 17:43:50 nextcloud-proxy-test systemd[1]: Mounting Mount unit for nextcloud...
Feb 17 17:43:50 nextcloud-proxy-test systemd[1]: Mounted Mount unit for nextcloud.
Feb 17 18:08:35 nextcloud-proxy-test systemd[1]: Unmounting Mount unit for nextcloud...
Feb 17 18:08:35 nextcloud-proxy-test umount[12833]: umount: /snap/nextcloud/5132: not mounted
Feb 17 18:08:35 nextcloud-proxy-test systemd[1]: snap-nextcloud-5132.mount: Mount process exited, code=
Feb 17 18:08:35 nextcloud-proxy-test systemd[1]: Failed unmounting Mount unit for nextcloud.

You will need the 2.31-deb based package to get the fix. I heard that one is coming out soon though.

The snapd 2.31.1 debs are in *-proposed; install them from there for testing.

This issue should be fixed everywhere now. Please post comments in case you are affected again.

Thank you for the fix! I’m really happy to be able to use snaps on my servers again (everything runs inside LXD). Thank you @kyrofa for keeping this issue alive. I’m surprised that there are so few snapd+lxd users out there. We need to fix this :slight_smile:

While looking through the PRs I noticed a comment from @stgraber at https://github.com/snapcore/snapd/pull/4560#discussion_r169230231 and I concur: this test will not catch the bug described in this thread. I suggest updating the test to add a reboot, to prevent future regressions.
