Unexplained mount failure, protocol error, what we know so far


#1

Hey everyone.

Once in a while we see a test failure where we ask systemd to mount a squashfs, systemd asks the kernel and the kernel says “yeah, mounted” but the mount point doesn’t show up in the mountinfo table. Systemd cross-checks that and returns a protocol error.

Systemd doing the cross-checking: https://github.com/systemd/systemd/blob/5300857701ff5e87169f3c90c6b570c750379dfb/src/core/mount.c#L1286

Kernel, snapd and systemd messages when this happens:

May 30 08:18:01 arch snapd[10146]: May 30 08:17:12 arch kernel: print_req_error: I/O error, dev loop1, sector 0
May 30 08:18:01 arch snapd[10146]: May 30 08:17:13 arch kernel: print_req_error: I/O error, dev loop2, sector 0
May 30 08:18:01 arch snapd[10146]: May 30 08:17:40 arch kernel: print_req_error: I/O error, dev loop1, sector 0
May 30 08:18:01 arch snapd[10146]: May 30 08:17:59 arch kernel: print_req_error: I/O error, dev loop2, sector 0
May 30 08:18:01 arch snapd[10146]: May 30 08:17:59 arch systemd[1]: var-lib-snapd-snap-test\x2dsnapd\x2dcontent\x2dslot-2.mount: Mount process finished, but there is no mount.
May 30 08:18:01 arch snapd[10146]: May 30 08:17:59 arch systemd[1]: var-lib-snapd-snap-test\x2dsnapd\x2dcontent\x2dslot-2.mount: Failed with result 'protocol'.
May 30 08:18:01 arch snapd[10146]: May 30 08:17:59 arch systemd[1]: Failed to mount Mount unit for test-snapd-content-slot, revision 2.
May 30 08:18:01 arch snapd[10146]: May 30 08:17:59 arch kernel: print_req_error: I/O error, dev loop2, sector 0

This was on a kernel version from Arch: 4.16.12-1-ARCH

What I found interesting in this log is that there are I/O errors from more than one loopback device, loop1 and loop2. In this particular case the snap was downloaded from the store and that results in assertion checks that would detect corrupted downloads. The same snap is downloaded over and over and mounted correctly so perhaps there’s some race in the kernel or perhaps the storage in the VM backing this test was just flaky.

Let’s please use this thread to coordinate and collect facts as this happens again.


#2

We do get quite a few of those via the errors.ubuntu.com error tracker. It is is also happening outside of tests.


#3

This is still happening. It seems that arch is more likely affected than other systems.


#4

Another sample:

Jul 03 15:02:21 arch kernel: print_req_error: I/O error, dev loop0, sector 0
Jul 03 15:02:21 arch systemd[1]: var-lib-snapd-snap-core-4995.mount: Failed with result 'exit-code'.
Jul 03 15:02:21 arch systemd[1]: Failed to mount Mount unit for core, revision 4995.
Jul 03 15:02:30 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:02:37 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:02:44 arch kernel: print_req_error: I/O error, dev loop2, sector 0
Jul 03 15:02:47 arch systemd[1]: fakestore.service: Failed with result 'exit-code'.
Jul 03 15:02:49 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:02:56 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:03:04 arch kernel: print_req_error: I/O error, dev loop2, sector 0
Jul 03 15:03:14 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:03:17 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:03:28 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:03:43 arch kernel: print_req_error: I/O error, dev loop2, sector 0
Jul 03 15:04:38 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:04:45 arch kernel: print_req_error: I/O error, dev loop3, sector 0
Jul 03 15:04:48 arch kernel: print_req_error: I/O error, dev loop3, sector 0
Jul 03 15:04:59 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:05:09 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:05:19 arch kernel: print_req_error: I/O error, dev loop2, sector 0
Jul 03 15:05:36 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:05:41 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:05:43 arch kernel: print_req_error: I/O error, dev loop2, sector 0
Jul 03 15:05:45 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:05:47 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:06:27 arch io.snapcraft.Settings[6745]: 2018/07/03 15:06:27.608335 userd.go:74: Starting snap userd
Jul 03 15:06:43 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:07:00 arch systemd[1]: snap.test-snapd-service.service.service: Failed with result 'exit-code'.
Jul 03 15:07:00 arch systemd[1]: Failed to start Service for snap application test-snapd-service.service.
Jul 03 15:07:27 arch kernel: print_req_error: I/O error, dev loop1, sector 0
Jul 03 15:07:42 arch systemd[1]: var-lib-snapd-snap-test\x2dsnapd\x2dcontent\x2dslot-2.mount: Mount process finished, but there is no mount.
Jul 03 15:07:42 arch systemd[1]: var-lib-snapd-snap-test\x2dsnapd\x2dcontent\x2dslot-2.mount: Failed with result 'protocol'.
Jul 03 15:07:42 arch systemd[1]: Failed to mount Mount unit for test-snapd-content-slot, revision 2.
Jul 03 15:07:42 arch kernel: print_req_error: I/O error, dev loop2, sector 0
Jul 03 15:07:42 arch kernel: print_req_error: I/O error, dev loop3, sector 0
 Revision:2 ProblemType:Snap]

#5

What is interesting is that it affects multiple loop devices. Perhaps the squashfs files in the virtual machines are getting corrupted somehow? Still no theory as to why though.


#6

Observed again, this time ubuntu-18.04-64 instance:

Jul 05 14:09:42 jul051352-179760 systemd[1]: snap.test-snapd-service.test-snapd-sigusr2-all-service.service: Failed with result 'exit-code'.
Jul 05 14:09:45 jul051352-179760 kernel: kauditd_printk_skb: 4 callbacks suppressed
Jul 05 14:09:45 jul051352-179760 systemd-udevd[16899]: error opening ATTR{/sys/devices/pci0000:00/0000:00:03.0/virtio0/host0/target0:0:1/0:0:1:0/block/sda/sda15/queue/scheduler} for writing: No such file or directory
Jul 05 14:09:45 jul051352-179760 systemd-udevd[16903]: error opening ATTR{/sys/devices/pci0000:00/0000:00:03.0/virtio0/host0/target0:0:1/0:0:1:0/block/sda/sda1/queue/scheduler} for writing: No such file or directory
Jul 05 14:09:45 jul051352-179760 systemd-udevd[16900]: error opening ATTR{/sys/devices/pci0000:00/0000:00:03.0/virtio0/host0/target0:0:1/0:0:1:0/block/sda/sda14/queue/scheduler} for writing: No such file or directory
Jul 05 14:09:53 jul051352-179760 systemd[1]: snap-test\x2dsnapd\x2dcontrol\x2dconsumer-2.mount: Mount process finished, but there is no mount.
Jul 05 14:09:53 jul051352-179760 systemd[1]: snap-test\x2dsnapd\x2dcontrol\x2dconsumer-2.mount: Failed with result 'protocol'.

Note that there were explicit I/O errors logged, but that may be attributed to different log level or kernel config.


#7

This is what I found so far:

I can reproduce it on fedora 28 and arch linux on google, for those systems I am sharing the kernel version and systemd version.

I can also reproduce it when we install 2 snaps with the following spread task:

summary: Check mount issue
execute: |
    for _ in $(seq 50); do
        snap install test-snapd-tools test-snapd-public
        snap remove test-snapd-tools test-snapd-public
    done

This is the PR to reproduce the error: https://github.com/snapcore/snapd/pull/5544

Results:

System: arch-linux-64

  • snap install test-snapd-tools test-snapd-public
    2018-07-20T16:13:28Z INFO Waiting for var-lib-snapd-snap-test\x2dsnapd\x2dpublic-1.mount to stop.
    error: cannot perform the following tasks:
  • Mount snap “test-snapd-public” (1) ([start var-lib-snapd-snap-test\x2dsnapd\x2dpublic-1.mount] failed with exit status 1: Job for var-lib-snapd-snap-test\x2dsnapd\x2dpublic-1.mount failed.
    See “systemctl status “var-lib-snapd-snap-test\x2dsnapd\x2dpublic-1.mount”” and “journalctl -xe” for details.
    )

Linux arch 4.17.6-1-ARCH #1 SMP PREEMPT Wed Jul 11 19:14:29 UTC 2018 x86_64 GNU/Linux
systemd 239
+PAM -AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid

System: fedora-28-64

  • snap remove test-snapd-tools test-snapd-public
    test-snapd-tools removed
    test-snapd-public removed
  • for _ in $(seq 50)
  • snap install test-snapd-tools test-snapd-public
    error: cannot perform the following tasks:
  • Mount snap “test-snapd-tools” (6) ([start var-lib-snapd-snap-test\x2dsnapd\x2dtools-6.mount] failed with exit status 1: Job for var-lib-snapd-snap-test\x2dsnapd\x2dtools-6.mount failed.
    See “systemctl status “var-lib-snapd-snap-test\x2dsnapd\x2dtools-6.mount”” and “journalctl -xe” for details.
    )

Linux jul201435-630697 4.17.2-200.fc28.x86_64 #1 SMP Mon Jun 18 20:09:31 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

systemd 238
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid


#8

I just saw this on core18:

Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost initrd: found more than 10% free space on disk, resizing /dev/sda3
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost initrd: partition to full disk size, see /run/initramfs/resize-writable.log for details
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost initrd: checking filesystem for writable partition
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost kernel: EXT4-fs (sda3): couldn't mount as ext3 due to feature incompatibilities
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost kernel: EXT4-fs (sda3): couldn't mount as ext2 due to feature incompatibilities
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost kernel: EXT4-fs (sda3): couldn't mount as ext3 due to feature incompatibilities
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost kernel: EXT4-fs (sda3): couldn't mount as ext2 due to feature incompatibilities
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost initrd: mounting /run
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost systemd[1]: File /lib/systemd/system/systemd-journald.service:36 configures an IP firewall (IPAddressDeny=any), but the local system does not support BPF/cgroup based firewalling.
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost systemd[1]: Proceeding WITHOUT firewalling in effect! (This warning is only shown for the first loaded unit using IP firewalling.)
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost systemd[1]: etc-gshadow.mount: Failed to check directory /etc/gshadow: Not a directory
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost systemd[1]: etc-passwd.mount: Failed to check directory /etc/passwd: Not a directory
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost systemd[1]: etc-shadow.mount: Failed to check directory /etc/shadow: Not a directory
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:14 localhost systemd[1]: etc-group.mount: Failed to check directory /etc/group: Not a directory
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:15 localhost kernel: piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:15 localhost kernel: intel_rapl: no valid rapl domains found in package 0
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:19 localhost systemd-resolved[1044]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:20 jul231603-610911 systemd-networkd[1041]: Could not set hostname: The name org.freedesktop.PolicyKit1 was not provided by any .service files
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:21 jul231603-610911 sshd[1113]: pam_env(sshd:session): Unable to open env file: /etc/default/locale: No such file or directory
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:12:21 jul231603-610911 kernel: blk_update_request: I/O error, dev loop3, sector 0
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:13:09 jul231603-610911 kernel: audit_printk_skb: 24 callbacks suppressed
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:13:22 jul231603-610911 kernel: audit_printk_skb: 51 callbacks suppressed
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:13:28 jul231603-610911 kernel: audit_printk_skb: 27 callbacks suppressed
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:13:49 jul231603-610911 kernel: audit_printk_skb: 12 callbacks suppressed
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:14:02 jul231603-610911 kernel: audit_printk_skb: 24 callbacks suppressed
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:14:11 jul231603-610911 kernel: audit_printk_skb: 18 callbacks suppressed
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:14:13 jul231603-610911 systemd[1]: snap-test\x2dsnapd\x2dtools-6.mount: Mount process finished, but there is no mount.
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:14:13 jul231603-610911 systemd[1]: snap-test\x2dsnapd\x2dtools-6.mount: Failed with result 'protocol'.
Jul 23 16:14:14 jul231603-610911 snapd[5486]: Jul 23 16:14:13 jul231603-610911 systemd[1]: Failed to mount Mount unit for test-snapd-tools, revision 6.
Jul 23 16:14:14 jul231603-610911 snapd[5486]:  ProcCwd:/ DetectedVirt:kvm DidSnapdReExec:yes SnapdVersion:1337.2.34.2 HostSnapdBuildID:75457388dd9a575b527b132d3585c510affc6964 CoreSnapdBuildID:e5770e1b98e7782a036bfc5e5f26b365edb10f0d Date:Mon Jul 23 16:14:14 2018 ProblemType:Snap]

Interestingly it was preceded by filesystem resize. Still because it happens on all the other systems as well I don’t believe that is the direct cause.


#9

I wonder if any of this is caused by the increased task parallelism we have now. Maybe mounting a squashfs file along with another squashfs file, using libmount is just racy and we somehow end up switching the backing store of a loopback device while it is being mounted.


#10

I just observed this first hand on my Fedora 28 system. What I find interesting is that this system has just a handful of snaps installed. A subsequent snap refresh worked correctly. I was unable to find any more information in the log files.

snap refresh

[zyga@faroe ~]$ sudo snap refresh
error: cannot perform the following tasks:
- Mount snap "skype" (54) ([start var-lib-snapd-snap-skype-54.mount] failed with exit status 1: Job for var-lib-snapd-snap-skype-54.mount failed.
See "systemctl status var-lib-snapd-snap-skype-54.mount" and "journalctl -xe" for details.
)

snap changes

[zyga@faroe ~]$ snap changes
ID   Status  Spawn                Ready                Summary
30   Error   today at 14:00 CEST  today at 14:10 CEST  Refresh snaps "skype", "gtk-common-themes", "spotify", "lxd"

snap tasks 30

[zyga@faroe ~]$ snap tasks 30
Status  Spawn                Ready                Summary
Done    today at 14:00 CEST  today at 14:10 CEST  Ensure prerequisites for "skype" are available
Undone  today at 14:00 CEST  today at 14:10 CEST  Download snap "skype" (54) from channel "stable"
Done    today at 14:00 CEST  today at 14:09 CEST  Fetch and check assertions for snap "skype" (54)
Error   today at 14:00 CEST  today at 14:09 CEST  Mount snap "skype" (54)
Hold    today at 14:00 CEST  today at 14:09 CEST  Run pre-refresh hook of "skype" snap if present
Hold    today at 14:00 CEST  today at 14:09 CEST  Stop snap "skype" services
Hold    today at 14:00 CEST  today at 14:09 CEST  Remove aliases for snap "skype"
Hold    today at 14:00 CEST  today at 14:09 CEST  Make current revision for snap "skype" unavailable
Hold    today at 14:00 CEST  today at 14:09 CEST  Copy snap "skype" data
Hold    today at 14:00 CEST  today at 14:09 CEST  Setup snap "skype" (54) security profiles
Hold    today at 14:00 CEST  today at 14:09 CEST  Make snap "skype" (54) available to the system
Hold    today at 14:00 CEST  today at 14:09 CEST  Automatically connect eligible plugs and slots of snap "skype"
Hold    today at 14:00 CEST  today at 14:09 CEST  Set automatic aliases for snap "skype"
Hold    today at 14:00 CEST  today at 14:09 CEST  Setup snap "skype" aliases
Hold    today at 14:00 CEST  today at 14:09 CEST  Run post-refresh hook of "skype" snap if present
Hold    today at 14:00 CEST  today at 14:09 CEST  Start snap "skype" (54) services
Hold    today at 14:00 CEST  today at 14:09 CEST  Clean up "skype" (54) install
Hold    today at 14:00 CEST  today at 14:09 CEST  Run configure hook of "skype" snap if present
Done    today at 14:00 CEST  today at 14:00 CEST  Ensure prerequisites for "gtk-common-themes" are available
Done    today at 14:00 CEST  today at 14:04 CEST  Download snap "gtk-common-themes" (701) from channel "stable"
Done    today at 14:00 CEST  today at 14:04 CEST  Fetch and check assertions for snap "gtk-common-themes" (701)
Done    today at 14:00 CEST  today at 14:05 CEST  Mount snap "gtk-common-themes" (701)
Done    today at 14:00 CEST  today at 14:05 CEST  Run pre-refresh hook of "gtk-common-themes" snap if present
Done    today at 14:00 CEST  today at 14:05 CEST  Stop snap "gtk-common-themes" services
Done    today at 14:00 CEST  today at 14:05 CEST  Remove aliases for snap "gtk-common-themes"
Done    today at 14:00 CEST  today at 14:05 CEST  Make current revision for snap "gtk-common-themes" unavailable
Done    today at 14:00 CEST  today at 14:05 CEST  Copy snap "gtk-common-themes" data
Done    today at 14:00 CEST  today at 14:05 CEST  Setup snap "gtk-common-themes" (701) security profiles
Done    today at 14:00 CEST  today at 14:05 CEST  Make snap "gtk-common-themes" (701) available to the system
Done    today at 14:00 CEST  today at 14:05 CEST  Automatically connect eligible plugs and slots of snap "gtk-common-themes"
Done    today at 14:00 CEST  today at 14:05 CEST  Set automatic aliases for snap "gtk-common-themes"
Done    today at 14:00 CEST  today at 14:05 CEST  Setup snap "gtk-common-themes" aliases
Done    today at 14:00 CEST  today at 14:05 CEST  Run post-refresh hook of "gtk-common-themes" snap if present
Done    today at 14:00 CEST  today at 14:05 CEST  Start snap "gtk-common-themes" (701) services
Done    today at 14:00 CEST  today at 14:05 CEST  Clean up "gtk-common-themes" (701) install
Done    today at 14:00 CEST  today at 14:05 CEST  Run configure hook of "gtk-common-themes" snap if present
Done    today at 14:00 CEST  today at 14:00 CEST  Ensure prerequisites for "spotify" are available
Done    today at 14:00 CEST  today at 14:08 CEST  Download snap "spotify" (21) from channel "stable"
Done    today at 14:00 CEST  today at 14:09 CEST  Fetch and check assertions for snap "spotify" (21)
Done    today at 14:00 CEST  today at 14:09 CEST  Mount snap "spotify" (21)
Done    today at 14:00 CEST  today at 14:09 CEST  Run pre-refresh hook of "spotify" snap if present
Done    today at 14:00 CEST  today at 14:09 CEST  Stop snap "spotify" services
Done    today at 14:00 CEST  today at 14:09 CEST  Remove aliases for snap "spotify"
Done    today at 14:00 CEST  today at 14:09 CEST  Make current revision for snap "spotify" unavailable
Done    today at 14:00 CEST  today at 14:09 CEST  Copy snap "spotify" data
Done    today at 14:00 CEST  today at 14:09 CEST  Setup snap "spotify" (21) security profiles
Done    today at 14:00 CEST  today at 14:09 CEST  Make snap "spotify" (21) available to the system
Done    today at 14:00 CEST  today at 14:09 CEST  Automatically connect eligible plugs and slots of snap "spotify"
Done    today at 14:00 CEST  today at 14:09 CEST  Set automatic aliases for snap "spotify"
Done    today at 14:00 CEST  today at 14:09 CEST  Setup snap "spotify" aliases
Done    today at 14:00 CEST  today at 14:09 CEST  Run post-refresh hook of "spotify" snap if present
Done    today at 14:00 CEST  today at 14:09 CEST  Start snap "spotify" (21) services
Done    today at 14:00 CEST  today at 14:09 CEST  Clean up "spotify" (21) install
Done    today at 14:00 CEST  today at 14:09 CEST  Run configure hook of "spotify" snap if present
Done    today at 14:00 CEST  today at 14:00 CEST  Ensure prerequisites for "lxd" are available
Done    today at 14:00 CEST  today at 14:06 CEST  Download snap "lxd" (8636) from channel "stable"
Done    today at 14:00 CEST  today at 14:06 CEST  Fetch and check assertions for snap "lxd" (8636)
Done    today at 14:00 CEST  today at 14:06 CEST  Mount snap "lxd" (8636)
Done    today at 14:00 CEST  today at 14:06 CEST  Run pre-refresh hook of "lxd" snap if present
Done    today at 14:00 CEST  today at 14:06 CEST  Stop snap "lxd" services
Done    today at 14:00 CEST  today at 14:07 CEST  Remove aliases for snap "lxd"
Done    today at 14:00 CEST  today at 14:07 CEST  Make current revision for snap "lxd" unavailable
Done    today at 14:00 CEST  today at 14:07 CEST  Copy snap "lxd" data
Done    today at 14:00 CEST  today at 14:08 CEST  Setup snap "lxd" (8636) security profiles
Done    today at 14:00 CEST  today at 14:08 CEST  Make snap "lxd" (8636) available to the system
Done    today at 14:00 CEST  today at 14:08 CEST  Automatically connect eligible plugs and slots of snap "lxd"
Done    today at 14:00 CEST  today at 14:08 CEST  Set automatic aliases for snap "lxd"
Done    today at 14:00 CEST  today at 14:09 CEST  Setup snap "lxd" aliases
Done    today at 14:00 CEST  today at 14:09 CEST  Run post-refresh hook of "lxd" snap if present
Done    today at 14:00 CEST  today at 14:09 CEST  Start snap "lxd" (8636) services
Done    today at 14:00 CEST  today at 14:09 CEST  Remove data for snap "lxd" (8393)
Done    today at 14:00 CEST  today at 14:09 CEST  Remove snap "lxd" (8393) from the system
Done    today at 14:00 CEST  today at 14:09 CEST  Clean up "lxd" (8636) install
Done    today at 14:00 CEST  today at 14:09 CEST  Run configure hook of "lxd" snap if present

......................................................................
Mount snap "skype" (54)

2018-09-17T14:09:51+02:00 ERROR [start var-lib-snapd-snap-skype-54.mount] failed with exit status 1: Job for var-lib-snapd-snap-skype-54.mount failed.
See "systemctl status var-lib-snapd-snap-skype-54.mount" and "journalctl -xe" for details.

systemctl status var-lib-snapd-snap-skype-54.mount

● var-lib-snapd-snap-skype-54.mount
   Loaded: not-found (Reason: No such file or directory)
   Active: failed (Result: protocol) since Mon 2018-09-17 14:09:45 CEST; 4min 7s ago

wrz 17 14:09:41 faroe systemd[1]: Mounting Mount unit for skype, revision 54...
wrz 17 14:09:45 faroe systemd[1]: var-lib-snapd-snap-skype-54.mount: Mount process finished, but there is no mount.
wrz 17 14:09:45 faroe systemd[1]: var-lib-snapd-snap-skype-54.mount: Failed with result 'protocol'.
wrz 17 14:09:45 faroe systemd[1]: Failed to mount Mount unit for skype, revision 54.

snap version

snap    2.35-1.fc28
snapd   2.35-1.fc28
series  16
fedora  28
kernel  4.17.19-200.fc28.x86_64

#11

To reproduce this issue please just run sudo make install with the following Makefile

numbers=$(shell seq 100)

.PHONY: all
all: $(foreach nr,$(numbers),test-$(nr)_1_all.snap)

test-%_1_all.snap:
	mkdir test-$*
	mkdir -p test-$*/meta
	echo "name: test-$*" > test-$*/meta/snap.yaml
	echo "version: 1" >> test-$*/meta/snap.yaml
	echo $* > test-$*/id
	snap pack test-$* .

.PHONY: install 
install: $(foreach nr,$(numbers),test-$(nr)_1_all.snap)
	echo $^ | xargs -n 1 -P $(words $(numbers)) snap install --dangerous

.PHONY: clean
clean:
	rm -rf test-*.snap test-*

I also looked at util-linux and I found a bug there. I will try to fix the bug locally and see if that fixes the issue.


Snap refresh fail for multiple snaps
#12

I added synchronization around mount and I can no longer reproduce the issue. I will now look at patching util-linux properly and propose this upstream.


#13

Some attempts were done to reproduce the problem by just using [u]mount, but so far without success. A reproducer that generates systemd mount units, does daemon-reload, start & stop can be found here: https://gist.github.com/bboozzoo/d4b142229b1915ef7cc0cf8593599ad9 so far it takes just 1-2 iterations for this to fail on Fedora, Arch, or Ubuntu 18.10


#14

Reported this to systemd:


#15

I have 45 snaps installed in my system and encountered this issue, is there any patches available?


#16

We have an experimental patch in https://github.com/snapcore/snapd/pull/6243 that seems to fix the issue for good - however we need to clean this up some more, as it is it cannot be merged (code violates some design rules). But we plan to work on making it mergeable soon.


#17

Not sure what’s the cause, but I eventually remove and reinstall the entire snapd and all of my snaps to resolve this problem.