Failed to install checkbox-snappy on an arm64 machine


Hi all,

We have an arm64-based PCIe network card plugged into a generic amd64 host machine, which was deployed with MAAS. Once MAAS has the host ready, a script sets up the environment and installs a custom Ubuntu Server image onto the arm64 card; that is phase 1. In phase 2 the script tries to install checkbox-snappy, but the installation fails with a strange error:

22:02:28 + snap install core
22:02:28 2018-12-26T14:02:09Z INFO Waiting for restart...
22:04:14 core 16-2.36.3 from 'canonical' installed
22:04:14 + snap wait system seed.loaded
22:04:14 + snap install checkbox-snappy --devmode --beta
22:04:14 error: cannot perform the following tasks:
22:04:14 - Run install hook of "checkbox-snappy" snap if present (run hook "install": 
22:04:14 -----
22:04:14 cannot read expected data from eventfd: Interrupted system call
22:04:14 support process for mount namespace capture exited abnormally
22:04:14 -----)

I don't think this symptom is 100% reproducible, because I can't reproduce it manually. So I'm guessing it only fails some of the time.

Here are some logs from the arm64 card:

ubuntu@localhost:~$ snap list
Name Version Rev Tracking Publisher Notes
checkbox-snappy 16.8 1609 beta ce-certification-qa devmode
core 16-2.36.3 6133 stable canonical✓ core

ubuntu@localhost:~$ snap version
snap 2.36.3
snapd 2.36.3
series 16
ubuntu 18.04
kernel 4.18.0-1000-mellanox

My temporary workaround is to add a delay (e.g. sleep 30) between the core and checkbox-snappy installations. That works for me, but I'm more interested in the root cause.
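
For anyone hitting the same thing, a retry loop may be a bit more robust than a fixed sleep. This is only a sketch of the idea, not something I have verified against this failure: the `retry` helper and the attempt count and delay values are my own hypothetical choices, and it assumes the install simply succeeds when run again a little later.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS tries
# have been made, sleeping DELAY seconds between tries.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        echo "attempt $n failed, retrying in ${delay}s: $*" >&2
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Intended use in the phase 2 script (commented out so the helper can be
# sourced on a machine without snapd):
#   snap install core
#   snap wait system seed.loaded
#   retry 5 10 snap install checkbox-snappy --devmode --beta
```

The snap commands in the comment are the same ones from the log above; only the `retry` wrapper is new. It hides the failure rather than explaining it, so it is a workaround just like the sleep.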

I checked, and the error messages come from the 2.36 branch of https://github.com/snapcore/snapd/blob/master/cmd/snap-confine/snap-confine.c

On the master branch, however, there seem to be many improvements to the mount namespace (NS) handling, and I think the messages above can no longer appear since they have been removed.

Has anyone experienced this before?