I believe I have found a severe performance problem with how the hostname of the LXD build environment is configured when using SNAPCRAFT_BUILD_ENVIRONMENT=lxd. Although /etc/hostname is configured correctly (it matches the LXD container name), the uname() system call and /proc/sys/kernel/hostname both return the constant string “ubuntu” instead of the container name. As a result, every time sudo is used during the build (either by snapcraft itself or by a script in snapcraft.yaml), there is a six-second delay while sudo’s attempt to resolve the hostname “ubuntu” times out.
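For anyone who wants to check this inside their own container, here is a minimal diagnostic sketch. It is not part of snapcraft; the function names compare_hostnames and time_lookup are my own, and it assumes getent is available. getent hosts goes through the system's NSS name resolution, so a slow or failing lookup here should point at the same kind of delay sudo runs into, though it is not guaranteed to take exactly the same code path as sudo.

```shell
#!/usr/bin/env bash
# Diagnostic sketch (hypothetical helper names, not part of snapcraft).

# Print "match" if the two names agree, "mismatch" otherwise.
compare_hostnames() {
    local kernel_name="$1" file_name="$2"
    if [[ "$kernel_name" == "$file_name" ]]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

# Report how long a resolver lookup of the given name takes (whole seconds).
time_lookup() {
    local name="$1" start=$SECONDS
    getent hosts "$name" > /dev/null
    local status=$?
    echo "lookup of '$name': exit status $status, about $((SECONDS - start))s"
}

kernel_name="$(cat /proc/sys/kernel/hostname 2>/dev/null)"
file_name="$(cat /etc/hostname 2>/dev/null)"
echo "kernel hostname: $kernel_name"
echo "/etc/hostname:   $file_name"
echo "comparison:      $(compare_hostnames "$kernel_name" "$file_name")"
time_lookup "$kernel_name"
```

In the failing case described here, I would expect the comparison line to say "mismatch" and the lookup to report a non-zero exit status.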
Here is a trivial snapcraft.yaml that demonstrates the problem:
name: test-slow-sudo
summary: Quick demo of slow build problem caused by LXD hostname mis-config
description: |
  The host name returned by uname() syscall (and /proc/sys/kernel/hostname)
  is the constant string "ubuntu" which does not exist. This causes all
  invocations of "sudo" to incur a 6-second timeout penalty.
version: "1"
confinement: devmode
grade: devel
base: core20
parts:
  test:
    plugin: nil
    build-packages:
      - sudo
    override-build: |
      echo "Hostname according to uname(): $(hostname)"
      echo "Hostname according to /proc/sys/kernel/hostname: $(</proc/sys/kernel/hostname)"
      echo "Hostname according to /etc/hostname: $(</etc/hostname)"
      for ((i = 0; i < 4; i++)); do
        sudo true
        echo "Count: $i Elapsed: $SECONDS"
      done
      if (( $SECONDS > 5 )); then
        echo "FAILURE. The sudo command is very slow. Elapsed: $SECONDS"
        exit 1
      else
        echo "SUCCESS. The sudo command is performing OK."
      fi
The problem only occurs if the snapcraft command has to create the LXD container (e.g. right after a snapcraft clean). If the container already exists, the hostnames are all correct and the problem does not occur. The problem also does not occur when using multipass.
My host machine is Ubuntu 20.04, with these snap versions:
lxd 4.13 20309 latest/stable canonical✓ -
snapcraft 4.7.1 6466 latest/stable/… canonical✓ classic
snapd 2.49.2 11588 latest/stable canonical✓ snapd
Here is a transcript:
$ snapcraft clean
$ snapcraft
[ ... snip ... ]
snapd is not logged in, snap install commands will use sudo
sudo: unable to resolve host ubuntu: Temporary failure in name resolution
snap "core20" has no updates available
Pulling test
+ snapcraftctl pull
Building test
+ set +x
Hostname according to uname(): ubuntu
Hostname according to /proc/sys/kernel/hostname: ubuntu
Hostname according to /etc/hostname: snapcraft-test-slow-sudo
sudo: unable to resolve host ubuntu: Temporary failure in name resolution
Count: 0 Elapsed: 6
sudo: unable to resolve host ubuntu: Temporary failure in name resolution
Count: 1 Elapsed: 12
sudo: unable to resolve host ubuntu: Temporary failure in name resolution
Count: 2 Elapsed: 18
sudo: unable to resolve host ubuntu: Temporary failure in name resolution
Count: 3 Elapsed: 24
FAILURE. The sudo command is very slow. Elapsed: 24
Failed to run 'override-build': Exit code was 1.
Notice the incorrect hostname reported by the kernel interface above.
Now I run it again without destroying the container, and the results are much better:
$ snapcraft
Launching a container.
[ ... snip ... ]
snapd is not logged in, snap install commands will use sudo
snap "core20" has no updates available
Skipping pull test (already ran)
Building test
+ set +x
Hostname according to uname(): snapcraft-test-slow-sudo
Hostname according to /proc/sys/kernel/hostname: snapcraft-test-slow-sudo
Hostname according to /etc/hostname: snapcraft-test-slow-sudo
Count: 0 Elapsed: 0
Count: 1 Elapsed: 0
Count: 2 Elapsed: 0
Count: 3 Elapsed: 0
SUCCESS. The sudo command is performing OK.
Staging test
[ ... snip ... ]
Motivation: I cannot use multipass because nested virtualization is not allowed on AWS EC2, so I need to use LXD. Also, when doing automated builds, the LXD container will not already exist.
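Until the root cause is fixed, one possible stopgap is to make the bogus name resolvable locally. The sketch below is only a workaround under assumptions: the helper name ensure_hosts_entry is hypothetical, 127.0.1.1 is the address Debian and Ubuntu conventionally map the local hostname to, and the build step needs write access to /etc/hosts. It does not correct the kernel hostname itself.

```shell
#!/usr/bin/env bash
# Workaround sketch (hypothetical helper, not part of snapcraft):
# append "127.0.1.1 <name>" to a hosts file unless the name is already listed.
ensure_hosts_entry() {
    local name="$1" hosts_file="${2:-/etc/hosts}"
    # Skip comment lines; look for the name among the hostname fields.
    if awk -v n="$name" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_file"; then
        return 0
    fi
    printf '127.0.1.1 %s\n' "$name" >> "$hosts_file"
}

# Example call, e.g. at the top of override-build before the first sudo:
#   ensure_hosts_entry "$(hostname)"
```

The function is idempotent, so calling it on every build does not pile up duplicate entries. It cannot help with sudo calls that snapcraft itself makes before any part's scriptlet runs.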