Cleanup of /tmp/snap-private-tmp/snap.lxd removes ability to connect to running containers


I have four identical machines running as an LXD cluster. Everything works fine for a day or so, and then lxc is unable to connect to the containers. Restarting the containers resolves the issue for another day or so, until it happens again. While lxc is unable to connect, the containers are still running (I can ssh to them).

Machine configuration (4 identical machines):
Dell PowerEdge R6525
AMD EPYC 7282 16-Core Processor
128GB RAM
Ubuntu 22.04 LTS
lxd/lxc (snap) 5.0.1

Sample output:

cmd@cluster01:~$ lxc list
+--------------+---------+---------------------+------+-----------+-----------+-----------+
| NAME         | STATE   | IPV4                | IPV6 | TYPE      | SNAPSHOTS | LOCATION  |
+--------------+---------+---------------------+------+-----------+-----------+-----------+
| ubuntu-test  | RUNNING | (eth0)              |      | CONTAINER | 0         | cluster01 |
+--------------+---------+---------------------+------+-----------+-----------+-----------+
| ubuntu-test2 | RUNNING | (eth0)              |      | CONTAINER | 0         | cluster02 |
+--------------+---------+---------------------+------+-----------+-----------+-----------+
| ubuntu-test3 | RUNNING | (eth0)              |      | CONTAINER | 0         | cluster03 |
+--------------+---------+---------------------+------+-----------+-----------+-----------+
| ubuntu-test4 | RUNNING | (eth0)              |      | CONTAINER | 0         | cluster04 |
+--------------+---------+---------------------+------+-----------+-----------+-----------+

cmd@cluster01:~$ lxc shell ubuntu-test
Error: Failed to retrieve PID of executing child process

cmd@cluster01:~$ lxc console ubuntu-test
To detach from the console, press: +a q
Error: Error opening config file: "loading config file for the container failed"
Error: write /dev/pts/ptmx: file already closed

cmd@cluster01:~$ ssh
Last login: Tue Jan 3 21:58:07 2023 from
To run a command as administrator (user "root"), use "sudo ".
See "man sudo_root" for details.

cmd@ubuntu-test:~$ logout
Connection to closed.

cmd@cluster01:~$ lxc restart ubuntu-test

cmd@cluster01:~$ lxc shell ubuntu-test

root@ubuntu-test:~# logout

After getting help[0], it looks like /tmp/snap-private-tmp gets cleaned up automatically[1], and that’s how we ended up here.

Workaround: At the top of /usr/lib/tmpfiles.d/snapd.conf I added:
x /tmp/snap-private-tmp/snap.lxd

This excludes the snap.lxd subdirectory from being “cleaned”. It is that cleanup which breaks lxc’s ability to connect to containers.

So it looks like the default snap config should exclude snap.lxd from being removed from /tmp/snap-private-tmp/.

References: [0] [1]

You would be better off using /etc/tmpfiles.d; the file in /usr/lib will likely be replaced blindly on package upgrades …
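
Following that advice, here is a minimal sketch of putting the same exclusion line under /etc/tmpfiles.d instead of editing the packaged file. The helper name write_lxd_tmpfiles_exclusion and the file name snap-lxd-exclude.conf are made up for illustration. A separate file name is used on purpose: a file in /etc/tmpfiles.d with the same name as one in /usr/lib/tmpfiles.d replaces it entirely, so naming it snapd.conf would require copying the packaged contents too. Whether the line has to come "at the top" of snapd.conf, as in the original workaround, has not been verified here.

```shell
#!/bin/sh
# Hypothetical helper: write a tmpfiles.d drop-in that excludes the LXD
# snap's private tmp directory from age-based cleanup.
#   $1: target directory (defaults to /etc/tmpfiles.d)
# The file name snap-lxd-exclude.conf is an arbitrary choice for this sketch.
write_lxd_tmpfiles_exclusion() {
    dir="${1:-/etc/tmpfiles.d}"
    mkdir -p "$dir" || return 1
    # "x" tells systemd-tmpfiles to ignore this path during cleaning.
    printf 'x /tmp/snap-private-tmp/snap.lxd\n' > "$dir/snap-lxd-exclude.conf"
}
```

Run as root, calling write_lxd_tmpfiles_exclusion with no argument writes to /etc/tmpfiles.d. Afterwards, something like `systemd-tmpfiles --cat-config | grep snap.lxd` should show whether the line is being picked up.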