Snaps autoclean + disable cache+seeds for Juju CI/CD

Dear Snap Team,

Could you please share a link to documentation on snapd's disk usage optimisations/internals?

Problem 1: a user's laptop (mine) uses a noticeable amount of disk space for snap files, with no apparent way to clean it up.

Problem 2: Canonical self-hosted runners reserve a noticeable amount of disk space for snap cache/seed files.


Problem 1 details. The findings on my laptop:

> du -sh /var/lib/snapd/snaps /var/lib/snapd/cache/ /var/lib/snapd/seed/snaps/
13G	/var/lib/snapd/snaps
2.7G	/var/lib/snapd/cache/
834M	/var/lib/snapd/seed/snaps/

Note: firefox_*.snap exists 11 times!

10:36:33 ✔ taurus:~$ ls -lah /var/lib/snapd/snaps/firefox*
-rw-------  1 root root 255598592 May 15 02:12 firefox_2666.snap
-rw-------  1 root root 256364544 May 22 01:41 firefox_2696.snap
-rw-------  1 root root 255090688 May 25 08:01 firefox_2713.snap
-rw-------  1 root root 252964864 Jul  5 16:59 firefox_2865.snap
-rw-------  1 root root 248451072 Jul 18 20:52 firefox_2934.snap
-rw-------  1 root root 249819136 Aug  3 08:55 firefox_2967.snap
-rw-------  2 root root 248729600 Aug 14 11:26 firefox_2987.snap
-rw-------  2 root root 249700352 Aug 10 10:16 firefox_3001.snap
-rw-------  1 root root 249679872 Aug 14 09:00 firefox_3016.snap
-rw-------  1 root root 249290752 Aug 17 09:04 firefox_3027.snap
-rw-------  1 root root 249290752 Aug 21 20:54 firefox_3039.snap

I cannot find any snap clean command (like apt clean) to force a local cleanup for when I urgently need those 15G back.

Proposal 1: add a snap clean command that removes *.snap files from the folders mentioned above, mimicking the familiar apt clean logic. Also consider running it periodically so that only the 1-2 latest snap revisions are kept. A manual workaround is sketched below.
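In the meantime, the closest manual workaround I know of is removing superseded revisions by hand (a rough sketch, assuming snap list --all flags old revisions as "disabled"; it only touches /var/lib/snapd/snaps, not the cache or seed directories):

> sudo snap set system refresh.retain=2        # keep at most 2 revisions going forward
> snap list --all | awk '/disabled/ {print $1, $3}' | \
    while read name rev; do
        sudo snap remove "$name" --revision="$rev"
    done

This frees the old *.snap files but leaves /var/lib/snapd/cache and the seed untouched, which is exactly why a first-class snap clean would help.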


Problem 2 details: we have a similar issue on the Canonical self-hosted runners we use for CI/CD of Juju charms.

Each Juju VM requires 2 GB of disk space (partly because snap cache+seed files are stored on each Juju VM, including the Juju controller). See the initial thread here.

The list of cached snap artifacts after the test: https://pastebin.canonical.com/p/GfNRHs2XPZ/ ; please note that the same snaps are present on the server and inside each LXD container.

Proposal 2: allow refresh.retain=0 (to be set on CI/CD containers, etc.):

> sudo snap set system refresh.retain=0
error: cannot perform the following tasks:
- Run configure hook of "core" snap (run hook "configure": retain must be a number between 2 and 20, not "0")
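Today the lowest accepted value is 2, so the best a runner image can do is roughly the following (a sketch; clearing the download cache assumes that directory only holds re-downloadable blobs, which appears to be the case but is not documented as a supported operation):

> sudo snap set system refresh.retain=2        # the current minimum
> snap get system refresh.retain               # confirm the setting
> sudo rm -f /var/lib/snapd/cache/*            # drop the download cache on throwaway CI images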

P.S. On the same server, snaps in the cache and in the seed directory are hard-linked to the same inode, but that does not help for containers and does not work in real life (the seed holds revision 1969 while the cache holds 2666 and newer); see the inode check after the listings below:

11:14:52 ✔ taurus:~$ ls -lah /var/lib/snapd/snaps/firefox*
-rw------- 1 root root 244M May 15 02:12 /var/lib/snapd/snaps/firefox_2666.snap
-rw------- 1 root root 245M May 22 01:41 /var/lib/snapd/snaps/firefox_2696.snap
-rw------- 1 root root 244M May 25 08:01 /var/lib/snapd/snaps/firefox_2713.snap
-rw------- 1 root root 242M Jul  5 16:59 /var/lib/snapd/snaps/firefox_2865.snap
-rw------- 1 root root 237M Jul 18 20:52 /var/lib/snapd/snaps/firefox_2934.snap
-rw------- 1 root root 239M Aug  3 08:55 /var/lib/snapd/snaps/firefox_2967.snap
-rw------- 2 root root 239M Aug 10 10:16 /var/lib/snapd/snaps/firefox_3001.snap
-rw------- 1 root root 193M Aug 13 13:48 /var/lib/snapd/snaps/firefox_3005.snap
-rw------- 1 root root 239M Aug 14 09:00 /var/lib/snapd/snaps/firefox_3016.snap
11:14:55 ✔ taurus:~$ ls -lah /var/lib/snapd/seed/snaps/firefox*
-rw------- 1 root root 239M Oct 20  2022 /var/lib/snapd/seed/snaps/firefox_1969.snap
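A quick way to verify which files actually share an inode (a sketch; revision 3001 is just one of the entries above with a link count of 2):

> stat -c '%h %i %n' /var/lib/snapd/snaps/firefox_3001.snap                    # hard-link count and inode number
> sudo find /var/lib/snapd -samefile /var/lib/snapd/snaps/firefox_3001.snap    # every path sharing that inode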

Question 3: why do we seed revision 1969 if the channel deploys 2987?

> cat /var/lib/snapd/seed/seed.yaml
snaps:
  -
    name: firefox
    channel: stable/ubuntu-22.10
    file: firefox_1969.snap
...

> snap switch firefox --channel stable/ubuntu-22.10
"firefox" switched to the "latest/stable/ubuntu-22.10" channel

> sudo snap refresh firefox
Download snap "firefox" (2987) from channel "latest/stable/ubuntu-22.10" 

Thank you!

Thanks for your message and sorry for my slow reply. I will reply in separate posts for each problem described here.

Problem (1) is a bug and we created an LP report, https://bugs.launchpad.net/snapd/+bug/2033268, for it. Most likely the issue is triggered by the new “refresh-app-awareness” feature in snapd, which prevents refreshes while snaps are running; it seems there is a code path that does not clean up downloads when a refresh is skipped. We are looking into this.


As for problem (2), there is work for us to do here to make snaps leaner on systems like this. We plan to add it to the roadmap for the next cycle (SNAPDENG-8504 is the internal Jira ticket for this).

Work in this area will benefit the cloud use cases more generally. Right now snapd is very focused on robustness; this is why refresh.retain=2 is currently the lowest supported setting, i.e. there must always be one snap revision available to go back to (either automatically or via snap revert). However, I can see us supporting refresh.retain=1, i.e. after a successful refresh the backup revision is removed as the last operation. Note that retain=0 is not possible right now because the snap content is served from the snap file via a mount, so removing the snap file would not work with the current design.
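To illustrate the constraint (a sketch; the snap-<name>-<revision>.mount unit naming is standard, and revision 3039 is just an example from the listings above):

> losetup -l | grep firefox                    # the active .snap file is attached to a loop device
> systemctl cat snap-firefox-3039.mount        # its What= points at /var/lib/snapd/snaps/firefox_3039.snap

Deleting the backing file while that squashfs is mounted would pull the content out from under the running snap, which is why at least the active revision's file must stay.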

We could apply a similar idea to the "seed" directory. It is important for bootstrapping a system, but its snaps could probably be cleaned in cloud/container use cases where going back to a factory state is not a concern.

The last point is about sharing snap files between containers. It is something we should also investigate, but it seems the trickiest one. In a sense it is not a new problem: there will be multiple copies of e.g. the systemd binary in /var/snap/lxd/common/lxd/storage-pools/default/containers/ and many more things with identical hashes in different places. Of course the core snaps make it more visible due to their size. Putting the storage pool on a content-deduplicating filesystem like ZFS should help here in the short term.
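For example, something along these lines (a sketch; the pool name and size are illustrative, and ZFS deduplication has a real memory cost that should be measured before enabling it):

> lxc storage create dedup zfs size=50GiB      # loop-backed ZFS pool managed by LXD
> sudo zfs set dedup=on dedup                  # enable block-level dedup on the pool's root dataset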

I hope this helps.


Update: Bug #2033268 ("Leaks snaps in r-a-a mode") is fixed in snapd 2.62, which is available in the beta channel; test feedback would be appreciated.