I’ve been looking into how we can run a snap through unsquashfs, rebuild it using mksquashfs, and then compare the result to the original snap to see if there are any differences. This would be useful in the automated review-tools to detect anomalies in an uploaded snap. Doing this requires that the generation of a squashfs/snap file be predictable and reproducible: the same input should produce the same output.
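As a rough sketch of that round trip (assuming squashfs-tools is installed; `same_file` and `roundtrip_check` are names made up for illustration, and the real review-tools may do this differently):

```shell
#!/bin/sh
# Hypothetical helper: succeed only if two files are byte-for-byte identical.
same_file() {
    cmp -s "$1" "$2"
}

# Hypothetical helper: unpack a snap, rebuild it, and compare with the original.
# Returns 0 if identical, 1 if different, 2 if a tool failed.
# Assumes unsquashfs and mksquashfs (squashfs-tools) are on PATH.
roundtrip_check() {
    snap="$1"
    work="$(mktemp -d)" || return 2
    status=2
    # -d names the directory to extract into; -noappend forces a fresh image.
    if unsquashfs -d "$work/root" "$snap" >/dev/null &&
       mksquashfs "$work/root" "$work/rebuilt.snap" -noappend >/dev/null; then
        if same_file "$snap" "$work/rebuilt.snap"; then status=0; else status=1; fi
    fi
    rm -rf "$work"
    return "$status"
}
```

This leaves out details that matter in practice: the mksquashfs options (compression and so on) would have to match the ones used for the original build, and unpacking as a regular user may not preserve ownership, so a mismatch from this sketch alone doesn’t prove anything about the snap.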
One issue that I’ve run into when trying to rebuild a snap in a predictable manner is the fragment handling threads used inside mksquashfs. They introduce unpredictability in the resulting squashfs image because they gather, merge, and compress “fragments” in parallel. Fragments are files smaller than the squashfs block size (131072 bytes by default) that are merged together into a single block and then compressed together to reduce the size of the resulting squashfs file.
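To get a feel for how many files in an unpacked snap would be affected, here is a minimal sketch that counts files below the block size. The function name `count_fragment_candidates` is hypothetical, the 131072 default is the block size mentioned above, and empty files are skipped on the assumption that they have no data to pack:

```shell
#!/bin/sh
# Hypothetical helper: count non-empty regular files smaller than the
# squashfs block size, i.e. files that would be packed as fragments.
# Usage: count_fragment_candidates DIR [BLOCK_SIZE_IN_BYTES]
count_fragment_candidates() {
    dir="$1"
    block_size="${2:-131072}"
    # "-size +0c" keeps files larger than 0 bytes; "-size -Nc" keeps files
    # smaller than N bytes. One dot is printed per match and then counted,
    # so odd characters in file names cannot skew the result.
    find "$dir" -type f -size +0c -size "-${block_size}c" -exec printf . \; |
        wc -c | tr -d ' '
}
```

For example, `count_fragment_candidates squashfs-root` would report the number of such files in a directory produced by unsquashfs.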
There are three possible solutions to this problem:
1. Remove the multi-threaded nature of the fragment handling threads
2. Adjust the behavior of the fragment handling threads to produce consistent results
3. Don’t use squashfs fragments in snaps
#1 and #2 are both problematic in that they require code changes to mksquashfs. This could be difficult to roll out everywhere (thinking cross-distro) since snapcraft uses the system mksquashfs.
Someone has already implemented a patch which does #1 but it significantly impacts performance.
To my knowledge, nobody has previously attempted #2. There’s an upstream feature request with no action taken. This option doesn’t look trivial, could take considerable work, and we still have the problem of getting the patched mksquashfs out to the world.
#3 is my favored solution. The downside is that disabling fragments in mksquashfs (using the -no-fragments option) increases the size of the squashfs file. I was hoping that the increase in snap size would be offset by smaller xdelta3 binary diffs, since the snaps would be more predictable. Today, if you rebuild the same snap twice on a system with a fair number of cores, you’ll likely see a difference between the two snaps. However, I’ve analyzed a sequence of LXD and Atom snaps and I’m not seeing the xdelta3 diffs consistently being any smaller. Here are the results, where snaps built with the mksquashfs -no-fragments option contain “nofrag” in the name:
Atom
$ analyze-nofrag.sh atom_76.snap atom_88.snap atom_91.snap atom_94.snap atom_97.snap
### Resquashing atom_76.snap with no fragments
173M atom_76.snap
178M atom_76-nofrag.snap
### Resquashing atom_88.snap with no fragments
173M atom_88.snap
178M atom_88-nofrag.snap
816K atom_76-to-atom_88.xdelta3
708K atom_76-nofrag-to-atom_88-nofrag.xdelta3
### Resquashing atom_91.snap with no fragments
173M atom_91.snap
178M atom_91-nofrag.snap
408K atom_88-to-atom_91.xdelta3
468K atom_88-nofrag-to-atom_91-nofrag.xdelta3
### Resquashing atom_94.snap with no fragments
173M atom_94.snap
178M atom_94-nofrag.snap
492K atom_91-to-atom_94.xdelta3
456K atom_91-nofrag-to-atom_94-nofrag.xdelta3
### Resquashing atom_97.snap with no fragments
173M atom_97.snap
178M atom_97-nofrag.snap
1.3M atom_94-to-atom_97.xdelta3
1.2M atom_94-nofrag-to-atom_97-nofrag.xdelta3
LXD
$ analyze-nofrag.sh lxd_5041.snap lxd_5061.snap lxd_5072.snap lxd_5182.snap lxd_5235.snap
### Resquashing lxd_5041.snap with no fragments
43M lxd_5041.snap
43M lxd_5041-nofrag.snap
### Resquashing lxd_5061.snap with no fragments
43M lxd_5061.snap
43M lxd_5061-nofrag.snap
16K lxd_5041-to-lxd_5061.xdelta3
28K lxd_5041-nofrag-to-lxd_5061-nofrag.xdelta3
### Resquashing lxd_5072.snap with no fragments
43M lxd_5072.snap
43M lxd_5072-nofrag.snap
20K lxd_5061-to-lxd_5072.xdelta3
24K lxd_5061-nofrag-to-lxd_5072-nofrag.xdelta3
### Resquashing lxd_5182.snap with no fragments
43M lxd_5182.snap
43M lxd_5182-nofrag.snap
17M lxd_5072-to-lxd_5182.xdelta3
17M lxd_5072-nofrag-to-lxd_5182-nofrag.xdelta3
### Resquashing lxd_5235.snap with no fragments
43M lxd_5235.snap
43M lxd_5235-nofrag.snap
14M lxd_5182-to-lxd_5235.xdelta3
14M lxd_5182-nofrag-to-lxd_5235-nofrag.xdelta3
The results show a small increase in size for the large Atom snap (173M to 178M), no visible change for the LXD snap (43M either way), and roughly the same size when comparing each xdelta3 binary diff (some slightly larger, some slightly smaller).
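To put a number on the snap size change, here is a quick calculation using the rounded sizes from the listings above, so the result is only approximate. The helper name `pct_increase` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: percentage growth from an old size to a new size.
pct_increase() {
    LC_ALL=C awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_increase 173 178   # Atom, sizes in M as listed: prints 2.9
pct_increase 43 43     # LXD, sizes in M as listed: prints 0.0
```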
You can find the analyze-nofrag.sh script here: https://gist.github.com/tyhicks/0d23aa4d01bacb2e18e7f6a5a628d157
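For anyone who wants the gist of the per-snap steps without opening the link, they can be sketched roughly as follows. This is not the script itself, only an approximation under a few assumptions: squashfs-tools and xdelta3 are installed, the function names are hypothetical, and the compression and other mksquashfs options used for real snaps are left out, so the resulting sizes won’t match the numbers above.

```shell
#!/bin/sh
# Hypothetical naming helpers that mirror the file names in the results above.
nofrag_name() {
    printf '%s-nofrag.snap\n' "${1%.snap}"
}

delta_name() {
    printf '%s-to-%s.xdelta3\n' "${1%.snap}" "${2%.snap}"
}

# Rebuild one snap without fragments. Assumes squashfs-tools is installed.
resquash_nofrag() {
    snap="$1"
    work="$(mktemp -d)" || return 1
    unsquashfs -d "$work/root" "$snap" >/dev/null &&
        mksquashfs "$work/root" "$(nofrag_name "$snap")" \
            -noappend -no-fragments >/dev/null
    status=$?
    rm -rf "$work"
    return "$status"
}

# Write a binary delta from an old snap to a new one. Assumes xdelta3 is
# installed: -e encodes, -s names the source file, -f overwrites the output.
make_delta() {
    xdelta3 -e -f -s "$1" "$2" "$(delta_name "$1" "$2")"
}
```

Running `resquash_nofrag` on two consecutive revisions and then `make_delta` on both the original pair and the nofrag pair gives the four files compared in each group of the listings.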
If this small increase in snap size is deemed acceptable, I’ll propose a change to snapcraft to make use of the -no-fragments option.