When the seccomp backend writes out the security profile, this profile is currently a text file of the form:
# enforce pid_t is 0 so the app may only change its own scheduler and affinity.
# Use process-control interface for controlling other pids.
sched_setaffinity 0 - -
sched_setparam 0 -
When snap-confine runs it read this text-file and uses a internal parser and a symbol database to resolve the text and load the generated bpf program via libseccomp.
This approach has some problems:
the snap-confine that read the text file needs to be in sync with the code that generates the text file. New syntax or symbols make snap-confine break (e.g. this race).
snap-confine has code to deal with the mapping of symbols to names and to parse the syntax, all in a suid root binary
the bpf code is compiled every time snap-confine runs
An alternative approach would be:
snapd itself writes the bpf binary representation for snap-confine (either via a tiny wraper to libseccomp or by calling an external helper)
snap-confine is simplified to just read the binary bpf file and load that into the kernel
The advantage is that we can introduce new syntax/symbols in our seccomp text file at any time and snap-confine still will be able to load the profile (because it just hands it to the kernel). Also snap-confine gets simpler and we can remove our map implementation. The downside is that the binary profiles on disk are no longer readable by a human.
The only binary profiles on disk issue is a major limiting factor to this approach as described because it is important for people to be able to examine/audit the profiles and important for developers to be able to modify them. If we look at apparmor we see that it already does something similar: it has /var/lib/snapd/apparmor/profiles/snap… and /var/cache/apparmor/snap… The files in /var/cache/apparmor are binary files that are loaded straight into the kernel, and the files in /var/lib/snapd/apparmor/profiles are the textual representations like always.
I think the BPF binary idea is interesting and we can be inspired by what apparmor does:
snapd writes out human readable profiles in /var/lib/snapd/seccomp/profiles like it does now
snapd compiles these to /var/cache/snapd/seccomp (or similarly)
snap-confine only ever loads /var/cache/snapd/seccomp/… and if it doesn’t exist, exit with error
We need to make sure that the binary files are never stale and we need to provide a tool for the developer to compile updated policy to /var/cache/snapd/seccomp, but if we do these things right, the BPF approach should be more robust while being auditor/developer friendly (enough).
I agree with Jamie, I think this is exactly what we should do. The existing parser in snap-confine can be relatively easily moved to a non-setuid root binary that handles parsing and manages the cache. If there are no disagreements I can pick this up next week.
I like the idea of Jamie to model this after what apparmor is doing. What is exciting about this is that we can write most of the snap-confine seccomp handling directly in go and only a tiny bit of code in snap-confine would load the bpf profile. I started with the seccomp->bpf bits in https://github.com/snapcore/snapd/compare/master...mvo5:seccomp-bpf?expand=1 as an experiment and I am very happy so far.