Apparmor parser performance on pi3

pstolowski · August 23, 2019, 9:56am

As part of the ongoing story of measuring and improving snapd performance [1] I took a look at the performance of the apparmor parser on pi3, with focus on the pessimistic case where all apparmor profiles need to be regenerated on snapd startup due to system-key change or change to profiles themselves (this doesn’t happen very often but when it does, it can have significant impact on snapd startup time, depending on the number of snaps in the system).

For the test a pi3 with core18 has been used, and 11 random snaps from the store.

The test

To force refreshing of all profiles, I had manually modified all of them and also manually changed the system-key, then rebooted the system. snapd recognized the changes and recreated all profiles, startup timings look as follows:

$ snap debug timings --startup=ifacemgr

ID        Status        Doing      Undoing  Summary
ifacemgr                    -            -  
 ^                       27ms            -    setup security backend "seccomp" for snap "doctl"
 ^                      670ms            -    setup security backend "apparmor" for snap "doctl"
  ^                     622ms            -      load changed security profiles of snap "doctl"
  ^                      11ms            -      load unchanged security profiles of snap "doctl"
 ^                        8ms            -    setup security backend "seccomp" for snap "hello-world"
 ^                      834ms            -    setup security backend "apparmor" for snap "hello-world"
  ^                     593ms            -      load changed security profiles of snap "hello-world"
  ^                      11ms            -      load unchanged security profiles of snap "hello-world"
 ^                       88ms            -    setup security backend "seccomp" for snap "hugo"
 ^                      678ms            -    setup security backend "apparmor" for snap "hugo"
  ^                     626ms            -      load changed security profiles of snap "hugo"
  ^                      11ms            -      load unchanged security profiles of snap "hugo"
 ^                       24ms            -    setup security backend "apparmor" for snap "snapd"
 ^                      731ms            -    setup security backend "apparmor" for snap "ascii-patrol"
  ^                     683ms            -      load changed security profiles of snap "ascii-patrol"
  ^                      11ms            -      load unchanged security profiles of snap "ascii-patrol"
 ^                       81ms            -    setup security backend "seccomp" for snap "nano-strict"
 ^                      619ms            -    setup security backend "apparmor" for snap "nano-strict"
  ^                     523ms            -      load changed security profiles of snap "nano-strict"
  ^                      11ms            -      load unchanged security profiles of snap "nano-strict"
 ^                      601ms            -    setup security backend "apparmor" for snap "pi"
  ^                     458ms            -      load changed security profiles of snap "pi"
  ^                      11ms            -      load unchanged security profiles of snap "pi"
 ^                      527ms            -    setup security backend "apparmor" for snap "core"
  ^                     471ms            -      load changed security profiles of snap "core"
  ^                      11ms            -      load unchanged security profiles of snap "core"
 ^                      580ms            -    setup security backend "apparmor" for snap "htop"
  ^                     457ms            -      load changed security profiles of snap "htop"
  ^                      11ms            -      load unchanged security profiles of snap "htop"

Just for the record, unchanged profiles (if only system-key changes but apparmor profiles are the same) takes just a fraction of that:

$ snap debug timings --startup=ifacemgr

ID        Status        Doing      Undoing  Summary
ifacemgr                    -            -  
 ^                        6ms            -    setup security backend "seccomp" for snap "hello-world"
 ^                       80ms            -    setup security backend "apparmor" for snap "hello-world"
  ^                      24ms            -      load unchanged security profiles of snap "hello-world"
 ^                       32ms            -    setup security backend "apparmor" for snap "ascii-patrol"
  ^                      18ms            -      load unchanged security profiles of snap "ascii-patrol"
 ^                       34ms            -    setup security backend "apparmor" for snap "core"
  ^                      16ms            -      load unchanged security profiles of snap "core"
 ^                       32ms            -    setup security backend "apparmor" for snap "doctl"
  ^                      18ms            -      load unchanged security profiles of snap "doctl"
 ^                       22ms            -    setup security backend "apparmor" for snap "snapd"
 ^                       30ms            -    setup security backend "apparmor" for snap "htop"
  ^                      16ms            -      load unchanged security profiles of snap "htop"
 ^                       32ms            -    setup security backend "apparmor" for snap "hugo"
  ^                      18ms            -      load unchanged security profiles of snap "hugo"
 ^                       39ms            -    setup security backend "apparmor" for snap "nano-strict"
  ^                      17ms            -      load unchanged security profiles of snap "nano-strict"
 ^                       29ms            -    setup security backend "apparmor" for snap "pi"
  ^                      16ms            -      load unchanged security profiles of snap "pi"

Back to the pessimistic scenario. I had tested same scenario with manual invocation of aa_parser, using same command line arguments as snapd when loading changed profiles, e.g.

apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/etc/apparmor.d/cache --skip-read-cache --quiet /var/lib/snapd/apparmor/profiles/<profilename>

Running this via time ... over all profiles on-by-one resulted in the following timings:

real	0m0.650s
user	0m0.631s
sys	    0m0.017s

real	0m0.459s
user	0m0.439s
sys	    0m0.019s

real	0m0.631s
user	0m0.610s
sys	    0m0.019s

real	0m0.465s
user	0m0.436s
sys	    0m0.028s

real	0m0.472s
user	0m0.454s
sys	    0m0.016s

real	0m0.467s
user	0m0.443s
sys	    0m0.023s

real	0m0.463s
user	0m0.434s
sys	    0m0.028s

real	0m0.453s
user	0m0.436s
sys	    0m0.015s

real	0m0.642s
user	0m0.631s
sys	    0m0.009s

real	0m0.467s
user	0m0.451s
sys	    0m0.015s

real	0m0.469s
user	0m0.444s
sys	    0m0.023s

real	0m0.467s
user	0m0.452s
sys     0m0.013s

(note there 12 samples, not 11, because the number of profiles >= number of snaps, depending on apps/hooks).

The total time of all the invokations of aa_parser was:

real  0m6.108s
user  0m5.862s
sys   0m0.228s

To summarize this test, both forceful refresh of all profiles via snapd’s invocation of aa_parser, as well as manual invocation over all profiles on-by-one just like snapd does it gave identical results (as expected) - each invocation took approximately 0.5s, taking 6s total to process all profiles of 12 snaps.

Combined reload of all profiles

The second test is about testing a single invocation of aa_parser against all profiles at once, which is a possible optimization we can do in snapd. This obviously can only be tested manually, e.g.

time /sbin/apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/etc/apparmor.d/cache --skip-read-cache --quiet /var/lib/snapd/apparmor/profiles/snap.*

This gives:

real  0m1.979s
user  0m7.054s
sys   0m0.212s

This is over 3 times faster than when invoking aa_parser multiple times on a slow system such as pi3 (!).

What about PC

Results on a fast PC (amd64) seem to match these trends although total numbers are obviously less spectacular. With 13 snaps on my test PC (amd64 3GHz, 4 cores) it takes 1.5s to regenerate all profiles with multiple invocations of aa_parser, and 0.5s if combined into a single invocation; this will of course be amplified with more snaps on the system.

For the record, the script I used to collect the timings:

#!/bin/bash
AADIR=/var/lib/snapd/apparmor/profiles
{
time (
for AAPROF in $AADIR/snap.*; do
	time /sbin/apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/etc/apparmor.d/cache --skip-read-cache --quiet "$AAPROF"
done
) } 2> timings-total-separate.txt


( time /sbin/apparmor_parser --replace --write-cache -O no-expr-simplify --cache-loc=/etc/apparmor.d/cache --skip-read-cache --quiet $AADIR/snap.* ) > timings-total-combined.txt 2>&1

[1] Some of that brought fruit already in the form of snap debug timings ... infrastructure, as well as optimization of auto-connections on install (tha later will become available in snapd 2.42).

jdstrand · August 26, 2019, 1:44pm

This is due to apparmor_parser’s default of using all the CPUs on the system when given more than one profile. The pi3 has 4 CPUs, so apparmor will default to -j4. Setting --max-jobs=1 would make this more like processing serially (measuring the difference between fork/execs/etc). The question then becomes, for the core refresh scenario, do you want to take all the CPUs on an already constrained device? I’m guessing possibly not, but you can use the --max-jobs argument to make sure you have some overhead (eg, --max-jobs=<total CPUs - 1> or something.