Snap debug timings


#1

The upcoming 2.40 release of snapd adds new features to collect timing information for critical activities, and can output timing details with the snap debug timings command.

Timing measurements can be used to discover performance bottlenecks, target areas for improvement, and hopefully enhance upcoming releases of snapd itself.

The synopsis of the timings command is as follows:

snap debug timings [timings-OPTIONS] [<change-id>]

The timings command displays details about the time each task runs.

[timings command options]
      --last=                                                                          Select last change of given type (install, refresh, remove, try, auto-refresh, etc.). A question mark at the end of the type means to do nothing (instead of
                                                                                       returning an error) if no change of the given type is found. Note the question mark could need protecting from the shell.
      --ensure=[auto-refresh|become-operational|refresh-catalogs|refresh-hints|seed]   Show timings for a change related to the given Ensure activity (one of: auto-refresh, become-operational, refresh-catalogs, refresh-hints, seed)
      --all                                                                            Show timings for all executions of the given Ensure or startup activity, not just the latest
      --startup=[load-state|ifacemgr]                                                  Show timings for the startup of given subsystem (one of: load-state, ifacemgr)
      --verbose                                                                        Show more information

[timings command arguments]
  <change-id>:                                                                         Change ID

Execution time

At its simplest, snap debug timings displays the execution time of a given change. A change can be specified by its ID, or by using the --last argument with a well defined name for a change, such as install or refresh:

$ snap debug timings 10
ID   Status        Doing      Undoing  Summary
168  Done          788ms            -  Stop snap "lxd" services
 ^                 519ms            -    stop service "snap.lxd.daemon.service"
 ^                 259ms            -    stop service "snap.lxd.activate.service"
169  Done           10ms            -  Run remove hook of "lxd" snap if present
170  Done           19ms            -  Disconnect interfaces of snap "lxd"
171  Done        56455ms            -  Save data of snap "lxd" in automatic snapshot set #1
172  Done           19ms            -  Remove aliases for snap "lxd"
173  Done          237ms            -  Make snap "lxd" unavailable to the system
174  Done            9ms            -  Remove security profile for snap "lxd" (11098)
175  Done           10ms            -  Remove data for snap "lxd" (11098)
176  Done          569ms            -  Remove snap "lxd" (11098) from the system
177  Done          767ms            -  Remove data for snap "lxd" (10972)
178  Done          580ms            -  Remove snap "lxd" (10972) from the system
179  Done         1130ms            -  Disconnect lxd:lxd-support from core:lxd-support
 ^                 842ms            -    setup security backend "seccomp" for snap "lxd"
 ^                 266ms            -    setup security backend "apparmor" for snap "lxd"
  ^                181ms            -      load changed security profiles of snap "lxd"
 ^                   5ms            -    setup security backend "apparmor" for snap "core"
180  Done          560ms            -  Disconnect lxd:system-observe from core:system-observe
 ^                 202ms            -    setup security backend "seccomp" for snap "lxd"
 ^                  15ms            -    setup security backend "mount" for snap "lxd"
 ^                 324ms            -    setup security backend "apparmor" for snap "lxd"
  ^                220ms            -      load changed security profiles of snap "lxd"
 ^                   6ms            -    setup security backend "apparmor" for snap "core"
181  Done          180ms            -  Disconnect lxd:network-bind from core:network-bind
 ^                 102ms            -    setup security backend "seccomp" for snap "lxd"
 ^                  56ms            -    setup security backend "apparmor" for snap "lxd"
  ^                 41ms            -      load changed security profiles of snap "lxd"
  ^                  8ms            -      load unchanged security profiles of snap "lxd"
 ^                   5ms            -    setup security backend "apparmor" for snap "core"
182  Done          182ms            -  Disconnect lxd:network from core:network
 ^                 101ms            -    setup security backend "seccomp" for snap "lxd"
 ^                   8ms            -    setup security backend "mount" for snap "lxd"
 ^                  53ms            -    setup security backend "apparmor" for snap "lxd"
  ^                 40ms            -      load changed security profiles of snap "lxd"
  ^                  7ms            -      load unchanged security profiles of snap "lxd"
 ^                   6ms            -    setup security backend "apparmor" for snap "core"

“^” denotes nested timing details beneath a parent process, helping to isolate problems.

Ensure activities

Alongside a change’s execution time, a set of snapd internal activities, called ensure activities, are also tracked.

Ensure activities are activities that are performed by snapd on a regular basis, and they occassionally generate new tasks, as changes, to continue an initial action.

Tracking the execution times of such ensure activities can be useful. For example, auto-refresh involves multiple processes for various installed snaps, including connecting to the Snap Store, and timing information can help troubleshoot refresh issues.

Use the --ensure= argument to see the timing details for a specific ensure activity, for example:

snap debug timings --ensure=auto-refresh
ID            Status        Doing      Undoing  Summary
auto-refresh                    -            -  
 ^                          432ms            -    query store and setup auto-refresh change
146           Done           12ms            -  Ensure prerequisites for "lxd" are available
147           Done         1257ms            -  Download snap "lxd" (11098) from channel "stable"
 ^                         1236ms            -    download snap "lxd"
148           Done          335ms            -  Fetch and check assertions for snap "lxd" (11098)
149           Done          377ms            -  Mount snap "lxd" (11098)
 ^                           36ms            -    check snap "lxd"
 ^                          318ms            -    setup snap "lxd"
150           Done           11ms            -  Run pre-refresh hook of "lxd" snap if present
151           Done          795ms            -  Stop snap "lxd" services
 ^                          262ms            -    stop service "snap.lxd.activate.service"
 ^                          521ms            -    stop service "snap.lxd.daemon.service"
152           Done           29ms            -  Remove aliases for snap "lxd"
153           Done          223ms            -  Make current revision for snap "lxd" unavailable
154           Done           99ms            -  Copy snap "lxd" data
155           Done          403ms            -  Setup snap "lxd" (11098) security profiles
 ^                          364ms            -    setup security backend "apparmor" for snap "lxd"
  ^                         254ms            -      load changed security profiles of snap "lxd"
 ^                            7ms            -    setup security backend "apparmor" for snap "core"
156           Done          302ms            -  Make snap "lxd" (11098) available to the system
 ^                          254ms            -    generate wrappers for snap lxd
157           Done           18ms            -  Automatically connect eligible plugs and slots of snap "lxd"
158           Done           20ms            -  Set automatic aliases for snap "lxd"
159           Done           20ms            -  Setup snap "lxd" aliases
160           Done           14ms            -  Run post-refresh hook of "lxd" snap if present
161           Done          985ms            -  Start snap "lxd" (11098) services
 ^                            6ms            -    start socket service "snap.lxd.daemon.unix.socket"
 ^                          963ms            -    start service "snap.lxd.activate.service"
162           Done           24ms            -  Remove data for snap "lxd" (10934)
163           Done          542ms            -  Remove snap "lxd" (10934) from the system
164           Done            0ms            -  Clean up "lxd" (11098) install
165           Done          246ms            -  Run configure hook of "lxd" snap if present
166           Done          353ms            -  Consider re-refresh of "lxd"

The first item in the output, and the timing nested directly after it, corresponds to the “auto-refresh” ensure activity logic. Following these, the remaining timings with numerical identifiers are tasks spawned by that logic to refresh a particular snap (lxd in this example).

Startup times

The startup timing data for two subsystems, load-state and ifacemgr, are also tracked and can be output with the --startup argument:

snap debug timings --startup=ifacemgr
ID        Status        Doing      Undoing  Summary
ifacemgr                    -            -  
 ^                        8ms            -    setup security backend "apparmor" for snap "core"
 ^                        7ms            -    setup security backend "mount" for snap "lxd"
 ^                       16ms            -    setup security backend "apparmor" for snap "lxd"
  ^                      12ms            -      load unchanged security profiles of snap "lxd"
 ^                        7ms            -    setup security backend "apparmor" for snap "vlc"
  ^                       5ms            -      load unchanged security profiles of snap "vlc"

The above example shows execution times related to snapd’s interface manager, which includes the time to set up the security profiles for all installed snaps and their interfaces before snapd becomes operational.

Implementation details

Finally, it is worth noting that:

  • timing measurements are stored with snapd’s state, making them persistent across reboots
  • execution times of less than 5ms are not recorded
  • snapd stores up to 100 measurements with the oldest dropped to make room for new measurements
  • some timing infrastructure exists in snapd 2.39.x, but snapd 2.40 completes the implementation
  • timings, alongside other snap debug commands, are for manual use and introspection only. They should not be used for scripting or automation as their format may change without notice. This applies to the REST API of snap debug as well.
  • timing measurements are taken for specific areas that we consider critical and worth capturing. Further areas may be considered for future releases.

#2

This is really useful, I just ran it on one of the larger snaps we have and one thing that would be useful is a way to have it sort the task IDs based on the time each one took.
Additionally have a more easily parse-able output that is yamlish would be helpful


#3

Also, does this feature require that the version of the core snap be 2.40 when the change happens, or is it sufficient for the change to have happened when the version of the core snap was 2.39?


#4

2.39 was already collecting some timings internally, so yes, you should see them for changes originating from 2.39, however there were some fixes and additions related to this feature after 2.39 release (for example a few new spots in the code where such timings are collected), therefore it’s becoming “official” with 2.40.

Remark about yamllish output noted, we can consider this.


#5

Another feature that would be really nice to have with this when doing performance testing is an easy way to do diffs between similar changes (i.e. same kind of change for different versions of the same snap or even different versions of snapd)