FYI this is a development post about snapd internals and isn’t probably interesting if you don’t work on snapd itself…
Currently, we have at least 3 situations where snapd code will interact with systemd to control services/daemons.
- When installing/removing snaps and we have to standup the systemd units and start services for the first time (and stop them when it comes time to update).
- When a snap user is managing services and runs something like
- When a snap itself (i.e. the snap author writes this) wants to manage it’s own service and runs something like
The big problem is that while the codepath for 2 and 3 is pretty much the same (they both end up back in servicestate overlord and create
service-control changes with
exec-command tasks), the codepath for 3 is completely different and goes straight to the wrappers/systemd package without creating any sub-tasks, etc. This means we have tasks like
stop-snap-services in the snapstate manager which operates in a completely different manner from
service-control in the servicestate manager w/respect to how things are told to stop.
There is a desire to unify the implementations to make them follow mostly the same codepath, at least when it comes to working with systemd using the
systemctl command because of various wrinkles in working with systemd. Exacerbating this situation is the disparity in how the 2 methods are tested, for one we have a
systemctl function which is run in Go and needs to handle the args passed to it via Go function mocking, while for other tests we have to mock
systemctl as a shell function on the system because running that binary is hard-coded inside wrappers/systemd package.
Externally, mechanisms 2 and 3 will create a
service-control change with
exec-command tasks with the systemctl binary + args just passed in literally to the
exec-command task. This is unfortunate because we already have many high level functions in the systemd package which handle this in a more high level and are more consistently mocked, etc. Mechanism 1 however has a few different changes (i.e.
refresh-name, etc.) that create
start-snap-services tasks which call into the aforementioned systemd package. These tasks are fine right now AFAICT.
Deep dive into the codepaths
This can be skipped if familiar with the code, but is here for posterity
Specifically for 1, in cmd_snap_op.go,
cmdInstall.installOne will call into
client.Install, which eventually posts to
/v2/snaps/<name> on /run/snapd.socket. From there the daemon picks up in daemon/api.go with
postSnap and eventually will get to
snapstate.Install, which after running around a bit will call
stopSnapServices which itself will go call backend.StopServices, … which snakes down into
wrappers.StopServices and finally builds up the systemctl args using wrappers/systemd.go
Specifically for 2, in cmd_services.go,
svcStop.Execute will call into
cli.Stop, which posts to
/v2/apps on /run/snapd.socket. From there the daemon picks up in daemon/api.go with
postApps and will work next into
servicestate.Control which finally will build up the systemctl args and run them with
Specifically for 3, in cmd/snapctl/main.go,
run() will call into
cli.Runsnapctl which posts to
/v2/snapctl on /run/snapd-snap.socket. From there the daemon picks up in daemon/api.go with runSnapctl and will work next into
ctlcmdRun, which then winds back into stopCommand.Execute from ctlcmd package, and finally ends up at the same place with
servicestate.Control and then follows 2 from this point.
I propose the following steps to alleviate the situation:
service-controlchanges no longer create
exec-commandtasks and instead create
stop-snap-servicestasks (possibly with enable a sub-case of
start-snap-servicesand disable a sub-case of
stop-snap-servicesstart tracking internally to the task what services are to be started/stopped, etc. (implementation detail: if this isn’t specified in the task, then do the right thing and assume “all” with the current behavior so that we can leave as much as possible in snapstate_test.go well enough alone )
- Refactor relevant tests to use “the one way” to mock systemctl as needed (probably the mock shell script way I think)
Another alternative which would probably be more work would be to:
- Remove the
stop-snap-servicestasks altogether, replacing them with a (new) kind of task called
service-manageor something specific to managing services
- Refactor the
service-controlchange to use this new
There may indeed be more alternatives, but this is what I’ve come up with.