Expose a more consistent subset of systemd's service directives

hcochran · September 27, 2017, 10:47pm

Ditto. We do not like running our services as root but wish to use a system user. I am neutral about whether all we need is support for the User= and Group= directives, vs. launching an instance of systemd --user. The former seems simpler.

hcochran · September 27, 2017, 10:48pm

Systemd has explicit support for that in the form of the ConditionACPower= directive.

jdstrand · September 28, 2017, 2:08am

I’m in the process of allowing snaps to privilege drop to the ‘daemon’ user/group. After that, we’ll be implementing support for using other snapd-managed users. Perhaps User= and Group= could be added after that, but I’d prefer these directives not be added until this other feature is implemented.

jamesh · September 28, 2017, 12:29pm

My particular use case would be desktop session services, so the latter is what I’m after personally. For that use case, it isn’t so much about running as a non-root user as running in the context of the desktop session (and potentially having multiple instances if there are multiple desktop sessions).

chipaca · September 28, 2017, 1:46pm

@hcochran this is exactly the sort of feedback I was hoping for, so thank you very much for sharing your findings. I’ll be discussing your post here and replying in detail later in the day, but didn’t want to let more time pass without saying thank you.

chipaca · September 28, 2017, 8:49pm

Just to update y’awl on where we stand, we’re currently struggling with Wants/Requires/PartOf/BindsTo, trying to find a sane way to expose these. Our problem with the names systemd uses for them is that we find it impossible to remember which directive has which exact semantics, and don’t want to pass this same feature to the snapd world, if at all possible.

Before and After are fine, and clear, and I think uncontroversial.

Conditions, yes we should expose all the relevant ones. Probably export it as a conditions map inside an app, to keep it neat, and probably implemented using the AssertFoo instead of the ConditionFoo, so we have logs (let us know if there’s a reason not to do things this way).
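
For readers less familiar with the distinction, here is a minimal sketch in plain systemd unit syntax (not snap syntax, which was still undecided at this point in the thread). Per the systemd.unit documentation, a failed Condition*= check causes the unit to be skipped without the start job failing, whereas a failed Assert*= check fails the start job and is logged loudly. The description text is made up for illustration.

```ini
[Unit]
Description=Hypothetical service that should only run on mains power

# Condition form: if the machine is on battery, the unit is skipped and
# the start job is not treated as a failure.
ConditionACPower=true

# Assert form: the same check, but a miss fails the start job and is
# logged as an error. In practice you would pick one form or the other.
# AssertACPower=true
```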

Environment is already supported; we don’t think there’s value in additionally supporting EnvironmentFile, unless there’s an actual use case for it where the file lives in the snap’s data directory and is modified? It sounds rather far-fetched, so let us know if this is the case. If you’re using it to read a file in the snap, snapcraft should be able to pack an environment file into environment stanzas (but this isn’t done yet). If you’re using it to read a file outside of the snap entirely, we wouldn’t want to support that (at least not without a pertinent interface, and that would need further design).

Conflicts and PropagatesReloadTo fit into the bigger conversation we’re having about Requires and etc.

@hcochran is PartOf really useful on its own in practice, given that you can’t really define targets? Or would you also need targets for it to be useful? (I don’t think we want to allow targets – but on the other hand if a dev needs them they’ll end up simulating them with an app that does nothing, so maybe we do).

cratliff · September 29, 2017, 3:25pm

Is the issue just the naming convention? I can understand the desire to simplify wherever possible, but coming from systemd it would likely be more confusing to have to remember the mapping from systemd names to snap names, and then, when debugging the service, to remember the opposite direction to work out what to change in the snapcraft.yaml file. For a more complex system, a transparent passthrough would likely be much less confusing.

That’s pretty close to the situation we are in. We are adding the targets ourselves after the snap installs. This is currently one of the things preventing us from moving to a confined snap. What is the reason for not wanting to allow targets?

hcochran · September 29, 2017, 4:00pm

Systemd is extremely well-documented and is a de facto standard in the Linux world (previous controversies notwithstanding). Therefore, I believe the meanings of these things will only become more widely known with time. I strongly agree with cratliff that mapping systemd names to a different set of names would exacerbate, not help, the problem. (This comment also applies to the naming of the Condition* directives, although I am unaware how renaming them to Assert* is connected with logging.)

I also think that WantedBy, Wants, RequiredBy, Requires are in very common use with meanings that are fairly intuitive. That leaves BindsTo and PartOf which, while less obvious, are also very useful. In fact, I think BindsTo is basically essential for any product that has hardware which may come and go dynamically and for which you need associated software to start and stop when this happens. It seems to me that this would apply to many embedded, robotics, & IoT-type devices.

The way it works is this: Gadget snap would install a udev rule like this one:

SUBSYSTEM=="usb", ATTRS{idVendor}=="BEEF", MODE:="0666", SYMLINK+="AwesomeCamera", TAG+="uaccess", TAG+="udev-acl", TAG+="systemd", ENV{SYSTEMD_ALIAS}="/dev/AwesomeCamera"

This causes a dynamically-generated systemd unit called “dev-AwesomeCamera.device” to activate whenever this USB device appears and to deactivate when it goes away.

Then, we have a service (in our case a ROS node) that will start and stop automatically when this device appears or disappears by adding this to its service file:

[Unit]
BindsTo=dev-AwesomeCamera.device

Without BindsTo=, we would have to use some out-of-band way to notice the removal of the device and stop the corresponding service. This involves some wheel reinvention and may even require polling.
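
As a fuller sketch of the pattern described above: BindsTo= on its own covers the "stop when the device goes away" half, and something still has to pull the service in when the device unit appears. One common way to do that is a WantedBy= on the device unit, as assumed below. The description and the ExecStart path are placeholders, not taken from the original setup.

```ini
[Unit]
Description=Hypothetical camera node
# Stop this service when the device unit goes away.
BindsTo=dev-AwesomeCamera.device
# Combined with After=, the service may only be active while the device is.
After=dev-AwesomeCamera.device

[Service]
# Placeholder command, not from the original post.
ExecStart=/usr/bin/camera-node

[Install]
# Pull the service in whenever the device unit becomes active.
WantedBy=dev-AwesomeCamera.device
```

The [Install] section only takes effect once the unit is enabled. An alternative with the same effect is to add ENV{SYSTEMD_WANTS} naming the service to the udev rule itself.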

Does that clarify how needed this may be?

Thanks, very much, for considering our feedback.

hcochran · September 29, 2017, 4:13pm

This is exactly the case that we have. We have a config file, e.g. $SNAP_DATA/ros.env, that sets some environment variables like this:

VAR1=value1
VAR2="other value"

These configuration values vary from machine to machine. Our snap install hook creates a default version of this file which can be modified later, either by hand editing or by downloading a different configuration from our fleet management software.

Most of our configuration files are .yaml also loaded from $SNAP_DATA. But some features of the upstream software we use (ROS) can only be influenced by environment variables.

While I think EnvironmentFile= is important, we can at least work around this using a custom wrapper script for all of our services. While messy, the possibility of a non-confinement-breaking workaround makes this directive less essential, for us, than some of the others under discussion (i.e. all of the dependency and ordering directives, Conflicts, Condition*).
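
For reference, this is roughly what the directive looks like in a raw systemd unit, independent of anything snapd generates. It is only a sketch: the snap name "my-robot" and the ExecStart path are hypothetical, and the path is spelled out because systemd does not know about $SNAP_DATA, which resolves to a directory under /var/snap/<snap-name>/.

```ini
[Service]
# Hypothetical snap name "my-robot". The leading "-" tells systemd to
# carry on without error if the file does not exist.
EnvironmentFile=-/var/snap/my-robot/current/ros.env
# Placeholder command for illustration only.
ExecStart=/snap/my-robot/current/bin/ros-node
```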

Thanks

hcochran · September 29, 2017, 4:33pm

We only use PartOf= to specify that a service is part of a target, where this directive is needed so that a target both starts and stops its dependent services (rather than only starting them, which is the default).

The concept of targets is very important when a large snap represents basically a “whole system” rather than a single application. Such may typically be the case when the system is an IoT, embedded, or robotic device rather than a server or desktop computer. For devices like ours, there are major modes that cause components of the system to start or stop together. For us, targets include things like:

initial-configuration.target - robot navigation and application-level software disabled; contacting fleet management for configuration
maintenance.target - robot navigation and application-level software disabled; ROS nodes for calibration and diagnostics are launched
robot-base.target - robot navigation running but application-level software disabled
normal.target - Full application stack is running. This one Requires=robot-base.target but would have Conflicts=low-power.target and Conflicts=initial-configuration.target, for example
low-power.target - Starts a service which commands certain hardware to power off (such as a USB hub). Anything that uses lots of power Conflicts= with this.

OK, hopefully that illustrates a good case for targets. However, it is possible to work around the lack of targets using “empty services” as you mentioned, meaning that we can get what we need without breaking confinement even if snap does not support targets.

However, we cannot get what we need without PartOf= (at least without walking down a wheel-reinvention road where we use scripts to try to do what systemd would have done for us).

Therefore, PartOf= is more “essential” for us than explicit support for targets, even though its main use case is in the context of targets!
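
To make the start/stop split concrete, here is a minimal sketch of the two units involved, shown together in one block although they would be separate files. The service name and command are hypothetical. Wants= on the target handles starting; PartOf= on the service is what propagates a stop or restart of the target back to the service.

```ini
# --- robot-base.target ---
[Unit]
Description=Robot navigation running
# Starting the target also starts the service.
Wants=navigation.service

# --- navigation.service (hypothetical name) ---
[Unit]
Description=Navigation stack
# Stopping or restarting the target stops or restarts this service too.
PartOf=robot-base.target

[Service]
# Placeholder command for illustration only.
ExecStart=/usr/bin/navigation
```

With the "empty service" workaround, the target would be replaced by a service that does nothing, and PartOf= would point at that service instead.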

Thank you so much for soliciting and considering our feedback. Ubuntu Core + snaps are very useful!

niemeyer · September 29, 2017, 6:06pm

The idea there isn’t being different just for the sake of being different. In many cases we kept the pristine names of systemd, as for example in the socket activation feature that we’re implementing support for we’ll have listen-stream. It wouldn’t be my first choice of terminology, but we preserved it precisely to help mapping between the two worlds.

But the case here is different, in the sense that systemd has very poorly designed terminology around those ideas. Just consider:

Requires
Wants
BindsTo
PartOf
PropagatesReloadTo
ReloadPropagatedFrom

All of these are essentially about starting and stopping services on certain events. Yet, we have mixed up the ideas of binding, parts, propagation, soft and hard dependencies, and so on. And this is not only very confusing, but it also burns down terminology in the sense that once we have a part-of term in the stanza with a given meaning, for example, a proper design needs to burn the term because we don’t want multiple uses of that next to each other.

Those terms are also not self-describing, and I commonly see experienced people digging down in the documentation to re-learn what they mean, even for the better cases such as Requires. See for example this question on stackexchange which was viewed six thousand times since it was asked 11 months ago. There are many of those.

So… here is a proposal:

Let’s start with a small set of options that covers the common cases we know about in terms of inter-dependency between services. Some good candidates we discussed:

starts-with: <other> | [<other>, …]
stops-with: <other> | [<other>, …]
runs-with: <other> | [<other>, …]

Update: As noted much later below, it’s been a while but I’ve been thinking of “starts-with” and “stops-with” for reasons unrelated to snapd, and we’ve settled for “requires” as the term, as it makes the directionality of the dependency clear, and the implied semantics are easier to grasp.

The first two will be mapped to Requires, Wants, or PartOf as appropriate to obtain the intended semantics. The last one, runs-with, leverages BindsTo, which unfortunately is the only option that includes the idea of exiting together with another service’s self-termination. Were it not for that, we might also have an independent exits-with option.

Then, as a follow up step, let’s look into how to properly map the support for targets into snaps in a way that is both safe and convenient. We probably won’t need new terminology for that, other than perhaps something to define the targets themselves. We do need to consider the issue of confined interaction with the system, and whether we want snaps to operate on a global namespace or individual namespace, and if global, how to maintain sanity on that namespace.

In either case, from what I understand of your description, all of those issues are points we want to make work, so thanks for engaging with us and let’s push it forward.

kyrofa · October 24, 2017, 4:32pm

Hey all! This thread has been an excellent discussion, and I want to ensure it continues beyond the rally. Have we managed to make any headway, here?

chipaca · October 24, 2017, 4:44pm

I’m slowly breaking it down. I’ve got half of the conditions PR done, but it’s stuck behind some higher priority work for now.

kyrofa · October 24, 2017, 5:30pm

Thanks for the update, @chipaca .

awe · October 26, 2017, 7:25pm

Just wanted to second the point made by @morphis earlier in the thread about WatchdogSec= being important. We’ve been asked about future (ie. 18.04) support for application based watchdog timers by one large OEM in particular.

@chipaca will your initial work include After= and Before= conditions? Also, any thoughts as to which version of snapd this will land in?

jeremyarr · January 16, 2018, 5:08am

+1 to service ordering as well. I have a use case where a snap has an mqtt broker service and a number of services that connect to the mqtt broker. Would be great if I could start the dependent services only after the mqtt broker service is started.
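
In raw systemd terms, the ordering being asked for would look something like the sketch below. The unit names are hypothetical, and this is plain unit syntax rather than whatever form snapd ends up exposing.

```ini
# --- sensor-client.service (hypothetical name) ---
[Unit]
Description=Service that connects to the MQTT broker
# Pull the broker in when this service is started...
Requires=mqtt-broker.service
# ...and wait for the broker's start job to finish before starting.
After=mqtt-broker.service
```

Note that After= only waits until systemd considers the broker started. For a simple service that means the process has been launched, not that it is already accepting connections, so clients may still need to retry unless the broker signals readiness itself.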

niemeyer · January 16, 2018, 7:26am

Work on this feature is already under way. It will be part of the next release or the one after it in the worst case.

jeremyarr · January 18, 2018, 1:39am

Great, I also +1 the request for watchdog support. Currently if a start rate limit has been reached for a service, the service is put in a permanently failed state and no remedial action is possible. I’d love the option to be able to reboot the system under this scenario. Such a feature is particularly important for unattended consumer IoT applications.
The current workaround of editing the unit files through a configure hook during installation obviously breaks confinement as @hcochran has noted.
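
For context, this is roughly what the requested behaviour looks like when written directly in a unit file. It is a sketch only: the numbers and the command are made up, and the directive names and sections follow recent systemd releases (older releases place and spell the start-limit settings differently).

```ini
[Unit]
# Allow at most 5 start attempts within 60 seconds...
StartLimitIntervalSec=60
StartLimitBurst=5
# ...and reboot the machine if that limit is hit.
StartLimitAction=reboot

[Service]
Restart=on-failure
# Expect a keep-alive from the service at least every 30 seconds.
WatchdogSec=30
# Placeholder command for illustration only.
ExecStart=/usr/bin/my-daemon
```

WatchdogSec= only helps if the service itself sends keep-alive notifications (WATCHDOG=1 via sd_notify); otherwise systemd will treat it as hung once the interval passes.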

mborzecki · January 19, 2018, 8:21am

Took a quick stab at watchdog support https://github.com/snapcore/snapd/pull/4504 It seems that due to security concerns this will require some additional work though.

ribalkin · January 26, 2018, 9:24pm

Is there any branch with these systemd improvements (post command, start timeout…)?