Indeed, leveraging systemd is the best path forward, as we don’t want to reinvent cron and deal with all the potential issues that come with it (retries, failures, logging, etc.).
I’m pretty sure we can map our syntax into the native syntax of systemd: for all the possibilities of our syntax, systemd’s is still more complex and comprehensive. That said, even if that turns out not to be possible, we can still leverage systemd by implementing a simple test command that verifies whether now is a good time to run a given timer, and then calling it from systemd’s timer itself as a condition to run the actual command.
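To make the "simple test command" idea concrete, here is a minimal sketch of the check such a command might perform. The function name `in_window` and the inclusive handling of the end of the window are assumptions for illustration only, not anything snapd implements.

```python
from datetime import time


def in_window(now: time, start: time, end: time) -> bool:
    """Return True if `now` falls inside the window [start, end].

    Hypothetical helper: a window whose end is earlier than its start
    is treated as wrapping past midnight.
    """
    if start <= end:
        return start <= now <= end
    # The window wraps around midnight, so it covers the late evening
    # and the early morning of the following day.
    return now >= start or now <= end
```

systemd would fire the timer on some fixed schedule, and the command would exit without doing anything whenever a check like this returns False.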
For the syntax, I think we can go simply with “timer” for the field name.
I’ve spent some time trying to figure out how to map our syntax to whatever systemd.timer(5) and systemd.time(7) support. I think we have a conceptual problem, and only a subset of the supported syntax & functionality can be mapped to systemd.
The main problem I see is that our syntax describes either discrete time points (eg. mon,14:00) or time spans (eg. mon-wed,12:00-14:00). The intention is that the action associated with the timer happens just once in the span (sub-span). For instance:
mon,14:00 (actually mon,14:00-14:00) - the event happens once on Mondays at 14:00
mon,12:00-14:00 - once on Mondays between 12:00 and 14:00
mon-wed,12:00-14:00 - once between 12:00 and 14:00 on each of Monday, Tuesday and Wednesday
12:00-14:00/2 (logically 12:00-13:00,13:00-14:00) - once between 12:00 and 13:00, and again between 13:00 and 14:00
wed2 - once on the 2nd Wednesday, at 0:00
23:00-01:00 - daily, once between 23:00 and 01:00 the next day
23:00~01:00 - daily, once at randomly chosen time between 23:00 and 01:00 the next day
11:00-12:00 - if the event didn’t happen before and it’s 11:23, the event will happen now, the next one on the next day at 11:00
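The span/N notation in the list above can be read as plain arithmetic on the span. A minimal sketch, assuming spans are given as HH:MM strings; `split_span` is a hypothetical helper, not part of snapd:

```python
from datetime import datetime, timedelta


def split_span(start: str, end: str, parts: int):
    """Split the span start-end into `parts` equal sub-spans.

    Hypothetical helper; the action is meant to happen once in each
    returned sub-span.
    """
    fmt = "%H:%M"
    s = datetime.strptime(start, fmt)
    e = datetime.strptime(end, fmt)
    if e < s:
        # A span such as 23:00-01:00 ends on the following day.
        e += timedelta(days=1)
    step = (e - s) / parts
    return [
        ((s + i * step).strftime(fmt), (s + (i + 1) * step).strftime(fmt))
        for i in range(parts)
    ]
```

With this reading, `split_span("12:00", "14:00", 2)` gives the two sub-spans 12:00-13:00 and 13:00-14:00.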
For comparison, systemd.time syntax describes discrete time points only (assuming that we use OnCalendar=). The action happens on each event. To illustrate (note I’m trying to use similar times as above):
Mon 14:00 - the event happens on Mondays at 14:00
Mon 12..14:00:00 - the event happens on Mondays at 12:00, 13:00, 14:00
Mon 12..13:*:00 - there’s an event each minute between 12:00 and 13:59 on Mondays
Mon 12..14/2:00:00 - an event at 12:00 and another one at 14:00 on Mondays
Wed *-8..14 - once on the 2nd Wednesday, at 0:00
23..01:00:00 - does not parse
23:00 & RandomizedDelaySec=7200, at 23:00 + randomized delay between 0 and 2h
11..12:00:00 - if the event didn’t happen before and it’s 11:23, the event will happen at 12:00, the next one on the next day at 11:00
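The Wed *-8..14 entry works because the 2nd Wednesday of any month always falls on a day between the 8th and the 14th. A quick sanity check of that claim with the standard calendar module (`second_wednesday` is a hypothetical helper used only for this check):

```python
import calendar
from datetime import date


def second_wednesday(year: int, month: int) -> date:
    """Return the date of the 2nd Wednesday of the given month."""
    # monthcalendar() returns weeks as lists of day numbers, with 0
    # for days that belong to the neighbouring month.
    wednesdays = [
        week[calendar.WEDNESDAY]
        for week in calendar.monthcalendar(year, month)
        if week[calendar.WEDNESDAY] != 0
    ]
    return date(year, month, wednesdays[1])


# The day of month is always within 8..14, so a weekday constraint
# combined with a day range of 8..14 selects exactly this day.
for year in range(2017, 2030):
    for month in range(1, 13):
        assert 8 <= second_wednesday(year, month).day <= 14
```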
I looked at the other [Timer] options, such as OnActiveSec and OnUnitActiveSec, but those don’t seem to be usable for the mapping. OnActiveSec specifies a delay relative to the moment the timer unit itself was activated. OnUnitActiveSec is the delay from when the unit activated by the timer was last activated.
My feeling is that we will have to limit the timer services to specify timers that can be mapped to discrete events.
In other words:
support only single hour:minute (with exception, read on)
day spans are supported, though: spans wrapping around a week (actually all day spans) will be mapped to a list of days, eg. fri-mon -> Fri,Sat,Sun,Mon.
randomized time range, eg: 12:23~14:00 (I think it’s useful if the action would result in poking some remote machine)
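The day-span mapping mentioned above (fri-mon -> Fri,Sat,Sun,Mon) could be sketched as follows. `expand_day_span` is a hypothetical name, and the sketch assumes three-letter lowercase day names on input:

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def expand_day_span(span: str) -> str:
    """Expand a day span such as 'fri-mon' into a systemd-style day list.

    Hypothetical helper; a single day such as 'mon' is treated as a
    span of length one.
    """
    first, _, last = span.partition("-")
    names = [d.lower() for d in DAYS]
    i = names.index(first.lower())
    j = names.index(last.lower()) if last else i
    # The modulo handles spans that wrap around the end of the week.
    count = (j - i) % 7 + 1
    return ",".join(DAYS[(i + k) % 7] for k in range(count))
```

The result can be used as the weekday part of an OnCalendar= expression.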
TODO: is 10 minutes good enough granularity for ‘trying’ to run?
snapd generates a timer data file under /var/lib/snapd/timer/<snap>.<app>.json; the contents need to include the timer spec, so that the end consumer does not need to parse snap info (this avoids dependencies and binary bloat)
snapd ships a /usr/lib/snapd/snap-timer
snap-timer when run by systemd will parse data in /var/lib/snapd/timer/<snap>.<app>.json files, get the current timer spec, find out when the service last ran and start it if needed
runs snap run <snap>.<app> as a child process
TODO: how about snap run --timer <snap>.<app>, which in turn runs snap run <snap>.<app>? (probably easier to do in the current code).
snapd generates a snap.<snap>.<app>.service file for the timer service, eg:
if the timer is an interval, snap-timer can stay ‘running’ for as long as it’s inside the ‘active’ time, eg. timer is 10:00-11:00, systemd starts snap-timer at 10:00, it can stay running until 11:00.
if current time is outside of the timer spec, snap-timer will exit immediately
runaway timer policy is do nothing, we can’t tell how long the service should run, so it’s probably best to avoid killing it, perhaps log a message
reporting is not much different from the usual systemctl status snap.<snap>.<app>, systemctl list-timers will show when the snap.<snap>.<app>.timer (so the actual service that does the gatekeeping) got activated last and when it will be run next
We probably don’t need that. If we use Persistent=true (which is a good idea either way) systemd will run once after a missed window, so we can simply schedule at the earliest time of the range (10:00 in the example) and expect systemd to call it on misses so we can check if we’re still inside the range.
We probably don’t need any further data other than the timer unit itself. Consider something like this:
ExecStart=snap run --timer <timer spec> ...
This would only run when the timer specification matches the current time, without any contact with the daemon or any other data file. For randomized windows, we can hardcode the random time inside the systemd timer itself. For example, for 10:00~12:00, when writing down the timer unit find a random time between 10:00 and 12:00 (minus some padding from the end), say, 10:37, and write that down in the systemd timer so it calls it at that time on that machine, until written again with a different time.
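Picking the fixed time for a randomized window when the timer unit is written could look roughly like this. It is only a sketch: `pick_random_time`, the 5 minute default padding and the minute granularity are all assumptions made for illustration.

```python
import random
from datetime import datetime, timedelta


def pick_random_time(start, end, padding_minutes=5, rng=None):
    """Pick a random HH:MM inside start~end, keeping clear of the end.

    Hypothetical helper, meant to be called once when the timer unit
    is generated, so the chosen time stays fixed on a given machine
    until the unit is written again.
    """
    rng = rng or random.Random()
    fmt = "%H:%M"
    s = datetime.strptime(start, fmt)
    e = datetime.strptime(end, fmt)
    if e < s:
        # A window such as 23:00~01:00 ends on the following day.
        e += timedelta(days=1)
    window = int((e - s).total_seconds() // 60) - padding_minutes
    offset = rng.randint(0, max(window, 0))
    return (s + timedelta(minutes=offset)).strftime(fmt)
```

For 10:00~12:00 this yields some time between 10:00 and 11:55, which would then be written into OnCalendar=.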
No need as well given those ideas, I think.
Given the strategy above, I’m hoping we can simply run the actual snap run command, and it bails silently if it’s not an appropriate time. Or perhaps if we can create good enough rules that wouldn’t fire improperly very often, we can even log something saying it’s being skipped.
Right, this will make it a bit easier. Assuming this approach, the time ranges will be mapped as follows:
10:00-12:00 => OnCalendar=10:00
10:00~12:00 => OnCalendar=10:37 # minutes picked randomly when generating the timer
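Putting the mapping together, a generated timer unit might be rendered along these lines. This is a sketch, not the output of the actual snapd generator: the function `timer_unit`, the Description text and the [Install] section are assumptions, while Unit=, OnCalendar= and Persistent= are options documented in systemd.timer(5).

```python
def timer_unit(snap: str, app: str, on_calendar: str) -> str:
    """Render a hypothetical snap.<snap>.<app>.timer unit as text."""
    return "\n".join([
        "[Unit]",
        f"Description=Timer for snap application {snap}.{app}",
        "",
        "[Timer]",
        # The service that is started when the timer elapses.
        f"Unit=snap.{snap}.{app}.service",
        f"OnCalendar={on_calendar}",
        # Lets systemd trigger the service once after a missed run.
        "Persistent=true",
        "",
        "[Install]",
        "WantedBy=timers.target",
        "",
    ])
```

For the 10:00-12:00 example, `timer_unit("foo", "bar", "10:00")` would produce a unit with OnCalendar=10:00, leaving the in-range check to snap run --timer.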
I already have the code for generating *.service and *.timer files along with some tests, so it will only need a small update.
I also implemented snap run --timer albeit loading the information about timers from persistent storage. This will be simplified to just parsing the spec passed in command line and checking if we are inside the range (or close enough if the schedule is a single time, eg: 10:00).
Generating OnCalendar is not done yet and I’m using a fixed, every 10 minutes, schedule (*-*-* *:0,10,20,30,40,50:00).
Generator for OnCalendar= entries. One nitpick is that schedules such as mon1-tue2 are not representable, so we’ll have to count on snap run --timer to not run the service if it falls outside of the range.
@mborzecki I think I found a bug in the error handling for the timer feature.
I previously installed a snap with a timer definition, back when there was no timer support in the edge channel. Now that timer support has landed I wanted to test it again, but I can’t uninstall or refresh the installed snap. Sure, I can switch channels or revert to a core snap without timer support so that I can uninstall, but the error handling should be relaxed enough to allow snap removal at a minimum.
I get the following error:
error: cannot perform the following tasks:
Stop snap “xxxxx” services ([stop snap.xxxxx.xxxxx.timer] failed with exit status 5: Failed to stop snap.xxxxx.xxxxx.timer: Unit snap.xxxxxx.xxxxxx.timer not loaded.