Snap service start ordering

Hi everyone,

We use conjure-up to apply an lxd profile [1, 2] right before we deploy Kubernetes.

Everything seems to work fine until we reboot the host. After the reboot, snap services fail to start. If we manually restart each of the failing services, they do come up. In the logs we get a message that I am not sure how to investigate further:

snap[1414]: cannot change profile for the next exec call: No such file or directory

The issue was reported first here.

Here is the snap version:

$ sudo snap --version
snap 2.26.14
snapd 2.26.14
series 16
ubuntu 16.04
kernel 4.4.0-1020-aws

A couple of questions: Has anyone seen this behaviour before? How would you further investigate the “cannot change profile for the next exec call” error?

Thanks

As shown in the linked issue, one fix is to manually restart the daemons. This suggests that adjusting the service startup order can address the issue. In particular, if the generated systemd unit stated that the service should start after lxd.service (if present), the daemons would survive a reboot. A similar approach is to use the systemd option that starts the service at the end of system startup (idle). This last option is rather ugly, but it works.

Excerpt from the IRC discussion around this (needs @niemeyer's input):

<Chipaca>       kjackal_: so I think we can talk about whether allowing snaps to specify arbitrary After= lines is reasonable
<Chipaca>       kjackal_: that's a day of discussion probably, but over a week away because our architect is on holiday
<Chipaca>       I don't think it'd break anything conceptually to allow snaps to say After=bananas
<Chipaca>       it's not a dependency, it doesn't pull in anything, it's just ordering
<ogra_> as long as the service is included in your snap at least
<Chipaca>       although it can break a system :-(
<Chipaca>       gah
<Chipaca>       it can break a system by creating a loop 
<kjackal_>      Chipaca: ogra_: After=?? would also need for Wants=???
<Chipaca>       kjackal_: why? all you want is After=
<Chipaca>       kjackal_: lxd is already being started, you just want to tweak the order 
<Chipaca>       anyway, it's for discussion with our architect, not for irc
<Chipaca>       (and it's not super-obvious that it'll work)
<Chipaca>       Wants= is super obvious to me that it's wrong for snaps 
<kjackal_>      Chipaca: sounds good Chipaca should we talk again towards the end of next week to setup a meeting?

If the snap requires an interface, should the logic just be for that interface to report it is fully functional before enabling/starting the dependents?

We need a way to express the start order of services defined in a snap. The ordering information should map to After=... and Before=... entries in the autogenerated systemd *.service files.

The proposed syntax change in snap YAML is like this:

name: wat
version: 42
apps:
 foo:
   daemon: forking
   start-after:
     - bar
     - zed
 bar:
   daemon: forking
   start-before: [foo]
 baz:
   daemon: forking
   start-after: [foo]
 zed:

We will validate that the services listed in start-after/start-before actually exist. We'll also do some very simple dependency checks, e.g. foo having start-after: [bar] while bar has start-after: [foo] will cause an error. I do not think we should go as far as verifying the whole chain of transitive dependencies.
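To make the two checks described above concrete, here is a minimal Python sketch. It is not snapd's code (snapd is written in Go); the function name `validate_ordering` and the dict layout are assumptions made for illustration. It only checks that referenced apps exist and that no two apps are ordered in both directions, and deliberately does not walk transitive chains.

```python
def validate_ordering(apps):
    """Return a list of error strings for start-after/start-before entries.

    `apps` maps an app name to a dict (or None for an empty entry) that may
    contain 'start-after' and 'start-before' lists. Hypothetical layout.
    """
    errors = []
    edges = set()  # (earlier, later) pairs collected from both keywords

    for name, app in apps.items():
        app = app or {}
        for other in app.get("start-after", []):
            if other not in apps:
                errors.append(f"{name}: start-after refers to unknown app {other!r}")
            else:
                edges.add((other, name))
        for other in app.get("start-before", []):
            if other not in apps:
                errors.append(f"{name}: start-before refers to unknown app {other!r}")
            else:
                edges.add((name, other))

    # Simple check only: self-references and direct two-app cycles.
    for earlier, later in sorted(edges):
        if earlier == later:
            errors.append(f"{earlier}: cannot be ordered relative to itself")
        elif earlier < later and (later, earlier) in edges:
            errors.append(f"{earlier} and {later} are ordered both before and after each other")
    return errors


if __name__ == "__main__":
    example = {
        "foo": {"start-after": ["bar", "zed"]},
        "bar": {"start-before": ["foo"]},
        "baz": {"start-after": ["foo"]},
        "zed": None,
    }
    print(validate_ordering(example))
```

A longer cycle such as foo, bar, baz, foo would pass this check, which matches the stated intent of not verifying the whole chain.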

Given the example above, the generated *.service files will look like this:

# /etc/systemd/system/wat.foo.service
[Unit]
Description=Service for snap application wat.foo
...
After=.... wat.bar.service wat.zed.service
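The mapping from the YAML keywords to unit file lines can be sketched in a few lines of Python. This is only an illustration under assumptions: `ordering_lines` is a hypothetical helper, the unit names follow the `wat.foo.service` form used in the example above, and any other entries the real generator puts in After= (the `....` part) are left out.

```python
def ordering_lines(snap_name, app):
    """Build After=/Before= lines for one app from its ordering keywords.

    Unit naming (<snap>.<app>.service) follows the example above and is an
    assumption of this sketch.
    """
    def units(names):
        return " ".join(f"{snap_name}.{n}.service" for n in names)

    lines = []
    after = app.get("start-after", [])
    before = app.get("start-before", [])
    if after:
        lines.append(f"After={units(after)}")
    if before:
        lines.append(f"Before={units(before)}")
    return lines


if __name__ == "__main__":
    for line in ordering_lines("wat", {"start-after": ["bar", "zed"]}):
        print(line)
```

Note that only ordering lines are produced: nothing here pulls another service in, which is the distinction made in the IRC excerpt between After= and Wants=.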

First part of the work updating YAML and app validation:

@mborzecki I’ve moved the messages into this topic as it covers exactly that same aspect. There are also some related points in this other topic that are relevant here.

About the proposal and the implementation: it sounds better to name these simply after and before, without the start- prefix, so that they imply an order without implying that starting the services is being requested. Otherwise it gets pretty confusing, because you can order one service after two others without that implying it is started together with them, and once we introduce the fields starts-with and runs-with it will get even more confusing.

@cratliff this topic may be of interest to you.

Sounds good to me. I’ll update the PR to reflect this on Monday.

I’ve updated https://github.com/snapcore/snapd/pull/4340. With these changes, the snap YAML will look like this:

name: wat
version: 42
apps:
 foo:
   daemon: forking
   after:
     - bar
     - zed
 bar:
   daemon: forking
   before: [foo]
 baz:
   daemon: forking
   after: [foo]
 zed:
   ...

Pushed the second part of the changes, which updates how *.service files are generated:

https://github.com/snapcore/snapd/pull/4357

Stating the obvious, but this will need snapcraft work as well to accept the syntax there.

Thanks, good to know. This could do a lot to simplify how we are managing our services. There was a good list of other features though, like ‘with’ to be able to say to start with another service or event, but this is a good start. I should have done a better job at pushing that thread along. I’ll have to go review it to refresh my memory. I think the main problem we had was nomenclature over the service features.

FYI, the snapd changes have all landed in master now.

Hi,
Should this also order service startup on install?
It does not seem to work that way.
I have this snap file (https://github.com/syncloud/rocketchat/blob/master/snap/snap.yaml) and am trying to start rocket chat before mongodb, but the startup order seems to be random.

@ribalkin is it possible to get that snap somewhere?

The build server does not publish it because the install test is not passing, but if you want I can publish the snap.
I have a few logs of that test, like syslog:

One thing to note: I am using a slightly modified snapd to test some improvements, like a systemd start timeout, and also a custom store backend.

I will publish the snap soon and post a URL here.

Here is the snap: apps.syncloud.org/apps/rocketchat_180201_amd64.snap

Looks like the snap requires a ‘platform’ snap to be installed as well. Is this something you can upload as well?

@ribalkin can you verify this please:

  • snap version is 2.31, rc1 or rc2 (don’t think there’s any other way of getting 2.31 atm other than --edge or --beta of core snap)
  • systemctl list-dependencies --before snap.rocketchat.server should list snap.rocketchat.nginx
  • systemctl list-dependencies --after snap.rocketchat.server should list snap.rocketchat.mongodb

Note that the after ordering does not imply systemd Requires, but rather it’s mapped to After. Similarly before is mapped to Before. Just for the record, even if a service is started, this does not imply that it’s usable in the sense that it’s actively listening on a port and accepting connections or somesuch.

If that ends up being too confusing I think it’s fair to consider an implied Requires in services that have after.
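On the earlier point that a started service is not necessarily usable yet: ordering alone does not tell a dependent when the other service is accepting connections, so the dependent may still need its own readiness check. Below is a minimal Python sketch of such a check. The helper name `wait_for_port` and the host and port in the usage line are assumptions for illustration (27017 is MongoDB's default port); nothing here is part of snapd.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds or timeout expires.

    Returns True once a connection is accepted, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening and accepting.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # Hypothetical usage: wait for a local MongoDB before starting the server.
    ready = wait_for_port("127.0.0.1", 27017, timeout=5.0)
    print("ready" if ready else "not ready")
```

A successful TCP connect only shows that the port is open; a service-specific probe would be needed to confirm that it is actually serving requests.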