Pulling network-online.target as prerequisite target slows down starting services


#1

snapd adds network-online.target as a prerequisite to the systemd units it creates. This in turn pulls in systemd-networkd-wait-online.service, which runs /lib/systemd/systemd-networkd-wait-online.
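To check whether a given unit is affected, something like the following could be used. It is only a rough sketch: the function name and the example unit text are made up for illustration and are not copied from a real snapd-generated file. It lists the [Unit] directives that name a given target.

```python
def directives_naming_target(unit_text, target="network-online.target"):
    """Return the [Unit] directives (e.g. 'Wants', 'After') that name `target`."""
    hits = []
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Track which section we are in.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section != "Unit" or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Directive values are space-separated lists of unit names.
        if target in value.split():
            hits.append(key.strip())
    return hits


# Hypothetical unit text, for illustration only.
EXAMPLE_UNIT = """\
[Unit]
Description=Service for snap application example.daemon
Wants=network-online.target
After=snapd.service network-online.target

[Service]
ExecStart=/usr/bin/snap run example.daemon
"""

print(directives_naming_target(EXAMPLE_UNIT))
```

The same function can be pointed at the text of an installed unit file to see which directives would need to change.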

This command waits for the interfaces managed by networkd to be up, with a timeout of 2 minutes. If any of those interfaces is not brought up, the start of the service is delayed by 2 minutes.
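The effect on start-up time can be illustrated with a simplified model. This is not how systemd-networkd-wait-online is actually implemented; `wait_online`, its parameters, and the polling approach are assumptions made only to show why a single link that never comes up costs the full timeout.

```python
import time


def wait_online(link_is_up, links, timeout=120.0, poll=1.0,
                clock=time.monotonic, sleep=time.sleep):
    """Block until every link reports up, or until `timeout` seconds pass.

    Returns (all_up, seconds_waited). `clock` and `sleep` are injectable
    so the model can be exercised without really waiting.
    """
    start = clock()
    while True:
        pending = [link for link in links if not link_is_up(link)]
        waited = clock() - start
        if not pending:
            return True, waited
        if waited >= timeout:
            return False, waited
        # Never sleep past the deadline.
        sleep(min(poll, timeout - waited))
```

With eth0 up and eth1 unplugged, the model only returns once the whole 120 seconds have elapsed, and anything ordered after it waits just as long.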

Quoting https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ :

It is strongly recommended not to pull in this target too liberally: for example network server software should generally not pull this in (since server software generally is happy to accept local connections even before any routable network interface is up), it’s primary purpose is network client software that cannot operate without network.

It looks like this target should not usually be needed. Note also that it does not provide many guarantees: for instance, on systems where the network-manager snap is installed, no interface will be configured when the wait finishes anyway, because networkd does not manage them.

I suggest either removing this dependency or replacing it with one that has softer requirements, like network.target.


#2

This can also get in the way when there is a default eth0 connection (as is sadly hardcoded in core images to make them work in cloud environments) but you boot with the cable disconnected.


#3

Thanks for looking into this! If we removed the prerequisite on network-online.target, would existing snaps break?


#4

@mvo, some snaps may break if they try to access the network right when the service starts and do not retry. But imho those snaps are already broken, as they would not work properly if, say, you start the service with the ethernet cable disconnected. So even today they sometimes do not work as expected.
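For reference, the kind of retry I mean could look roughly like this. It is a minimal sketch, not taken from any particular snap: the function name, the number of attempts, and the delays are assumptions.

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, base_delay=1.0,
                       connect=socket.create_connection, sleep=time.sleep):
    """Try to open a TCP connection, backing off between failed attempts.

    `connect` and `sleep` are injectable so the logic can be tested
    without a real network. Re-raises the last OSError if every attempt fails.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect((host, port), timeout=5)
        except OSError:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2  # exponential backoff: 1s, 2s, 4s, ...
```

A service that connects this way no longer depends on the network being up at the exact moment it starts.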

Also, note that if the network-manager snap is installed, a service will not get connectivity even after waiting for network-online.target (unless it has been lucky and NetworkManager has already started). As we have not seen this sort of problem, my guess is that the number of snaps that could be affected if the dependency is removed is probably small.


#5

@abeato I think this is a good suggestion. However, if we remove this, then on images without NetworkManager there’s no good way for a snap to declare this target dependency anymore. My vote would be to not remove this until it’s possible to declare such a dependency. Another alternative would be to make this configurable via a core snap config setting.

Note - this is related to the fact that snaps cannot declare dependencies on system targets or targets belonging to another snap.


#6

@awe the point here is that this default dependency does not make sense in most, if not all, of the cases. This is due to the fact that it does not really guarantee that you end up with network connectivity, and in any case a service that fails just because there is no connectivity right when it starts is buggy.

Note that with the dependency we can very easily add 2 minutes to the start-up time of a service. Simply starting a device with one of its ethernet ports unplugged would make this happen.

Because of this, I think that the dependency should be removed, regardless of whether snapd supports configuring service dependencies in the future.


#7

@abeato According to the systemd link you shared, NetworkManager can be configured to work with this target; however, it’s doubtful this is supported by the network-manager snap.

Since we can’t guarantee that changing this won’t break existing snaps, my recommendation would be to put this under control of a core snap configuration setting for UC16, and then default the setting to false for UC18.


#8

@awe seems sensible to me. @mvo what do you think about this?


#9

I pushed a global version under https://github.com/snapcore/snapd/pull/5746 - it will unconditionally change the setting. We could try it in edge for a while. The risk seems to be small but making it conditional on the base (or a config setting) is certainly an option.