Ubuntu Core 20 boot delayed by missing Wi-Fi USB dongle despite "optional: true" netplan config

I have a fleet of Ubuntu Core 20 devices with the following netplan config (shortened for clarity):

network:
  ethernets:
    enp1s0:
      dhcp-identifier: mac
      dhcp4: true
  version: 2
  wifis:
    wlx0123456789ab:
      access-points:
        MySSID:
          password: ***
      dhcp-identifier: mac
      dhcp4: true
      optional: true

The Wi-Fi device wlx0123456789ab is a USB dongle. Despite the optional: true setting, Ubuntu Core 20’s boot is delayed by about 1 min 30 s whenever the dongle is unplugged, because the system waits for it to become available. Boots with the dongle attached don’t incur this delay.

Isn’t this delay exactly what optional: true is meant to circumvent?

List of potentially relevant snaps:

Name                           Version                     Rev    Tracking       Publisher          Notes
bare                           1.0                         5      latest/stable  canonical✓         base
core                           16-2.52.1                   11993  latest/stable  canonical✓         core
core18                         20211015                    2246   latest/stable  canonical✓         base
core20                         20210928                    1169   latest/stable  canonical✓         base
pc-kernel                      5.4.0-89.100.1              845    20/stable      canonical✓         kernel
snapd                          2.52.1                      13640  latest/stable  canonical✓         snapd

This bug is potentially related, but it was marked as “fix released” back in 2018: https://bugs.launchpad.net/snappy/+bug/1619258

There may be a difference between waiting for a device, and waiting for a “device to become configured”.

I’m not sure if this will work, but you could try a match rule instead. If no interface matches the name, it should just skip the entry entirely.

~~~8<~~~
  wifis:
    wifi:
      match:
        name: wlx*
      set-name: wifi
      access-points:
        MySSID:
          password: ***
      dhcp-identifier: mac
      dhcp4: true
      optional: true

Cheers, Just

Thanks for the suggestion! However, netplan’s networkd back-end doesn’t support match-based rules for Wi-Fi devices, so I’m bound to the ID-based matching used in my original post. Additionally, optional: true is only supported by networkd, so using that back-end shouldn’t be the problem here.

From the docs (emphasis mine):

An optional device is not required for booting. Normally, networkd will wait some time for device to become configured before proceeding with booting. However, if a device is marked as optional, networkd will not wait for it. This is only supported by networkd, and the default is false.

Hey, I was able to reproduce this timeout in a classic Ubuntu Focal VM. optional: true is a valid flag, but it only affects the wait for the network to come up and be configured. The timeout we’re observing here seems to come from somewhere else: the netplan-wpa-wlx0123456789ab.service unit waits for the sys-subsystem-net-devices-wlx0123456789ab.device unit to become available, and that device does not exist while the dongle is unplugged.
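To check which of the two waits applies on a given machine, something like the following could be used. It is only a sketch: the helper names wpa_unit, device_unit and inspect_wifi_units are made up for illustration, the unit names follow the pattern seen in the error output in this thread, and the .network file path is an assumption about where netplan’s networkd back-end renders its output. Interface names containing characters that systemd escapes (such as "-") would need systemd-escape rather than plain string concatenation.

```shell
#!/bin/sh
# Hypothetical helpers; not part of netplan or systemd.

# Unit that runs wpa_supplicant for a netplan-managed Wi-Fi interface
# (name pattern taken from the error output in this thread).
wpa_unit() {
    printf 'netplan-wpa-%s.service\n' "$1"
}

# Device unit that the WPA service depends on.
# NOTE: plain concatenation; names that need escaping are not handled.
device_unit() {
    printf 'sys-subsystem-net-devices-%s.device\n' "$1"
}

# Show the generated WPA unit, its dependencies, and whether the
# .network file marks the link as not required for online.
inspect_wifi_units() {
    systemctl cat "$(wpa_unit "$1")"
    systemctl show -p Requires -p After "$(wpa_unit "$1")"
    # Assumed path of the file netplan renders for networkd.
    grep -H 'RequiredForOnline' "/run/systemd/network/10-netplan-$1.network"
}
```

Running inspect_wifi_units wlx0123456789ab as root should show whether optional: true reached the .network file while the WPA service still lists the device unit under Requires=.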

Would you mind reporting a bug about this situation at https://bugs.launchpad.net/netplan/+filebug?

According to https://github.com/systemd/systemd/issues/4413 we might need to change the Requires=sys-subsystem-net-devices-wlx0123456789ab.device dependency to something like Requisite=sys-subsystem-net-devices-wlx0123456789ab.device, which makes the service fail immediately if the device unit is not already active instead of waiting for it (I’m not sure about the side effects, though).
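One way to try the Requisite= idea locally is sketched below, under several assumptions: the generated unit is assumed to live at /run/systemd/system/netplan-wpa-wlx0123456789ab.service, a full copy under /etc/systemd/system is relied on to take precedence over it, and the function name requires_to_requisite is hypothetical. This is an experiment to test the hypothesis, not a verified fix, and it has not been tried on Ubuntu Core.

```shell
#!/bin/sh
# Hypothetical filter: rewrite the hard device dependency of a unit file
# read on stdin from Requires= to Requisite=, leaving other lines alone.
requires_to_requisite() {
    sed 's/^Requires=\(sys-subsystem-net-devices-.*\.device\)$/Requisite=\1/'
}

# Example use (as root), commented out so sourcing has no side effects:
#   unit=netplan-wpa-wlx0123456789ab.service
#   requires_to_requisite < "/run/systemd/system/$unit" > "/etc/systemd/system/$unit"
#   systemctl daemon-reload
```

A full copy is used rather than a drop-in because systemd drop-ins can only add dependencies, not remove the existing Requires= line. Removing the copy from /etc/systemd/system and reloading the daemon reverts the experiment.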

Reproducer:

root@ff-vm:~# netplan get
network:
  ethernets:
    enp5s0:
      dhcp4: true
  version: 2
  wifis:
    wlx0123456789ab:
      access-points:
        MySSID:
          password: XXXlksjdf091234
      dhcp-identifier: mac
      dhcp4: true
      optional: true
root@ff-vm:~# time netplan apply
A dependency job for netplan-wpa-wlx0123456789ab.service failed. See 'journalctl -xe' for details.
Traceback (most recent call last):
  File "/usr/sbin/netplan", line 23, in <module>
    netplan.main()
  File "/usr/share/netplan/netplan/cli/core.py", line 50, in main
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 310, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 58, in run
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 310, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 250, in command_apply
    utils.systemctl_networkd('start', sync=True, extra_services=netplan_wpa + netplan_ovs)
  File "/usr/share/netplan/netplan/cli/utils.py", line 177, in systemctl_networkd
    subprocess.check_call(command)
  File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['systemctl', 'start', 'systemd-networkd.service', 'netplan-wpa-wlx0123456789ab.service', 'netplan-ovs-cleanup.service']' returned non-zero exit status 1.

real	1m31.108s
user	0m0.392s
sys	0m0.051s
root@ff-vm:~# journalctl -e
[...]
Nov 18 09:17:18 ff-vm systemd[1]: sys-subsystem-net-devices-wlx0123456789ab.device: Job sys-subsystem-net-devices-wlx0123456789ab.device/start timed out.
Nov 18 09:17:18 ff-vm systemd[1]: Timed out waiting for device /sys/subsystem/net/devices/wlx0123456789ab.
Nov 18 09:17:18 ff-vm systemd[1]: Dependency failed for WPA supplicant for netplan wlx0123456789ab.
Nov 18 09:17:18 ff-vm systemd[1]: netplan-wpa-wlx0123456789ab.service: Job netplan-wpa-wlx0123456789ab.service/start failed with result 'dependency'.
Nov 18 09:17:18 ff-vm systemd[1]: sys-subsystem-net-devices-wlx0123456789ab.device: Job sys-subsystem-net-devices-wlx0123456789ab.device/start failed with result 'timeout'.

Thank you so much for the insight @slyon! I’ve finally gotten around to reporting this on the Netplan Launchpad project here: https://bugs.launchpad.net/netplan/+bug/1906646. It turns out there were two existing reports, so I continued from the oldest one and marked the other as a duplicate.

Thanks again!