Networkd fails to set the IP address between leases if the IP address changes

Hi there,
we are running Ubuntu Core on almost eleven thousand devices, with our own image based on the armhf version of core.
We use the default network stack based on networkd + netplan.
We have hit several cases in production where networkd fails to set up the IP address for our devices.
We reproduced the same problem with the official Canonical image for the Raspberry Pi.
This suggests that it is a core snap problem involving networkd.

Below are the steps to reproduce the issue.

  1. Set up a DHCP server with a lease time of a few minutes (e.g. 10 minutes).
  2. Boot the board and wait for it to get an IP address from the DHCP server.
  3. Before the lease expires, set a reservation for a different IP address.

Shortly before the lease expires (how long before depends on the lease duration; for a 10-minute lease it is about 2 minutes), networkd configures the new address in addition to the previous one.
When the lease expires, both IP addresses (the previous one and the new one) disappear from the network interface in question.
Shortly before the second lease expires (again about 2 minutes before, for a 10-minute lease), networkd configures only the new IP address on the network interface, and pinging an outside host works properly.

During the test, the DHCP server records the leases and their durations correctly.

We checked the network interface settings directly from the console with the ip tool, continuously watching the IP address and valid_lft values for the network interface in question.
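For anyone who wants to repeat the check described above, here is a minimal Python sketch of that kind of polling. It assumes the `ip` tool from iproute2 is on the PATH and that the interface is called `eth0` (as in the syslog in this thread); the function names `parse_addresses` and `watch` are only illustrative and are not part of any existing tool.

```python
import re
import subprocess
import time

# Matches e.g. "    inet 192.168.5.124/24 brd ..." in `ip addr` output.
INET_RE = re.compile(r"^\s*inet\s+(\S+)")
# Matches e.g. "       valid_lft 598sec ..." or "valid_lft forever".
LFT_RE = re.compile(r"^\s*valid_lft\s+(\S+)")


def parse_addresses(ip_output):
    """Return a list of (address, valid_lft) pairs from `ip -4 addr show` text."""
    result = []
    current = None
    for line in ip_output.splitlines():
        m = INET_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = LFT_RE.match(line)
        if m and current is not None:
            result.append((current, m.group(1)))
            current = None
    return result


def watch(iface="eth0", interval=1.0):
    """Print a timestamped address list every `interval` seconds until Ctrl-C."""
    while True:
        out = subprocess.run(
            ["ip", "-4", "addr", "show", "dev", iface],
            capture_output=True, text=True, check=False,
        ).stdout
        print(time.strftime("%H:%M:%S"), parse_addresses(out) or "no IPv4 address")
        time.sleep(interval)
```

Calling `watch("eth0")` on the board while the lease changes should make the three phases visible: two addresses, no address, then only the new address.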

Please note that if the IP address settings are the same between leases, the problem does not show up.

Please note that if the IP address settings differ between leases, the problem does show up, and it is very bad:

Typically the lease time on consumer routers is a few days, so a board whose IP address changes between leases loses network connectivity for days unless someone intervenes directly.

We searched the log files for the problem, but without success.

Because the Raspberry Pi image is affected by this issue, it is very likely that other images share it.

Once you have confirmed the issue on your side, could you please fix the problem or escalate it to the upstream project?

We are available for further testing.

Cheers,
Nicolino


Can you open a bug via:

https://launchpad.net/ubuntu/+source/systemd

The maintainer of systemd-networkd is not a regular on this forum, so Launchpad is the better place, since he will surely want logs, configs and such :wink:

@ogra I filed a bug report.

@NCuralli Thanks a lot for your problem report! I am setting up my pi2 now to reproduce and will look into it further.

Below is the syslog, annotated with the IP address state:

Oct 4 09:48:06 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 09:49:36 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 09:49:25 localhost systemd-timesyncd[996]: Network configuration changed, trying to establish connection.
Oct 4 09:49:26 localhost systemd-timesyncd[996]: Synchronized to time server 91.189.89.199:123 (ntp.ubuntu.com).
Oct 4 09:49:26 localhost systemd[1]: Starting Update resolvconf for networkd DNS…
Oct 4 09:49:26 localhost systemd[1]: Started Update resolvconf for networkd DNS.
Oct 4 09:49:37 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 09:51:07 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 09:51:09 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 09:52:39 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 09:52:40 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 09:54:10 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 09:54:11 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 09:55:41 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 09:54:23 localhost systemd-timesyncd[996]: Network configuration changed, trying to establish connection.
Oct 4 09:54:23 localhost systemd[1]: Starting Update resolvconf for networkd DNS…
Oct 4 09:54:23 localhost systemd-timesyncd[996]: Synchronized to time server 91.189.89.199:123 (ntp.ubuntu.com).
Oct 4 09:54:23 localhost systemd[1]: Started Update resolvconf for networkd DNS.
Oct 4 09:55:43 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 09:57:13 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 09:57:14 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 09:58:44 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 09:58:46 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 10:00:16 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 10:00:17 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 10:01:47 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 10:01:48 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 10:03:18 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]

Oct 4 10:03:05 localhost systemd-networkd[623]: eth0: DHCPv4 address 192.168.5.124/24 via 192.168.5.1 -----> the state with two IP addresses begins here

Oct 4 10:03:05 localhost systemd-timesyncd[996]: Network configuration changed, trying to establish connection.
Oct 4 10:03:05 localhost systemd[1]: Starting Update resolvconf for networkd DNS…
Oct 4 10:03:05 localhost systemd[1]: Started Update resolvconf for networkd DNS.
Oct 4 10:03:15 localhost systemd-timesyncd[996]: Timed out waiting for reply from 91.189.89.199:123 (ntp.ubuntu.com).
Oct 4 10:03:16 localhost systemd-timesyncd[996]: Synchronized to time server 91.189.89.198:123 (ntp.ubuntu.com).
Oct 4 10:03:20 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 10:04:50 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]

Oct 4 10:04:23 localhost systemd-timesyncd[996]: Network configuration changed, trying to establish connection. ----> all addresses disappear

Oct 4 10:04:23 localhost systemd[1]: Starting Update resolvconf for networkd DNS…
Oct 4 10:04:23 localhost systemd[1]: Started Update resolvconf for networkd DNS.
Oct 4 10:04:51 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 10:06:21 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 10:06:23 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 10:07:53 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 10:07:54 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 10:09:24 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 10:09:24 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 10:10:54 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 10:10:55 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 10:12:25 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ] ------> before this line interface eth0 has no address; after this line the interface gets 192.168.5.124/24 (the second lease's address)

Oct 4 10:11:49 localhost systemd[1]: Starting Update resolvconf for networkd DNS…
Oct 4 10:11:49 localhost systemd[1]: Started Update resolvconf for networkd DNS.
Oct 4 10:12:25 localhost rsyslogd-2007: action ‘action 11’ suspended, next retry is Wed Oct 4 10:13:55 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Oct 4 10:13:18 localhost systemd[1]: Started Session 23 of user domotz.
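
Most of the log above is rsyslogd retry noise, so the network-related events are easy to miss. As a rough sketch (not an existing tool), the Python below filters syslog lines of the shape shown above and keeps everything that is not from `rsyslogd-2007`. The names `LINE_RE`, `NOISE_TAGS` and `network_events` are hypothetical, and the regular expression only assumes the `Mon D HH:MM:SS host tag[pid]: message` layout seen in this thread.

```python
import re

# Line shape seen in the log above:
#   "Oct 4 10:03:05 localhost systemd-networkd[623]: eth0: DHCPv4 address ..."
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})"  # timestamp, e.g. "Oct 4 10:03:05"
    r"\s+\S+"                              # hostname, e.g. "localhost"
    r"\s+([^:\[\s]+)(?:\[\d+\])?:"         # tag with optional [pid]
    r"\s*(.*)$"                            # message
)

# Tags treated as noise for this investigation (an assumption, adjust as needed).
NOISE_TAGS = {"rsyslogd-2007"}


def network_events(lines):
    """Yield (timestamp, tag, message) for every parsable line not in NOISE_TAGS."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank lines and anything that is not a syslog line
        timestamp, tag, message = m.groups()
        if tag in NOISE_TAGS:
            continue
        yield timestamp, tag, message
```

Feeding it the lines of `/var/log/syslog` (for example `network_events(open("/var/log/syslog"))`) leaves only the networkd, timesyncd and systemd entries, which makes the gaps between the address changes easier to read.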

This should be fixed with the core snap in the edge channel now.

@mvo I tested your advice (https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1721223) about setting up secondary promotion.
It fixes the problem of the disappearing IP address, but I hit a problem with the routing table: the FIB cleaning doesn’t work on our kernel.
We are investigating this issue, but it doesn’t affect the raspberrypi kernel.
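
In case it helps others check their boards: a minimal Python sketch for reading the current setting, assuming "secondary promotion" here means the kernel's `net.ipv4.conf.<iface>.promote_secondaries` sysctl (that mapping is my assumption, the bug report has the details). The helper names are hypothetical, and the `root` parameter exists only so the function can be pointed at a test directory.

```python
from pathlib import Path


def promote_secondaries(iface="all", root="/proc/sys"):
    """Return True/False for the sysctl of `iface`, or None if it cannot be read."""
    path = Path(root) / "net/ipv4/conf" / iface / "promote_secondaries"
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None


def promotion_enabled(iface="eth0", root="/proc/sys"):
    """True if promotion is on for `iface`.

    Assumption: the kernel treats the feature as enabled when either the
    "all" entry or the per-interface entry is set to 1.
    """
    return bool(promote_secondaries("all", root) or promote_secondaries(iface, root))
```

On a board, `promotion_enabled("eth0")` returning False before the workaround and True after it would confirm that the setting was actually applied.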
