Snap seeding is slow / racy

After having fixed snaps on the current eoan (19.10) dailies, I noticed that the first launch of snaps was often in the wrong theme:

the clue as to what’s happening is actually in the snap list output I pasted there. gtk-common-themes is not installed. It is seeded, though. snapd just hadn’t gotten around to initialising it yet. gtk-common-themes is last in the seed.yaml (the list we get from germinate to construct the seeds is always in alphabetical order) so I suspect snapd is seeding things in the order they appear in seed.yaml - is that right?

I had a look in the journal, and found that snapd.seeded.service took 01:17m to start in this VM, which seems very long. So two questions / straw men

  • Is it known that the seeding process is slow on live ISOs specifically and are there any plans to improve that situation?
  • Would it be maybe possible somehow to make seeding ‘expose’ the snaps to the system all at once, once seeding has finished, so that you don’t see a partially seeded system as a user of it?
    • Or alternatively for snapd to work out an ordering here: the snaps do have a default-provider: gtk-common-themes so it feels like it should be possible to have found that in the seed.yaml and initialised it first.
1 Like

One thing to keep in mind here is that this ordering problem could have been a lot worse.

While the downside of making gnome-calculator available before gtk-common-themes is that it launches with the Adwaita theme, if it was made available before the gnome-3-28-1804 snap it wouldn’t have launched at all. An error message would be written to the log somewhere, and the user would be left wondering why the application is broken.

1 Like

the seed file is processed in order top to bottom, doing an alphabetical sort for it is the worst thing you can do. In ubuntu core we learned that a properly ordered seed (via required-snaps in the model file) can gain you massive boot speed improvements (here the seed file drectly influences the first boot since eveything is a snap) …

i’d suggest to seed with a properly ordered list instead .

snapd could have avoided this erroneous situation, so I’m basically asking if this kind of improvement could be considered. If the solution is rejected, then fine - we get to go and try to address it externally somehow. But that would be less robust since then it would be possible by a mistake or a bug to reintroduce this kind of error.

it would be interesting to find out why it took that long to seed (no, it shouldn’t take that long unless the hardware is really puny).

Edit1:How do I set SNAPD_DEBUG=1 before snapd is started, in the live image?

Edit2: break=bottom, and then echo SNAPD_DEBUG=1 >> /root/etc/environment.

  • Hit F6 at the boot menu (where you can select Try / Install)
  • Add break=bottom to the command line before the --- and press enter to boot
  • When you get into an initramfs shell, run echo 'SNAPD_DEBUG=1' >> root/etc/environment
  • Hit Ctrl-D to continue the boot

Then it should be in there for the live session. You can do the same trick to modify other files, e.g. adding systemd drop-ins.

1 Like

unfortunately the thing I just downloaded (from http://cdimage.ubuntu.com/daily-live/current/) still has a broken seed so I can’t see this. Where is the fixed one?

s/current/pending/, i.e. here.

1 Like

So, here, snapd.seeded.service took 1m7s. Looking at snap debug timings --last=seed and snap debug timings --last=become-operational, seeding took ~30s and getting a serial took about the same (these things run in parallel though, AFAIK).

What do the output of those two look like for you?

become-operational shouldn’t take that log, but here it is:

I think the erroneous situation here is that snapd seeded a snap without also setting up the default-provider snap from one of the snap’s content interface plugs. When installing a snap with snap install, snapd will first install the default-provider (and connect plugs/slots) before completing the installation of the original snap, so I think this same behavior should happen on seeding. In this case, snapd would still obey the same ordering of seeding snaps with the exception of if a snap declares a default-provider, then the seeding of that snap will seed the default-provider before finishing. This would still allow folks to manually craft the order of their seed.yaml as necessary, but unblock race conditions like this.
Note I think the behavior above with default-providers is already implemented for bases - if an app snap in the seed has a base and the base is ordered after the app snap in the seed, the base will get setup when the app snap is seeded.

1 Like

Yes indeed, that is exactly what I was talking about - thanks for explaining it more clearly.

become-operational for me seems nowhere near as slow:

ubuntu@ubuntu:~$ snap debug timings --last=become-operational
2019/07/16 15:49:38.751216 cmd_linux.go:224: DEBUG: restarting into "/snap/core/current/usr/bin/snap"
ID   Status        Doing      Undoing  Summary
98   Done          350ms            -  Generate device key
 ^                 343ms            -    generating device key pair
99   Done          285ms            -  Request device serial
 ^                 239ms            -    get device serial
  ^                152ms            -      prepare device serial request
  ^                 87ms            -      submit device serial request
 ^                  30ms            -    fetch signing key chain

I’ve uploaded snap debug timings --last=seed separately since it is huge; the only thing that sticks out to me as being singularly large is

8    Done        35875ms            -  Make snap "core" (7270) available to the system

otherwise a lot of the connections take ages to be made, which probably adds up. Also this confirms that gtk-common-themes was set up after the applications were.

And I’ve uploaded the journal too. You can see

Jul 16 15:46:45 ubuntu systemd[1]: Starting Wait until snapd is fully seeded...

and

Jul 16 15:48:15 ubuntu systemd[1]: Started Wait until snapd is fully seeded.

to indicate how long it took.

Because of the plug/slot auto-connect symmetry seeding doesn’t sort default providers first atm.

Is there any reason why it couldn’t or shouldn’t be changed to do so?

There is a lot of opportunity here for improvement since snapd performs a snap-seccomp compile and apparmor_parser compile for each interface connection as snapd performs them, rather than batching them all in one compile each at the end. I believe addressing this is on the snapd team’s roadmap this cycle.

I can appreciate the POV that current behavior makes first boot too different from a sequence of installs. So we might change this, there are other corner cases fixes to do in first boot first though.

snapd needs to move its startup housekeeping out of the boot process altogether. See this, this, and this, each of which is a catastrophic failure attributable to this architecture.