the clue as to what’s happening is actually in the snap list output I pasted there. gtk-common-themes is not installed. It is seeded, though. snapd just hadn’t gotten around to initialising it yet. gtk-common-themes is last in the seed.yaml (the list we get from germinate to construct the seeds is always in alphabetical order) so I suspect snapd is seeding things in the order they appear in seed.yaml - is that right?
I had a look in the journal, and found that snapd.seeded.service took 01:17m to start in this VM, which seems very long. So two questions / straw men
Is it known that the seeding process is slow on live ISOs specifically and are there any plans to improve that situation?
Would it be maybe possible somehow to make seeding ‘expose’ the snaps to the system all at once, once seeding has finished, so that you don’t see a partially seeded system as a user of it?
Or alternatively for snapd to work out an ordering here: the snaps do have a default-provider: gtk-common-themes so it feels like it should be possible to have found that in the seed.yaml and initialised it first.
One thing to keep in mind here is that this ordering problem could have been a lot worse.
While the downside of making gnome-calculator available before gtk-common-themes is that it launches with the Adwaita theme, if it was made available before the gnome-3-28-1804 snap it wouldn’t have launched at all. An error message would be written to the log somewhere, and the user would be left wondering why the application is broken.
the seed file is processed in order top to bottom, doing an alphabetical sort for it is the worst thing you can do. In ubuntu core we learned that a properly ordered seed (via required-snaps in the model file) can gain you massive boot speed improvements (here the seed file drectly influences the first boot since eveything is a snap) …
i’d suggest to seed with a properly ordered list instead .
snapd could have avoided this erroneous situation, so I’m basically asking if this kind of improvement could be considered. If the solution is rejected, then fine - we get to go and try to address it externally somehow. But that would be less robust since then it would be possible by a mistake or a bug to reintroduce this kind of error.
So, here, snapd.seeded.service took 1m7s. Looking at snap debug timings --last=seed and snap debug timings --last=become-operational, seeding took ~30s and getting a serial took about the same (these things run in parallel though, AFAIK).
What do the output of those two look like for you?
become-operational shouldn’t take that log, but here it is:
I think the erroneous situation here is that snapd seeded a snap without also setting up the default-provider snap from one of the snap’s content interface plugs. When installing a snap with snap install, snapd will first install the default-provider (and connect plugs/slots) before completing the installation of the original snap, so I think this same behavior should happen on seeding. In this case, snapd would still obey the same ordering of seeding snaps with the exception of if a snap declares a default-provider, then the seeding of that snap will seed the default-provider before finishing. This would still allow folks to manually craft the order of their seed.yaml as necessary, but unblock race conditions like this.
Note I think the behavior above with default-providers is already implemented for bases - if an app snap in the seed has a base and the base is ordered after the app snap in the seed, the base will get setup when the app snap is seeded.
8 Done 35875ms - Make snap "core" (7270) available to the system
otherwise a lot of the connections take ages to be made, which probably adds up. Also this confirms that gtk-common-themes was set up after the applications were.
There is a lot of opportunity here for improvement since snapd performs a snap-seccomp compile and apparmor_parser compile for each interface connection as snapd performs them, rather than batching them all in one compile each at the end. I believe addressing this is on the snapd team’s roadmap this cycle.
I can appreciate the POV that current behavior makes first boot too different from a sequence of installs. So we might change this, there are other corner cases fixes to do in first boot first though.
snapd needs to move its startup housekeeping out of the boot process altogether. See this, this, and this, each of which is a catastrophic failure attributable to this architecture.