Content snap dev tarball and staging packages

Hey,

so I’ve noticed a bit of a problem with how content snap use currently works.

For KDE software we have a content snap which contains all our core frameworks (priming runtime files/libraries only). Out of the stage directory we then create a tarball which acts more or less as an SDK to build against the content snap.

For example an app’s snap file might then look like this:

parts:
    kde-frameworks-5-dev:
        plugin: dump
        prime:
        - "-*"
        source: https://..../kde-frameworks-5-dev_amd64.tar.xz
    application:
        after: [kde-frameworks-5-dev]
        plugin: cmake
        stage-packages: [foobar]
        source: ...

This is mostly alright. BUT. If the application needs to stage additional packages, this can easily end in a mess when the files in the dev tarball differ from the files in the staged packages.

For example consider the following chain of events:

  • March 1: content snap + dev tarball get built. dev tarball contains /libfoobar.so
  • March 2: libfoobar package update gets released
  • March 3: the application snap is built against the dev tarball and needs to stage a package which depends on libfoobar, so libfoobar gets pulled in as well. The dev part gets staged and with it /libfoobar.so. When the app part gets staged, snapcraft errors out, since the /libfoobar.so staged by the dev part is different from the /libfoobar.so staged by the application part (being older).

I’ve kinda hacked around the problem by implementing my own variant of stage-packages which can exclude packages. https://github.com/blue-systems/pangea-tooling/blob/master/nci/snap/plugins/x-stage-debs.py
This seems fairly meh though: it doesn’t work all that well for the humans using it, and it is an additional difference one has to think about when using the content snap.
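For context, the core of that workaround is really just “resolve the requested stage packages plus their dependencies, then drop anything the dev tarball is known to provide already”. Here is a minimal standalone sketch of that filtering step; the function and resolver names are made up for illustration (the real plugin linked above does this against apt):

def packages_to_unpack(stage_packages, excluded, resolve_dependencies):
    """Resolve the requested packages and their dependencies (one level
    deep here, for brevity), then drop anything on the exclusion list."""
    wanted = set()
    for pkg in stage_packages:
        wanted.add(pkg)
        wanted.update(resolve_dependencies(pkg))
    return sorted(wanted - set(excluded))

if __name__ == "__main__":
    deps = {"foobar": ["libfoobar"]}
    resolver = lambda pkg: deps.get(pkg, [])
    # libfoobar already ships in the dev tarball, so it is never unpacked
    # again and can no longer conflict with the dev part's copy.
    print(packages_to_unpack(["foobar"], ["libfoobar"], resolver))  # ['foobar']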

Any thoughts on how to deal with this natively?


Hey @sitter, thanks for bringing this up. @sergiusens and I had a chat about this today, and want to run a potential solution by you.

First, a few observations we used to arrive at our solution:

  1. Given an SDK part that contains libfoobar.so, it rarely makes sense to also include libfoobar.so in a part that consumes the SDK, since you want to build (and run) against what’s in the SDK
  2. Given a tarball, there’s no reasonable way to determine which Debian packages (if any) are contained within
  3. Snapcraft knows if you’re building in a situation like this because of the use of after

Here’s what we propose, fleshed out a little on my own so @sergiusens still has the right to disagree :wink::

  • By default, it will continue working as it does today
  • Add support for excluding (negating) a fileset (rather than only supporting filesets that contain excludes)
  • For every part listed in after, Snapcraft will dynamically create a fileset of what that part placed in the staging area, and make it accessible from the YAML. We’ll need to convert hyphens into underscores for this, so in your example, perhaps $SNAPCRAFT_kde_frameworks_5_dev_FILESET (see the sketch after this list)
  • The consuming part then has the option of using this fileset in the stage filter so as to ensure it doesn’t include conflicting files/libs that are already provided by the SDK part.
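To make the generated fileset and the hyphen-to-underscore conversion a bit more concrete, here is a rough standalone sketch of what snapcraft might do internally. None of this is actual snapcraft code; the function names and the per-part stage directory are assumptions for illustration only:

import os

def fileset_variable_name(part_name):
    """Turn a part name such as 'kde-frameworks-5-dev' into the name used
    in the YAML, e.g. 'SNAPCRAFT_kde_frameworks_5_dev_FILESET'."""
    return "SNAPCRAFT_{}_FILESET".format(part_name.replace("-", "_"))

def fileset_for_part(part_stage_dir):
    """Collect every file this part placed in the staging area, relative to
    the stage root, so a consuming part can negate the whole set."""
    entries = []
    for root, _dirs, files in os.walk(part_stage_dir):
        for name in files:
            path = os.path.join(root, name)
            entries.append("/" + os.path.relpath(path, part_stage_dir))
    return entries

if __name__ == "__main__":
    print(fileset_variable_name("kde-frameworks-5-dev"))
    # -> SNAPCRAFT_kde_frameworks_5_dev_FILESET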

With this proposal, your YAML could look something like this and no longer have conflicts, since it just uses the libs from the SDK:

parts:
    kde-frameworks-5-dev:
        plugin: dump
        prime:
        - "-*"
        source: https://..../kde-frameworks-5-dev_amd64.tar.xz
    application:
        after: [kde-frameworks-5-dev]
        plugin: cmake
        stage-packages: [foobar]
        stage:
          - -$SNAPCRAFT_kde_frameworks_5_dev_FILESET
        source: ...

What do you think? Note that this proposal will also affect @kenvandine et al. (even though they may not realize it) so I’ll invoke him here as well.


Seems like a really good approach to me.

What would be even better from a developer point of view is if snapcraft figured this out on its own. For example, if a snapcraft_sdk.stamp exists in the tarball, the part is considered an SDK and its contents are by default pushed into the stage exclusion of all other parts that are ordered after it. The less a developer using an SDK has to be aware of, the better IMO.
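A rough sketch of how that auto-detection could behave, with entirely made-up names (this is not how snapcraft is actually structured, just an illustration of the idea):

import os

SDK_STAMP = "snapcraft_sdk.stamp"

def is_sdk_part(part_source_dir):
    """Treat a part as an SDK if its unpacked source ships the stamp file."""
    return os.path.exists(os.path.join(part_source_dir, SDK_STAMP))

def inject_sdk_exclusions(consumer_stage_fileset, sdk_fileset):
    """Prepend an exclusion for every file the SDK part provides, so parts
    ordered after it never re-stage (and thus never conflict with) them."""
    exclusions = ["-" + path for path in sdk_fileset]
    return exclusions + list(consumer_stage_fileset)

if __name__ == "__main__":
    sdk_files = ["/libfoobar.so", "/usr/include/foobar.h"]
    print(inject_sdk_exclusions(["*"], sdk_files))
    # -> ['-/libfoobar.so', '-/usr/include/foobar.h', '*']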

Mind you, since the proposed feature explicitly communicates what is going on to a casual reader of the snapcraft.yaml it may be better overall.

Interesting idea, and definitely doable, although it makes me slightly uneasy. It makes this kind of invisible, which would make troubleshooting “why is this library I’m staging not the version it should be?!” kind of hard. Also, how would we actually support someone wanting to overwrite the library from the SDK? (Why would someone want to do that? Honestly no idea, so perhaps my argument holds no water, but we see people doing weird things all the time!) They would need to know it’s happening, and somehow opt-out of it, which is another feature we’d need to add.

Any further thoughts?

Right now we don’t support overwriting anyway, so unless you want to consider a hypothetical future feature I am not sure why it matters. (FTR: if one were to need an opt-out, I’d say ‘inclusion always trumps exclusion’, so stage: [-/foo.so, /foo.so] would overwrite /foo.so in the stage on account of being explicitly staged. At that point it would not matter whether we had injected implicit exclusions, since the user explicitly saying “stage this!” would always result in the expected behavior of overwriting whatever else was staged already.)
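A toy version of that ‘inclusion always trumps exclusion’ tie-break, purely to pin down the rule (this is not snapcraft’s actual filter code and ignores the rest of real fileset semantics):

def survives_filter(path, fileset):
    """Decide whether a path stays in the stage when it may appear in the
    fileset both as an inclusion and as an exclusion."""
    explicitly_included = path in fileset
    explicitly_excluded = ("-" + path) in fileset
    if explicitly_included:
        return True  # the user explicitly said "stage this!", so it wins
    return not explicitly_excluded

if __name__ == "__main__":
    print(survives_filter("/foo.so", ["-/foo.so", "/foo.so"]))  # True
    print(survives_filter("/foo.so", ["-/foo.so"]))             # False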

The invisibility definitely is a concern though.

Perhaps a simple line explaining the SDK override in snapcraft’s stdout might be enough to address this?

I mean, you can’t overwrite with a different file anyway. In the end what we are really missing is a message telling the user that they can’t. Right now that is the fatal error of “$file has been staged already and your new $file is different and we don’t allow that”. The same thing could easily be communicated in a non-fatal manner by printing some output when an SDK part is detected and the automatic exclusion injection happens: “Detected SDK part, all its files are being automatically excluded from staging in parts that depend on it, yadda yadda, https://urlwithmoreinfo”. This way snapcraft does the right thing :tm: most of the time, and on the off chance the user is perplexed by it, we inform them why overwriting is neither supported nor makes much sense.

I expect this largely is a question of which is more important from a user experience POV for snapcraft: doing the right thing 99% of the time automatically, or not doing the wrong thing 100% of the time. Or in other words perhaps: does the author of the snap need to be aware of having to do the right thing 100% of the time, or do they need to be “made” aware in case they have a 1% corner case.