Use a separate manifest file, or save everything in snap/snapcraft.yaml

elopio · June 27, 2017, 3:43am

Hello!

We have made good progress on recording details from the snapcraft.yaml and the environment where it is built, which will be stored in the resulting .snap for auditing. This post contains a good part of the journey:

So far, we are adding annotations to the source snapcraft.yaml file and writing it to prime/snap/snapcraft.yaml. Everything in prime ends up in the .snap, so if you unpack the snap you will be able to inspect what happened during the build, and even re-build it.

During our sprint last month we had a little discussion about the name of this file, and some other details. To start, we have to choose if we want to put everything about this build manifest in prime/snap/snapcraft.yaml, or if we want to make it a new file in prime/snap/manifest.yaml.

At the end of the discussion, @Saviq and @sergiusens were leaning toward the manifest.yaml option (please correct me if I’m wrong). But there are pros and cons to each option, so before implementing it we wanted to open the discussion to a wider audience.

If we put all the manifest data in an annotated snapcraft.yaml, then there is no change required in snapcraft to rebuild. You unpack the snap, go to the unpacked directory and run snapcraft in there. The problem is that this annotated file needs to include additional information that we don’t want to have in the source snapcraft.yaml, like the versions of the snapcraft and core snaps installed on the build machine. So, the schema of allowed values and keywords for the source file will be a subset of what’s allowed in the snapcraft.yaml that is stored inside the .snap. Then we have two files with the same name but different validation rules.
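To make the "same name, different validation rules" problem concrete, here is a minimal sketch. The key sets are hypothetical and far smaller than the real snapcraft schema, and `core-version` is an assumed name for the recorded core snap version; the point is only that a file accepted by the recorded schema can be rejected by the source schema.

```python
# Hypothetical top-level keys accepted in a hand-written snapcraft.yaml.
SOURCE_KEYS = {"name", "version", "summary", "description", "parts"}

# Keys that only make sense in the recorded file (names are assumptions).
BUILD_ONLY_KEYS = {"snapcraft-version", "core-version"}

# The recorded file accepts everything the source file does, plus the extras.
RECORDED_KEYS = SOURCE_KEYS | BUILD_ONLY_KEYS


def unknown_keys(data, allowed):
    """Return the top-level keys of data that the given key set rejects."""
    return sorted(set(data) - allowed)


# A made-up recorded file, as it might look after a build.
recorded = {
    "name": "hello",
    "version": "1.0",
    "parts": {},
    "snapcraft-version": "2.31",
}

print(unknown_keys(recorded, RECORDED_KEYS))
print(unknown_keys(recorded, SOURCE_KEYS))
```

Validating the same data against both key sets shows which entries would have to be ignored, or warned about, if the recorded file were fed back to snapcraft as a source file.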

There were some reasons not to include information like snapcraft-version in the source snapcraft.yaml, like keeping it as agnostic as possible to the environment where it’s built. But well, there’s always the option to reevaluate that too. It could be possible to ignore or warn about those values, and just take them into account (if possible) when running a cleanbuild.

But let’s consider the other option. If we put all the manifest info about the build in prime/snap/manifest.yaml, we don’t have to worry that much about the schema for the source yaml and the object yaml; they will just be different files with different schemas. We could put the valid annotated snapcraft.yaml inside the snap, and put all the additional information in the manifest.yaml. Or just copy the source snapcraft.yaml, and put all the annotations and extra data in the manifest.yaml.

The thing with this second option is that now we have two files for the rebuild. So for reproducible builds (or at least similar builds) we can’t just call snapcraft, we would have to modify snapcraft to accept a command like snapcraft from path/to/manifest.yaml.

As always with software, everything is possible. Please leave your comments here, so that we make a good decision in the end.

pura vida.

jamesh · June 27, 2017, 8:45am

The main benefit I could see for including the information in the shipped snapcraft.yaml file is if those annotations were understood by Snapcraft itself, putting it in a mode where it will only install those fixed package versions or check out only those revisions from version control.

That’s probably beyond the scope for what you’re working on right now, but if you do go the snapcraft.yaml route it would be good to keep in mind.

elopio · June 28, 2017, 4:41pm

Hey @jamesh, thanks for replying.

The kind of annotations you mentioned are currently valid in snapcraft.yaml. So for example, you could specify a source-commit, in build-packages you could ask for the installation of hello=1.0.3, or in python-packages the same.
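As a small illustration of the `name=version` form mentioned for build-packages, the sketch below splits such an entry into its parts. `split_pin` is a hypothetical helper written for this example, not part of snapcraft, and it only handles the single `=` form shown above.

```python
def split_pin(entry):
    """Split a 'name=version' entry; version is None when nothing is pinned."""
    name, separator, version = entry.partition("=")
    return name, (version if separator else None)


print(split_pin("hello=1.0.3"))
print(split_pin("hello"))
```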

The kind of annotations that are currently out of the scope of snapcraft.yaml are more related to the environment where it is run: the core version, the snapcraft version, the lxc container image hash.

kyrofa · July 5, 2017, 2:30pm

@niemeyer this was one of the topics we ran out of time to discuss at the sprint. Can you please take a look?

kyrofa · July 6, 2017, 4:37pm

@niemeyer and I just had a chat about this. Here are my notes from the meeting (please correct me if I misinterpret):

Real byte-for-byte reproducible builds seem an incredibly difficult task. Is this really something for which we should be striving? Perhaps a best-effort would be better (e.g. package versions don’t always stay in the archive).
When you use snapcraft in the typical sense (using a snapcraft.yaml), one expects it to error out if something isn’t possible.
If trying to re-build a snap is indeed a best effort, then having a separate command that DOESN’T error out if something isn’t possible (e.g. a stage-package version is no longer in the archive) makes sense
The above combined with the fact that bloating the supported format of snapcraft.yaml seems like a bad idea implies that the best path forward is using the manifest.yaml. @niemeyer liked that file name, as well.
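The difference between the two behaviours in these notes can be sketched as follows. This is only an illustration under stated assumptions: `resolve`, `PackageUnavailable` and the dictionary-based archive are hypothetical names and data structures, not snapcraft code, and the fallback picks the lexicographically largest version string as a stand-in for real version comparison.

```python
import warnings


class PackageUnavailable(Exception):
    """Raised when a pinned package version cannot be found."""


def resolve(pins, archive, best_effort=False):
    """Map each package name to the version that would be installed.

    pins: {package: pinned version}
    archive: {package: set of versions currently available}
    """
    resolved = {}
    for name, wanted in pins.items():
        available = archive.get(name, set())
        if wanted in available:
            resolved[name] = wanted
        elif best_effort and available:
            # Best effort: fall back to an available version and warn.
            # max() on strings is lexicographic, not a real version compare.
            fallback = max(available)
            warnings.warn(f"{name}={wanted} not in archive, using {fallback}")
            resolved[name] = fallback
        else:
            # Typical build: refuse to continue.
            raise PackageUnavailable(f"{name}={wanted}")
    return resolved
```

With `best_effort=False` a missing pinned version stops the build, which matches what one expects from a hand-written snapcraft.yaml; with `best_effort=True` the rebuild continues and reports what it substituted.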

niemeyer · July 7, 2017, 1:01am

When we were discussing this I was thinking of a different file and flag instead of a different command (something like --redo=manifest.yaml or similar).

kalikiana · July 7, 2017, 2:32pm

Do you have examples of where that’s the case? So far the asset tracking is only using existing properties from the schema that are valid in snapcraft.yaml.

kyrofa · July 7, 2017, 3:25pm

@kalikiana, @elopio gave a few examples above: core version, snapcraft version, etc.

niemeyer · July 7, 2017, 4:45pm

@kalikiana Even if valid, there’s a distinction between a hand-built snapcraft.yaml and a manifest reporting what was used during build. One contains orders that were requested to the system, while the other contains a report of what was done with those orders. When redoing a build operation, it may be acceptable to ignore some of the content that cannot be reproduced from the manifest, but it will never be acceptable to ignore what’s in snapcraft.yaml. It’s also interesting to tell what was requested by the developer vs. what the system did.

For that sort of reason, independent files just feel fundamentally better.
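One practical consequence of keeping the two files apart is that the request and the report can be compared directly. A minimal sketch, assuming both files have already been loaded into flat dictionaries; the keys and values are made up for illustration and do not reflect the real manifest.yaml layout.

```python
def build_differences(requested, recorded):
    """Return {key: (requested value, recorded value)} where the two differ.

    A key missing on one side is reported with None on that side.
    """
    differences = {}
    for key in sorted(set(requested) | set(recorded)):
        asked = requested.get(key)
        done = recorded.get(key)
        if asked != done:
            differences[key] = (asked, done)
    return differences


# What the developer asked for (hypothetical content).
requested = {"name": "hello", "build-packages": ["hello"]}

# What the build reported (hypothetical content).
recorded = {
    "name": "hello",
    "build-packages": ["hello=1.0.3"],
    "snapcraft-version": "2.31",
}

print(build_differences(requested, recorded))
```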

elopio · July 14, 2017, 3:27pm

This one renames the recorded file to manifest.yaml:

https://github.com/snapcore/snapcraft/pull/1406

This one makes sure that we also record the original:

https://github.com/snapcore/snapcraft/pull/1407

Ads20000 · September 26, 2017, 9:17pm

What did those commits do? Is manifest.yaml still WIP or should this issue be closed?

elopio · September 27, 2017, 4:06pm

No, that card is for the inverse task. Now that we have recorded information in manifest.yaml, we need to use that as a source to rebuild a snap.