Snap build transparency and trust

chipaca · October 27, 2019, 3:11pm

@casey are you saying there are packages that have been migrated from deb to snap that have lost this reproducibility?

casey · October 27, 2019, 4:17pm

@chipaca That would depend on two questions:

Is there a standard way to retrieve the manifest used to produce the snap, or is the manifest embedded in the snap?
Does the manifest always contain all the information needed to build the snap? At least as of last year, the manifest sometimes omitted important properties of the build environment, such as which repositories were configured. For example, refer to YAMLs for snaps built using PPAs

chipaca · October 27, 2019, 7:42pm

no, it does not depend on those questions. Please don’t derail this topic any further.

Reynolds5 · October 28, 2019, 1:52pm

We aren’t talking about revealing the source code but about the packaging process. A build manifest doesn’t contain any sensitive data on its own. This is what flatpak does.

Yes, if you replace transparency with a black box which is only opened voluntarily when you gently ask for it, then reproducibility is inherently lost.

chipaca · October 28, 2019, 2:15pm

can you tone down the hyperbole in favour of being clear? What, exactly, are you saying? I don’t get it.

Reynolds5 · October 28, 2019, 3:46pm

I’m saying that snaps aren’t reproducible when they reveal neither what source was used to build them nor how they were built.

ogra · October 28, 2019, 6:06pm

how does access to the snapcraft.yaml help you in any way if the snap pulls sources from git://secret.company.server/myaccount/code/tree/foo and then calls git apply -av awesome-patch-from-other-department.patch in an override-build (the patch wouldn’t be accessible without the source tree at all)?

snaps are delivering binaries … if a developer decides to provide the source and snapcraft.yaml this is great, but there are many many commercial snaps in the store, there are lots of companies building IoT businesses around snaps in brand stores … some of them develop in the open, some don’t … for the general open source developer there is an option to provide his snapcraft.yaml and even use github based auto-builds of his snaps and we should definitely encourage this … but we cannot enforce the developer behaviour onto people building products around snaps.

Reynolds5 · October 28, 2019, 8:46pm

Let me reverse the question - how does it harm the developer? For proprietary software it would be minimally useful. For open source software it would be very useful. Transparency isn’t necessarily at odds with proprietary software, as you implied. The two can coexist peacefully without hiding everything behind an iron curtain.

apt is also delivering binaries. For the Debian/Ubuntu repos, build and packaging are fully transparent and tons of work has been done to make them reproducible. Why should snap be worse than that? Why go back dark ages and throw https://reproducible-builds.org/ out of the window?

This should be opt-out, not opt-in. I believe the vast majority of developers have nothing to hide in their manifests. Let those who do figure out how to hide it.

For your store you can absolutely set whatever rules you like, which developers would have to obey, just like Apple and Google do for their stores. Whether it ends up being transparent or a black box is a matter of choice. Your choice.

Sarke · October 28, 2019, 9:05pm

Commercial and proprietary snaps are not really a concern to me. They will most likely have a verified account, and we will know to trust jetbrains, microsoft, whomever.

The problem I see is when a non-verified 3rd party account submits an open-source package. Take the example I posted above: docfetcher. I know that I can probably trust docfetcher since it’s open source and the code is open for scrutiny by the community. But can I trust Vasili from Russia who added the snap and didn’t include the build script?

The chain of trust between the open-source docfetcher and the snap is broken because we cannot see if the snap was built directly from the docfetcher source, or if there were any extras or changes included.

chipaca · October 28, 2019, 10:50pm

part of what we’re doing is providing the tools and the means of asking and answering that question. Can you trust an unverified developer that has chosen to not include the manifest in their snap? If not, then don’t use their snaps.

have you raised this concern with the author of the snap? It’s probably a trivial change for them to include the manifest.

Sarke · October 28, 2019, 11:29pm

@chipaca I think you are missing the point completely. This isn’t about the docfetcher snap, it is just one of many snaps that can be used as an example.

What I am asking for is for snapcraft to provide a clean indication and transparency that the snap is what it claims to be.

If every user who cares has to ask each developer individually, then you might as well not bother.

But if you want snapcraft to be a place where the snaps can’t be trusted by default, then yeah, don’t provide that transparency.

Other providers like Docker and flatpak are able to do this. Why can’t snapcraft?

chipaca · October 28, 2019, 11:34pm

we could look into exposing whether a snap provides a manifest. Is that what you’re asking for?

Sarke · October 29, 2019, 5:30am

@chipaca see my earlier comment in this thread. Basically it should be easy to see that the snap was built using the source it claims without any shenanigans in the middle.

For me, this means showing the manifest.yaml, snapcraft.yaml, and build logs on the official store page of the snap (separate tab or link).

galgalesh · October 29, 2019, 11:46am

I agree with the idea to make it easier to find how a snap was built.

If known, the store should link to the source repository from which the snap was built. At the very least, all snaps built by build.snapcraft.io should have a link to the source repository. This will make it a lot easier for the community to help maintain snaps and fix snaps. I’ve encountered the issue quite often that I want to fix an issue with a snap, but I’m having a hard time finding the source of the snap.
build.snapcraft.io should automatically add the snapcraft.yaml file to the snap. This shouldn’t pose any security issues since, afaik, it only supports building from public repositories.

However, this does not improve trust in any way. Thinking that it does is dangerous because it gives you a false sense of security. A malicious snap publisher can very easily create a malicious snap and include a different snapcraft.yaml file.

You always have to trust the publisher of the snap. If you don’t, you should not install the snap.

Reynolds5 · October 29, 2019, 2:02pm

If a snap is built on a trusted server like build.snapcraft.io from a snapcraft.yaml which is automatically attached, then how could the developer switch it to something malicious?

I ask again - where can this trust come from if not from transparency? The only thing you see in the store is the publisher name, which can be anything, and an optional “verified” badge, whatever it means. Except for household names like “Canonical” or “KDE”, this means next to nothing for most users.

I see the popular answer here is “don’t use snaps”. Indeed this is what most people do - they don’t use snaps. I’m surprised that snap developers prefer this to admitting that lack of transparency is a problem to fix, while many platforms similar to snap have already fixed it.

galgalesh · October 29, 2019, 6:20pm

Why forcing developers to include `snapcraft.yaml` does not create trust

You have to think like an attacker here. Let’s assume this fictional system is in place so that the snapcraft.yaml file itself can’t be spoofed. How would you try to break this system? Below I explain one very simple “attack” that could happen.

First create a snap with non-malicious builds. Then, after a while, you change the snapcraft.yaml so that it inserts your malicious code. Since snaps update automatically, users will get the malicious code very quickly. This kind of “attack” has happened a couple of times in npm and there is no technical solution that can prevent this from happening.

Note that this kind of attack is not specific to snaps. You have the exact same issue with deb packages. You need to trust the publisher of the deb packages not to do this. You also need to trust the publisher of the deb packages that the provided sources are the exact sources used to build the software. Technically, there is nothing stopping a publisher from uploading different sources.

What many people have told you is “Don’t use the snaps of publisher X if you don’t trust publisher X.” If you only trust Canonical, then you should only use the snaps of Canonical. This is no different from Debian packages. The packages in the main repository are published by Ubuntu. If you don’t trust Ubuntu, you should not use those packages. The packages in PPAs are published by third parties. If you don’t trust a certain third party then you should not use that PPA. Whether that package is a deb or a snap, it makes no difference.

The Snapcraft store actually adds an additional tool for you to figure out who to trust: the “verified” checkmark. This is something that does not exist in deb packages. I could create a PPA published by “Netbeans Official Packages” and users have no way of knowing if that account is actually owned by Netbeans or not. However, in snapcraft, if the account has a verified checkmark, you know that the account is owned by who it claims to be. The KDE account is owned by the KDE project, and not some random stranger on the internet.

Why this is not fixed on other platforms

You keep saying things like this without providing any evidence. This is not fixed by the Debian reproducible builds initiative, for example. I think you misunderstand what that initiative does. The project’s goal is to make builds of Debian packages deterministic so that every build of the same sources creates the exact same package. This way users can verify if a package is what it claims to be. The trust comes from users rebuilding the package locally and then verifying if that is the package they downloaded from Debian. Transparency does not create trust in this case. Reproducibility creates the trust because users can verify the package.

Also note that this Debian project is entirely voluntary; many packages in Debian are reproducible because volunteers bugged the developers and packagers. To quote the project:

Reproducible builds of Debian as a whole is still not a reality, though individual reproducible builds of packages are possible and being done.

How reproducible are snaps?

Let’s follow the guidelines of reproducible-builds.org.

How?

First, the build system needs to be made entirely deterministic: transforming a given source must always create the same result. For example, the current date and time must not be recorded and output always has to be written in the same order.

Second, the set of tools used to perform the build and more generally the build environment should either be recorded or pre-defined.

Third, users should be given a way to recreate a close enough build environment, perform the build process, and validate that the output matches the original build.

Let’s break this down and look at the issues

First, the build system needs to be made entirely deterministic: transforming a given source must always create the same result. For example, the current date and time must not be recorded and output always has to be written in the same order.

This is something that Snapcraft can improve on; the build time is recorded in a bunch of different places. However, apart from metadata, a given source should always give the same result because the build is done in a clean environment. Note that if the compiler includes timestamps in binaries, they are still not reproducible, but that is not a Snapcraft issue. Debian has the exact same issue.
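To make the "no recorded build time, fixed output order" requirement concrete, here is a minimal Python sketch. It is not how Snapcraft packs snaps (snaps are squashfs images, and this uses a plain tar archive purely for illustration); the function names are hypothetical. The only outside convention it relies on is the `SOURCE_DATE_EPOCH` environment variable documented by reproducible-builds.org.

```python
import io
import os
import tarfile


def source_date_epoch(default=0):
    """Return the pinned build timestamp (SOURCE_DATE_EPOCH), or a default."""
    return int(os.environ.get("SOURCE_DATE_EPOCH", default))


def deterministic_tar(files, mtime=0):
    """Pack {name: bytes} into an uncompressed tar with normalised metadata.

    Illustrative only: entries are written in sorted order and every
    timestamp/owner field is fixed, so equal inputs give equal bytes.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(files):       # fixed output order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime           # pinned timestamp instead of "now"
            info.uid = info.gid = 0      # do not leak the builder's identity
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Calling `deterministic_tar(files, mtime=source_date_epoch())` twice on the same content should give byte-identical archives regardless of when or in which order the files were added, which is the property a verifier needs before comparing checksums.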

Second, the set of tools used to perform the build and more generally the build environment should either be recorded or pre-defined.

This is something snapcraft does extremely well imo. manifest.yaml and snapcraft.yaml contain all the information about the build. This is, just like with Debian packages, completely optional for a publisher. However, I just checked this and it seems that build.snapcraft.io already automatically includes the manifest.

Third, users should be given a way to recreate a close enough build environment, perform the build process, and validate that the output matches the original build.

The first step is very easy with snapcraft. You download the source repository and run snapcraft. This is a lot easier than trying to rebuild a reproducible Debian package. This is easy thanks to the hard work of the Snapcraft developers. This improved a lot in the past few months and years. Some examples of the great work the Snapcraft team did:

Switch to multipass builds by default so that the build host is exactly the same, no matter which distro or version you use.
Remove support for remote parts, so that snapcraft.yaml contains the entire build declaration.
Switch to extensions, so common components are declaratively defined and are locked to the snapcraft version.

All this work just to make sure that running snapcraft will always build the snap. Of course, snaps that don’t use bases still use the old harder-to-reproduce build process, but that method is deprecated and Snapcraft does a lot to push publishers towards using bases.

Third, users should be given a way to recreate a close enough build environment, perform the build process, and validate that the output matches the original build.

Now this last part is a bit harder to do with snapcraft. In order to validate a snap package, you’d need to unsquash the package and diff every single file. This is difficult, but this is still possible.
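As a rough illustration of the "unsquash and diff every file" step, the Python sketch below compares two directory trees by content hash. It assumes both the published snap and your local rebuild have already been extracted to directories (for example with `unsquashfs` from squashfs-tools); the function names and result keys are hypothetical, and it ignores permissions, ownership and symlinked directories.

```python
import hashlib
import os


def hash_tree(root):
    """Map each file's path (relative to root) to a digest of its content."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                # Compare symlinks by target rather than following them.
                digests[rel] = "symlink:" + os.readlink(path)
                continue
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            digests[rel] = h.hexdigest()
    return digests


def diff_trees(published_root, rebuilt_root):
    """Report files that are missing, extra, or different between two trees."""
    a, b = hash_tree(published_root), hash_tree(rebuilt_root)
    return {
        "only_in_published": sorted(a.keys() - b.keys()),
        "only_in_rebuilt": sorted(b.keys() - a.keys()),
        "changed": sorted(p for p in a.keys() & b.keys() if a[p] != b[p]),
    }
```

An empty result in all three lists would mean the file contents match. Differences are not proof of tampering, since embedded timestamps alone can cause them, but an unexpected extra binary would stand out.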

Conclusion

Please keep in mind that this is a technical forum to discuss these things with the developers. Using hyperbole or unspecific claims like “others have already fixed this” does not help the development of Snapcraft.

Saying things like

Why go back dark ages and throw https://reproducible-builds.org/ out of the window?

Might get you upvotes on reddit but it’s actually not helpful for a discussion on a development forum.

As an example of why it’s not helpful: it doesn’t actually tell us what issues you want fixed. Do you mean that you want Snapcraft to stop including the build time in the metadata? Do you want snapcraft to automatically use specific compiler flags so that binaries don’t include compile time metadata? Or do you want more developers to include the build manifests in the snap? As I explained, Snapcraft is already doing a lot more than the debian packages and build tools, so in what way do you think this issue regressed?

ogra · October 29, 2019, 6:51pm

Just a little side note… using snapcraft is not actually a requirement at all … you can create the snap structure completely by hand and call snap pack (which is essentially just mksquashfs)…

Sarke · October 30, 2019, 3:23am

We realize that eliminating malicious intent is not completely possible, but with enough transparency it will greatly limit the bait-and-switch attack you mentioned, and the time until it is discovered. Just because it’s not a completely perfect solution doesn’t mean it’s not worth implementing.

jamesh · October 30, 2019, 9:28am

I think his point is that if a snap contains a snap/manifest.yaml file, you could attempt to follow those instructions to produce your own binaries and compare them with what was in the snap. There are a few roadblocks though:

I don’t think you can get Snapcraft to directly build from a manifest and use the locked versions of deb or snap packages used in the build. For snaps in particular, you might not even have access to the locked revision if new versions have been published to the channel.
The manifest is not (yet) complete. It doesn’t currently record information about the repository containing the snapcraft.yaml file, let alone the revision. This is a problem if you have parts whose source points within this repo (e.g. for extra scripts or build tools).
The build instructions won’t necessarily result in a reproducible build, so the binary isn’t identical.

With those caveats, it might still be enough to determine that the binaries definitely don’t correspond to the provided manifest in egregious cases.

As far as trust goes, I think a better option is to instead put trust in the sandbox. This also involves examining the interfaces the snap wants to plug: do they make sense for what the application claims to do? Do any of those interfaces offer broader access than the application should really need?
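One way to picture that interface review is the Python sketch below. It assumes the snap's `meta/snap.yaml` has already been parsed into a dictionary with the usual `apps` and `plugs` keys; the function names are hypothetical, and which interfaces count as "broad" is left to the reviewer rather than taken from any official list.

```python
def requested_interfaces(snap_meta):
    """Collect the interface names a snap's apps and top-level plugs ask for."""
    declared = snap_meta.get("plugs") or {}
    plug_names = set(declared)
    for app in (snap_meta.get("apps") or {}).values():
        plug_names.update(app.get("plugs") or [])

    interfaces = set()
    for name in plug_names:
        attrs = declared.get(name)
        if isinstance(attrs, dict):
            # A named plug may point at an interface via an "interface" key.
            interfaces.add(attrs.get("interface", name))
        elif isinstance(attrs, str):
            interfaces.add(attrs)
        else:
            interfaces.add(name)
    return interfaces


def review_plugs(snap_meta, expected, broad):
    """Flag interfaces the reviewer did not expect or considers too broad."""
    requested = requested_interfaces(snap_meta)
    return {
        "unexpected": sorted(requested - set(expected)),
        "broad": sorted(requested & set(broad)),
    }
```

For a mail client, a reviewer might pass `expected={"network"}` and treat something like `home` as broad; anything reported under either key is a prompt to ask why the application needs it, not a verdict.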

Reynolds5 · October 30, 2019, 6:05pm

Yes, this can be done; however, with transparency it can be detected without reverse-engineering snap binaries.

If I trust only Canonical then I can install only a few snaps, compared to thousands of packages from the Ubuntu repos. This is a huge roadblock for snap adoption and the reason that making snaps more trustworthy is important.

Do you know how exactly that checkmark is granted?

Sorry, but I provided evidence from Flathub, where each flatpak is built from a manifest hosted in a GitHub repo and available to everyone. @Sarke provided similar evidence from Docker. You can’t simply cherry-pick one comment and ignore everything that was said before, and I won’t copy-paste the same examples into every comment.

The point is you can’t even check reproducibility if neither the source nor the build/packaging manifest is available. Availability of those doesn’t make something reproducible, but it opens the possibility of making it reproducible in the future, the same way as was gradually done for Debian and others.

See above. You got two specific examples from two people and it seems you didn’t even notice. On the other hand, hyperbole seems to get attention.

I want the same thing as @sarke, who opened this discussion - build manifests available for each snap, ideally linked from the store, which would allow inspecting them before installing a snap.

I disagree that it does a lot more. Build manifests are only available in certain circumstances, and without them it’s impossible to tell what the source of a particular snap is and how it was built.

A sandbox is useful in some circumstances and useless in others. A malicious email app can steal very important data without touching the sandbox boundary. Also, you already had coin miner malware in the snap store, against which the sandbox was ineffective but build transparency could have helped.
