I tried searching on the internet for answers, but could not find clear-cut answers that I could understand.
(I do not know if I have selected the correct forum for these questions.)
I see on one of the Snap store pages a discussion telling developers of snap packages that “someone”, somehow, is collecting a bunch of data from people who install snap packages, and that this data can be reported to the developers in something like a dashboard report. Do all snap packages, by design, collect and report information to some entity (Canonical)? If so, how do I stop that collection from happening?
Are snap developers able to build additional data collection into their snap apps, beyond what Canonical built into the core elements of all snap packages? If so, how do I stop that collection from happening?
How do I learn about the data collection and general privacy practice of each snap package? I looked at a few random packages on the snap store, but I didn’t find anything that told me what data, if any, is collected.
How does the data collection and privacy practice of snap packages compare to flatpak packages?
My concern, as a layman, is that I don’t want applications collecting and sending information about my computer, etc. I am just now realizing that snap packages may be doing this. (I realize that the same problem may also exist with debian packages, but my mind is not put at ease by the thought of, Well, snap packages are no worse in this regard than debian packages.)
There are no privacy concerns here because nothing private is being collected. Snap records that someone, somewhere in a general region, installed a package version on some distribution. That is all. Open any random website and you’ll give away far more information than that.
The little information that is collected helps developers create better snaps. Personally, I would like a bit more to be collected, such as the model of graphics card.
Not all data collection is bad and you have to look at intent behind data collection.
The Snap Store collects statistical information about which snaps are installed, and which distributions they’re installed on. These statistics also count a Snap package’s installations by region, so that a developer may see that they have “50 active installations in the UK” for example, which helps them to prioritise localisations for specific regions. These statistics are all anonymous. You may delve into the snapd source code to find the relevant reporting and determine exactly what is sent.
There is nothing preventing a Snap package’s author from including any reporting that they desire. However, they will be limited in what they can discover about your system and installed packages due to the confinement mechanisms of the Snap system.
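To illustrate what “anonymous” region counts mean in practice, here is a minimal sketch of that kind of aggregation. It is not the Snap Store’s actual code: the function name, the record fields ("snap", "region", "distro") and the sample data are all assumptions made up for illustration. The point is only that the output contains totals, not per-machine rows.

```python
from collections import Counter


def aggregate_installs(records):
    """Count installs per (snap, region) pair.

    `records` is a hypothetical list of dicts; any field other than
    "snap" and "region" is ignored, so nothing that identifies a
    single machine survives into the result.
    """
    counts = Counter()
    for record in records:
        counts[(record["snap"], record["region"])] += 1
    return counts


# Made-up sample data, purely for illustration.
sample = [
    {"snap": "example-app", "region": "UK", "distro": "ubuntu"},
    {"snap": "example-app", "region": "UK", "distro": "fedora"},
    {"snap": "example-app", "region": "DE", "distro": "ubuntu"},
]
totals = aggregate_installs(sample)
print(totals[("example-app", "UK")])
```

A developer dashboard built on something like this could show “2 installs in the UK” without exposing which machines those were.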
I don’t know what information flathub collects, but third-party repositories might collect different data. There is no centralised location for acquiring flatpaks so there is no centralised entity to query as to what data is being collected. You’ll need to contact each flatpak repository owner to see what their individual data collection practices are.
Like Snap packages, flatpaks do not limit the intentions of a package’s author with regard to data collection. Both Snaps and Flatpak provide limits in the form of permissions that you must grant before some information about your system becomes accessible to the package.
The metrics provided to snap developers are based on downloads, and are provided in aggregate form rather than anything that can be tied to individual machines.
When downloading snaps, an automatically generated machine serial number is sent, which is used for things like gradual roll-outs of new releases (e.g. have 10% of your users upgrade to the new version, then wait a while for bug reports before having more users upgrade). A User-Agent header is also sent in the download requests, which is used to provide the aggregate stats about which distros are used by a snap’s users.
There is no facility to have snapd collect additional metrics on behalf of a snap, and strictly confined snaps cannot see the machine serial number. With that said, some snaps may implement their own telemetry/metrics collection (e.g. Firefox collects some simple metrics by default). There’s nothing inherent about snaps that enables that collection though, and the strict confinement sandbox arguably reduces the amount of data an app could harvest.
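To make the gradual roll-out idea concrete, here is a minimal sketch of how a stable per-machine identifier could be used to pick “10% of users”. This is an assumption about the general technique, not a description of how the Snap Store actually does it: the function names, the hashing scheme and the example serial are all hypothetical.

```python
import hashlib


def rollout_bucket(serial: str, snap_name: str, revision: int) -> int:
    """Map a device serial to a stable bucket in the range 0..99.

    Hashing the serial together with the snap name and revision means
    the same machine always lands in the same bucket for a given
    release, but in a different bucket for other releases.
    """
    key = f"{snap_name}:{revision}:{serial}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_in_rollout(serial: str, snap_name: str, revision: int, percent: int) -> bool:
    """Return True if this device falls inside the first `percent` buckets."""
    return rollout_bucket(serial, snap_name, revision) < percent


# Hypothetical serial; a real one is generated automatically by the system.
serial = "example-serial-0001"
print(is_in_rollout(serial, "example-app", 42, 10))
```

Because the bucket depends only on the identifier and the release, raising the percentage from 10 to 50 keeps the original 10% of machines in the roll-out and adds more, which is why some stable identifier is needed at all.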
Am I understanding you correctly: The information is collected only when the snap package is being downloaded? In other words, after the snap package is downloaded, during the life of use of that snap package, no information is sent back to “headquarters”?
snapd will automatically install updates to snaps you’ve installed on your system. The process of checking for updates will reveal the list of installed snaps on your system.
I believe this contact with the store is used to generate the “weekly active devices” metric @Lin-Buo-Ren mentioned. If you pause refreshes, I don’t think there is any other contact that would keep you visible in the metrics provided to developers.