We’ve been discussing the appstream ID field for the last three sprints and @apol is now blocked on that, so let’s see if we can reach some agreement about the best way to support the feature and then push it forward.
For some background, the main requirement is being able to identify an application across multiple possible sources, and potentially multiple packaging formats as well. The main contender for this right now is the appstream ID, which comes from a format-agnostic way to provide metadata that has been adopted in many repositories.
Past discussions proposed having an ID field in an appstream.yaml file next to snap.yaml, or in an appstream: field inside snap.yaml. The first proposal assumed that there would be many fields which were appstream specific, but after many conversations all fields we discussed are first-class metadata that should eventually make their way into the format itself, so we don’t want to duplicate it into a second file. The second proposal already acknowledges that and proposes importing the data into snap.yaml and snapcraft.yaml proper, while still labeling it under an appstream domain. It also incorporated a lesson we learned later: we need to be able to associate the ID with an application rather than the whole snap, as some of the metadata is application-specific (desktop files, etc).
So, here is a strawman that takes the second proposal one step further: how about having an id field in the application block itself:
The field would be optional, and would be automatically populated if information is imported from an appstream file in a way that can be automatically inferred. For example, we can compare binary name, package name, etc, with the application name in snapcraft.yaml. In addition to the application ID, similar logic might be used for the desktop file, icons, etc.
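To make the inference idea a bit more concrete, here is a rough Python sketch of one way it could work. It is only an illustration, not what snapcraft does: the function name `infer_common_ids` is made up, and matching purely on the `<provides><binary>` entries of an AppStream metainfo file is an assumption (package name or desktop file could be compared in a similar way).

```python
import xml.etree.ElementTree as ET


def infer_common_ids(metainfo_xml, app_names):
    """Map snapcraft.yaml app names to the AppStream component ID.

    Hypothetical helper: an app gets the ID only when its name matches
    one of the binaries the component declares under <provides>.
    """
    component = ET.fromstring(metainfo_xml)
    component_id = (component.findtext("id") or "").strip()
    if not component_id:
        # No ID in the metadata, so there is nothing to infer.
        return {}
    binaries = {
        node.text.strip()
        for node in component.findall("./provides/binary")
        if node.text
    }
    # Apps that do not match any binary are left without an ID,
    # which is fine because the field is optional.
    return {name: component_id for name in app_names if name in binaries}


EXAMPLE_METAINFO = """
<component type="desktop">
  <id>com.example.myapp</id>
  <name>My App</name>
  <provides>
    <binary>myapp</binary>
  </provides>
</component>
"""

inferred = infer_common_ids(EXAMPLE_METAINFO, ["myapp", "helper"])
```

Here only `myapp` would pick up the ID; `helper` stays without one and the author could still set it by hand.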
One open point: id is probably the wrong term for this data inside the snap ecosystem, as it will be misleading to have an identifier which is in fact optional and seldom used inside the system itself. But let’s leave that aside for the time being… we can find an ideal term before committing to the implementation.
A problem I can see in allowing free-form definition of an id-type field for appstream use is that there will be a conflict between a snap providing com.example.myapp and a deb or rpm or … providing the same id. Surely snaps will need to coexist with other packaging systems, so I would surmise that the most appropriate thing would be to generate an id-type field using the snap name and the snapcraft.io domain for any snaps delivered via the current store? e.g. io.snapcraft.<snap-name>. If we want to allow more fine-grained IDs then we could conceivably allow io.snapcraft.com.example.myapp? where the actual snap is id com.example.myapp, but hosted in the store at io.snapcraft.
I may be completely confused, however, so please feel free to call me a moron :-p
That “conflict” is actually the main reason for introducing the field, per details above:
In other words, when package management systems see multiple packages referring to the same identifier, they are able to tell that they are referring to the same underlying software.
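As a small illustration of that point, this is roughly what a software center could do with the field. It is a sketch under assumptions: the record layout (`format`, `name`, `common_id` keys) is invented for the example and is not the store or appstream API.

```python
from collections import defaultdict


def group_by_common_id(packages):
    """Group package records that claim the same common identifier.

    Records without an identifier are skipped, since the field is optional.
    """
    groups = defaultdict(list)
    for pkg in packages:
        common_id = pkg.get("common_id")
        if common_id:
            groups[common_id].append((pkg["format"], pkg["name"]))
    return dict(groups)


# Hypothetical records coming from two different packaging systems.
packages = [
    {"format": "deb", "name": "myapp", "common_id": "com.example.myapp"},
    {"format": "snap", "name": "myapp-snap", "common_id": "com.example.myapp"},
    {"format": "snap", "name": "some-cli-tool"},
]

grouped = group_by_common_id(packages)
```

Both `myapp` entries end up under the same key, so a single application page can offer either format, while the snap without an identifier is simply left out of the grouping.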
I think this works, and I agree that we would need a more specific entry point than id as it is too generic.
As for inferring the id, at first glance it seems possible, and since we need to go through the desktop file to get to the command anyway, we can also provision the desktop file into meta/gui as a bonus.
Okay, so how about common-id for the field name? That seems to reflect well the intention described above: it is an identifier, but the point is for it to be common across multiple packaging formats. And it feels less like a local identifier as well, which solves the issue described above for the id term.
From my perspective, we don’t use the unique IDs - we use the common package name to access AppStream metadata.
Thus it would appear to me the common name of a Snap would also be unique, with the .snap filename as the unique ID.
Additionally, how do we intend to make the overlay appstream data available? In theory each distribution can produce their own overlay, origins, etc, however it might actually make more sense to patch appstream-builder to support the .snap format, so this could support either a centralised (packageable) appstream data source for the store, or indeed a method for each distribution to produce that data (origin=snapd-ubuntu-one perhaps)
It’s not too complex to add support for these formats to AppStream, as the following Solus patches show (Yea I haven’t upstreamed them yet, bad Ikey :P):
The name of a snap is unique within the snap ecosystem itself, much like a deb package name is unique within Ubuntu, etc. But my understanding is that the only field in appstream that is supposed to bridge these multiple sources is the id field itself. This is what allows a single application page in the package managers to offer the installation of multiple formats, for instance.
The name and pkgname fields are custom for the particular application at hand, and may diverge from a snap to a deb, for instance.
@niemeyer yes, that is also my understanding of the appstream ID.
The fields in particular mean:
name - A human-readable name for this software. (i.e. how the app should be displayed to the user. This is essentially the title field for snaps).
pkgname - The name of the package which needs to be installed in order to make this component available on the system. (i.e. the mechanism by which the app is manipulated. This is the name field for snaps).
id - A short unique and usually lower-cased identifier for the component. (i.e. the means of uniquely identifying this app. Does not currently exist in the snap ecosystem).
We are getting ready to add the support server-side for this new field, and then it would be necessary to confirm the plan. Let me see if I can summarize what we have so far:
This new ID will be optional, for any entry in the apps section in snapcraft.yaml
Given that, one snap could have multiple of these IDs
This new ID per app will be added to snap.yaml as well
We still need to define a field name for the yaml file (@niemeyer, common-id?), and add the support in snapcraft (@sergiusens, is this in your roadmap?)
The store will return the information (a list of values, possibly empty; per revision) in the search/details responses.
We are working on the data collection aspect from appstream today and not specifically on propagating the id; once it is defined and the groundwork is complete, it should be a trivial task for us to expose this under apps.
The field should have ‘appstream’ in the name somewhere, since it’s a defined standard. In theory, an app could be tagged with other ID systems if they existed.
+1, given that this is specific for ‘appstream’, which we want to support, and AFAICT there isn’t any other ID system ATM for apps, I’d name it something like appstream_id (instead of a more generic/complex field or something requiring to guess the ID format).
The earlier suggestion of going with “common-id” still seems like the correct path.
The main rationale for not labeling it as “appstream-id” is that this is a snap specification, not an appstream specification, so the field name needs to convey the meaning inside a snap context. People can put this field to good use without ever touching an appstream file or ever using the ID in that context.
On the snap end, the field is also not required, and it does not sound sensible to enforce uniqueness in the store for at least two reasons. The most important one is that it would go against the very point of having the field in the first place. The single reason we are adding this is that one may have multiple alternatives for the same application. Why would we be okay with having multiple alternatives for the same app, but only if they come in from different stores? This may sound sensible in a world where everybody can have their own small repositories, but in the snap world everybody is collaborating around a single universal repository. We want these alternatives to be able to coexist there.
The second reason is that if we do that we’ll also have to worry about conflict handling, and what to do when that happens. That’s not as trivial as it may sound. We have infrastructure today for forcing the rename of a snap if it turns out a well known name is not being used in the most expected snap (e.g. mozilla would like to publish “firefox”). If we need to support that for the common ID, we’ll need to support a snap-declaration for it as well so that we can override a name if really necessary. That’s doable, but more work both to implement and to support the feature over time (forever) as it requires arbitration.
With all that said, the documentation of the field should still mention that this field has the same purpose as the appstream ID, and that when a piece of software has both an appstream ID and a snap common-id the two ought to match.
We should also check the syntax of the value to enforce the practice used in appstream IDs today, and document it too.
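For the syntax check, something along these lines might be enough as a starting point. This is a sketch, not the agreed rule: it assumes the reverse-DNS convention seen in IDs like com.example.myapp, with at least three dot-separated segments made of ASCII letters, digits, hyphens and underscores. The exact rule would need to be taken from the appstream documentation before anything is enforced.

```python
import re

# Assumed rule: reverse-DNS style, three or more non-empty segments.
_SEGMENT = r"[A-Za-z0-9_-]+"
_COMMON_ID_RE = re.compile(rf"{_SEGMENT}(?:\.{_SEGMENT}){{2,}}")


def is_valid_common_id(value):
    """Return True when value looks like a reverse-DNS style identifier."""
    return _COMMON_ID_RE.fullmatch(value) is not None


checks = {
    candidate: is_valid_common_id(candidate)
    for candidate in ("com.example.myapp", "myapp", "com..myapp", "com.example.my app")
}
```

With this rule a bare snap name such as `myapp` is rejected, as are empty segments and whitespace, which is the kind of feedback the reviewer tools could give at upload time.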
I’m confused as to what this common_id field contains, @niemeyer can you please show how it fits into the following example?
In the Ubuntu (.deb) archive we have an entry for GNOME Calculator like this (from /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_bionic_main_dep11_Components-amd64.yml.gz):
Graphical software stores want to be able to recognise both packages are the same application, just packaged and delivered with different methods. They expect to be able to do this by matching the AppStream ID from both systems (this is done currently with .debs/.rpms and Flatpaks).
In the Snap case above, what will common_id contain?
FYI, server-side support for common-id is now available.
The next steps seem to be updating the reviewer tools to allow that field in snap.yaml (@jdstrand), and updating snapcraft to handle it too (@sergiusens). We probably want the reviewer tools updated first so we don’t reject revisions using the new field.