Support for custom license text


#1

We currently support SPDX license [1] declarations, which fit many Free Software apps well (because picking a license is easier than making one up). To be exhaustive, we should add support for non-standard licenses across the snapcraft->store->snapd stack.

As a way to bootstrap the discussion, I’d like to propose that we do the following:

meta/snap.yaml and meta/LICENSE

The existing license field should accept one more value (apart from valid SPDX expressions): custom. This value is not a valid SPDX license but would be used as a trigger for the custom license text behavior. Once it is defined, tools would look for the file meta/LICENSE and ensure that it is non-empty.

Using this we could synthesize a valid SPDX description for custom licenses [2] that is not very pretty but should open the door for interoperability with other software that only speaks SPDX.

snapcraft publishing behavior

Snapcraft would have to validate license: custom and ensure that meta/LICENSE is non-empty. In addition meta/LICENSE must not be present if one of the well-known licenses is used.
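To make the two rules above concrete, here is a minimal sketch of the check in Python. The function name check_custom_license and the snap_root argument are hypothetical and only for illustration; this is not existing snapcraft code, and it treats any value other than custom as a well-known license.

```python
from pathlib import Path
from typing import List


def check_custom_license(license_value: str, snap_root: Path) -> List[str]:
    """Return a list of problems; an empty list means the check passed.

    Hypothetical helper: assumes the snap contents are unpacked at snap_root.
    """
    license_file = snap_root / "meta" / "LICENSE"
    # "Non-empty" is read here as "contains something other than whitespace".
    has_text = (
        license_file.is_file()
        and license_file.read_text(encoding="utf-8").strip() != ""
    )
    problems = []
    if license_value == "custom":
        if not has_text:
            problems.append("license: custom requires a non-empty meta/LICENSE")
    elif license_file.exists():
        # Any other value is assumed to be a well-known SPDX expression.
        problems.append("meta/LICENSE must not be present unless license is custom")
    return problems
```

Returning a list of problems rather than raising on the first one would let the tool report everything that is wrong in a single run.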

store side behavior

The store needs to collect the meta/LICENSE file and provide access to the text through one of the existing APIs when asked about that field specifically.

snapd side behavior

(this is hand-wave territory now, please provide ideas)

Snapd could show the existing license in snap info, perhaps offering an option to show custom licenses behind a new sub-command snap license <snapname> (which would work for both well-known SPDX and custom licenses).

[1] https://spdx.org/licenses/
[2] https://spdx.org/spdx-specification-21-web-version#h.1v1yuxt


#2

If snapcraft does the custom license -> valid SPDX translation, then ISTM that neither snapd nor the store needs to change drastically: from the store’s and snapd’s point of view, all snaps have valid SPDX licenses. Niceties can be added to the UX for snapd to de-uglify the SPDX stanza that allows for the custom license. Store APIs that already return license information can continue to do so, etc. The burden shifts to the client to de-uglify SPDX as necessary.


#3

Unless I misunderstand something, I think there has to be a new way to convey the full license text. Otherwise the synthesized SPDX is useless.


#4

Why new? SPDX can already represent custom license text, with the full text, as per SPDX section 6.2.

Perhaps I’m misunderstanding?


#5

What would that look like on the command line? Would we get a super-long value of the license field?


#6

As you allude to, I think we might still need CLI UX to better expose the license. As far as I can tell now (2.30) there’s no way to see the license for a snap? At least it’s not in snap info <snap>.

My concern is the suggestion to have another, different way of representing a license when we can already do it using SPDX. Note that I’m not suggesting SPDX be exposed verbatim to the user; we can, and should, apply human-friendly labels, whilst still allowing access to the full license.


#7

As we discussed today, this is the basic idea, but I still would like to consider a bit what we want to highlight in terms of distinctions between free/open licenses, or not.

That’s one example that needs consideration, for instance. SPDX is a license expression which includes known labels in its registry. It’s not specific to blessed free/open licenses.

But then, what does it mean to be blessed? Many widely used licenses are probably not blessed either. If we limit ourselves to the opinion of a central registry which can’t do any better than point to organizations such as opensource.org and the FSF, we’ll likely end up with a very high bar of what constitutes open/free software. At the same time, not doing so means we’re on the line for defining that.

Another aspect to consider is that “custom” puts the burden of understanding what the text is about onto the user. That’s not reasonable. SPDX solves that in part, and includes a mechanism through which people can submit new licenses, but again it delegates the actual blessing to those institutions.

These aspects need consideration, which is why I didn’t put forward the proposal you wrote down above before.


#8

I think the FOSS license bit is very muddy as various organizations have different views there. Is the Debian “main” approved set of licenses the same as for Fedora, OpenSUSE and Ubuntu? I don’t know but I wouldn’t be surprised if it isn’t.


#9

As I explained in our conversation today, that solves nothing, in an ugly way.

SPDX simply says “add a LicenseRef-<your ID> label which includes a unique reference”. Who defines that unique reference? Where is the text held? Who guarantees different institutions won’t present conflicting IDs for different texts? And so on.


#10

Yes, licensing is not a clean logical area. But we cannot brush it off.


#11

No disagreement here. Here’s a quick idea: we could send a list of approved licenses to the store. The set would differ per distribution (default) and would be user controlled. This way we could make it very strict or totally open and everything in between with one mechanism.


#12

This is from the SPDX standard, section 6.1. I think @sparkiegeek refers to the provision in section 6.2: “Purpose: Provide a copy of the actual text of the license reference extracted from the package or file that is associated with the License Identifier to aid in future analysis.” and the format is “Data Format: free form text field that may span multiple lines.”, delimited by <text></text>. The presence of those delimiters allows some level of validation which our SPDX validators should be able to handle; i.e. it’s not just random junk content.

I take this to mean the license field could store the verbatim text of the license, so there’s no need to fiddle with the LicenseRef thing.

Snapcraft could, for instance, in the absence of a license: field in the snapcraft.yaml, look for meta/LICENSE and send that as the content to be stored per this provision.
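As a rough sketch of that fallback (not actual snapcraft behavior), the logic could look like the following in Python. The function name license_payload is hypothetical, and wrapping the text in <text></text> as a single string is my assumption about how the section 6.2 provision would be serialized.

```python
from pathlib import Path
from typing import Optional


def license_payload(declared: Optional[str], snap_root: Path) -> Optional[str]:
    """Pick the value to send as the license field (hypothetical helper)."""
    if declared:
        # An explicit license: field in snapcraft.yaml always wins.
        return declared
    license_file = snap_root / "meta" / "LICENSE"
    if license_file.is_file():
        text = license_file.read_text(encoding="utf-8").strip()
        if text:
            # Assumed serialization: verbatim text between the delimiters.
            return "<text>" + text + "</text>"
    # Neither a declared license nor a usable meta/LICENSE was found.
    return None
```

The delimiters are what would give a validator something to check, so the sketch only emits them when there is real text to wrap.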


#13

The fact we can include a full license text inside an arbitrary file doesn’t help us in any way. We can as well include the full text inside a file that has no further syntax on it.


#14

It does help the store.

We won’t have to have a separate field for storing something else (the custom license) and figure out how we distinguish between custom licenses and SPDX etc.


#15

How is it easier to dig into an XML file to figure out which licenses are there and whether you should use it or not, instead of reading a pure text file with some text in it?


#16

Okay, here is a proposal.

To support custom licenses, we can extend the basic syntax of SPDX with expressions that look like:

license: unusual/<id>

Those licenses must be carried by the snap under meta/license/<id>.txt, in pure text format. The ID must match “[a-zA-Z0-9-.]+”.

Here is some of the rationale behind the proposal:

  1. That representation allows multiple licenses to be conveyed, using richer SPDX expressions.
  2. Something unusual means not used often. If it were usual, SPDX should cover it with a first-class identifier.
  3. Something unusual tends to be an eye-opener.
  4. Publishers will naturally prefer not to label their license as unusual, if possible.
  5. At the same time, “unusual” is not “bad” either. It simply means it must be looked into.
  6. It blurs the line between proprietary and unknown open source licenses. This reflects reality: if the license has custom unknown terms that weren’t vetted by a recognized entity, it may be a trap to count it as free software.
  7. It gives a choice for proprietary software to express their terms. People often consider “proprietary” as simply “not free software”, but both free software and proprietary software have variations in their licensing terms.
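For illustration only, here is a small Python sketch of how a tool might find the unusual/<id> terms in a license expression and work out which files the snap must carry. The names required_license_files, ID_PATTERN and UNUSUAL_TOKEN are hypothetical, and the tokenization (terms separated by whitespace or parentheses) is an assumption about the expression syntax. The ID pattern is the one from the proposal, written with the hyphen last, which matches the same characters.

```python
import re
from typing import List

# Same character set as "[a-zA-Z0-9-.]+" in the proposal.
ID_PATTERN = re.compile(r"[a-zA-Z0-9.-]+")
# An "unusual/<id>" term at the start of the expression or after whitespace or "(".
UNUSUAL_TOKEN = re.compile(r"(?<![^\s(])unusual/([^\s()]*)")


def required_license_files(expression: str) -> List[str]:
    """List the meta/license/<id>.txt files implied by a license expression."""
    paths = []
    for match in UNUSUAL_TOKEN.finditer(expression):
        license_id = match.group(1)
        if not ID_PATTERN.fullmatch(license_id):
            raise ValueError(f"invalid unusual license id: {license_id!r}")
        paths.append(f"meta/license/{license_id}.txt")
    return paths
```

For example, required_license_files("GPL-3.0 AND unusual/foo-2.1") would ask for meta/license/foo-2.1.txt, while a plain SPDX expression asks for nothing.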

Then, taking this chance, we’ve agreed to label missing licenses as:

license: unknown

Per the plan described in the respective topic, at some point in the near future snaps not defining a license will stop being accepted by the store. Before that happens we’ll wait some time after the feature is fully implemented so everybody has a chance to amend their metadata.


#17

I’ve replaced the colon in the expression with a slash, to avoid the slight visual glitch of having the two colons (license: and unusual:).


#18

Suggestion: allow a dot in the name of the license. I suspect many licenses will be versioned, say “foo-2.1”, and this seems reasonable to support.


#19

Sounds reasonable. Updated the expression.


#20

@roadmr It looks like people are still seeing the license as “proprietary” when it is actually “unknown” (because we really don’t know). Can we fix that soonish?