JMESPath in the snap tooling, we need your help

niemeyer · February 20, 2018, 8:04pm

Hello all,

For quite some time we’ve been mentioning the idea of introducing a richer way to query data in the snap tooling. One good example of that is snap configuration, which today enables setting and changing arbitrary options for snap to consult, adapt to, and change if necessary.

For example, this works today:

$ snap set core example.one='[1,2,3]' example.two='[4,5,6]'

$ snap get core example
Key          Value
example.one  [1 2 3]
example.two  [4 5 6]

Internally these values are stored as a JSON document, and they can also be queried as such:

$ snap get -d core example
{
        "example": {
                "one": [
                        1,
                        2,
                        3
                ],
                "two": [
                        4,
                        5,
                        6
                ]
        }
}

The default support of the current get command is very limited, though, and only allows retrieving sub-documents via the typical dotted notation, as in example.one for instance. For more interesting use cases, we need to leverage the -d option and external tools, of which the most well known one nowadays is probably jq.
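To make the limitation concrete: a dotted lookup amounts to walking nested maps one key at a time, and nothing more. Here is a minimal Python sketch of that idea, using the same document shown above. The helper name get_dotted is hypothetical and this is not how snapd implements it.

```python
import json

# The same document that `snap get -d core example` prints above.
doc = json.loads('{"example": {"one": [1, 2, 3], "two": [4, 5, 6]}}')


def get_dotted(document, path):
    """Walk nested maps one key at a time (hypothetical helper)."""
    value = document
    for key in path.split("."):
        value = value[key]  # raises KeyError if a map lacks the key
    return value


print(get_dotted(doc, "example.one"))  # [1, 2, 3]
```

Anything beyond this, such as filtering, flattening or sorting, is out of reach for a plain dotted path, which is why a query language comes into the picture.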

While that works, it creates a dependency on an external tool which may not necessarily be available, and being external also means the integration in snapd is weak, in the sense that we can’t expose such support in the API itself for all clients to use in a universal way. For example, if a snap needs to do just a single trivial query, it needs to depend on jq as well.

So for a while we’ve been thinking about how to solve that, and just recently I found an alternative which seems pretty good: JMESPath. This is a specification that has been around for several years, together with a set of libraries that conform to it. It tastes a bit like jq: more limited, but also simpler and more consistent, which is good for our use case.

So, as a first step towards embedding this into our tooling, I would like to propose an experiment: I’ve just released a snap that leverages the Go jmespath implementation to work as a filter.

To get started, just do:

$ sudo snap install jmes

$ jmes -h
Usage: <command generating json or yaml> | jmes '<expression>'

The jmes command takes JSON or YAML in its standard input and
filters it using the JMESPath expression provided in the
command line. For example:

$ cat file.json | jmes 'items[].name'

See http://jmespath.org for details on the expressions.

Once installed, please try to use this for relevant tasks you may encounter, snap-related or not. The goal is learning more about the library and the language, and making sure we’re on the right track by the time we really adopt this. Once we adopt it, we cannot take it out without breaking people’s routines, which means we just can’t take it out.

So, please put this to some good use and let us know. If you are a jq user, try to use jmes instead when you can. And please feel free to ask any questions here, about JMESPath, or jmes, or its future use in snap. These exchanges will help guide us in our upcoming decisions.

Here is a quick example operating on the JSON document above, flattening the two lists into one, and then sorting it:

$ snap get -d core example | jmes 'example.*[] | sort(@)'
[
    1,
    2,
    3,
    4,
    5,
    6
]
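For readers new to the syntax, the expression above reads roughly like the plain Python below. This is only a sketch of the three steps (take every value of the map, merge the nested lists one level, sort), not a description of how jmes or the Go library works internally. The function name flatten_values_sorted is made up for illustration.

```python
import json

doc = json.loads('{"example": {"one": [1, 2, 3], "two": [4, 5, 6]}}')


def flatten_values_sorted(mapping):
    flat = []
    for value in mapping.values():   # example.*  -> every value of the map
        if isinstance(value, list):  # []         -> merge nested lists, one level
            flat.extend(value)
        else:
            flat.append(value)
    return sorted(flat)              # | sort(@)  -> sort the resulting list


result = flatten_values_sorted(doc["example"])
print(result)  # [1, 2, 3, 4, 5, 6]
```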

I’ve also started adding support for YAML in the tool, which means you can use the same logic to inspect snap.yaml and snapcraft.yaml. For instance:

$ cat /snap/lxd/current/meta/snap.yaml | jmes summary
LXD - the container lightervisor

niemeyer · February 24, 2018, 9:48pm

Version 2018.02.24 fixes the handling of short documents that do not contain any newlines.

After some experimentation, it’s clear we can’t adopt JMESPath as-is due to some important shortcomings. Instead, we’ll need to come up with our own language. I’m tempted to start with a small one and grow it as we gain confidence in the syntax and semantics we put in place, so we don’t risk locking in other mistakes that we’d then need to preserve for backwards compatibility.

With that said, let’s please continue testing the jmes tool so we know what else we want to get right.

Some of the initial issues I found:

Use of backticks

The expression language uses backticks, which is not friendly to shell usage. We don’t want those as part of expressions.

Awkward handling of numbers

The handling of numbers is surprisingly awkward. For example, consider the expression:

somelist[?n == `2`]

This will filter the list and only return the elements that have n == 2. It seems unnatural and unnecessary to use that escaping mechanism in those cases.

Single quoting vs. double quoting

Strings need to be expressed with either single quotes, or double quotes under backticks, which become a literal. But one does not evaluate to something equivalent to the other. Seems unnecessarily complex and inconsistent.

For example:

$ echo '{"foo": "bar"}' | jmes "foo == 'bar'"
true

$ echo '{"foo": "bar"}' | jmes 'foo == `"bar"`'
true

$ echo '{"foo": "bar"}' | jmes "'foo' == \`\"bar\"\`"
false

No reference to outer scopes

Consider this simple document:

{"n": 2, "list": [{"a": 1}, {"a": 2}, {"a": 3}]}

The expression list[?a == `2`] returns a list with only the second element. But there’s no way to do the same thing using the value of n, which is at the top level of the document.

More generally, after entering a scope there’s no way to reference values in an outer scope, neither in relative terms nor absolute (a reference to the root).
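In plain Python terms, what one would like to be able to express is the following. It is only an illustration of the desired behavior on the document above, and the function name filter_by_outer is hypothetical.

```python
doc = {"n": 2, "list": [{"a": 1}, {"a": 2}, {"a": 3}]}


def filter_by_outer(document):
    # Read n from the top level of the document (the "outer scope")...
    n = document["n"]
    # ...and use it while filtering the nested list (the "inner scope").
    return [item for item in document["list"] if item["a"] == n]


matches = filter_by_outer(doc)
print(matches)  # [{'a': 2}]
```

The list comprehension can see n because Python closures keep the enclosing scope reachable. That is exactly the piece missing from the expression language once it has entered list[...].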

No variable assignment

Expressions need to be repeated every time they are needed.

Project and spec are unmaintained

Some of those issues were reported and have had official proposals open for three years.

chipaca · February 24, 2018, 10:42pm

I tried to use it today to grab, from a snap.yaml, the list of app names that have a non-empty completer attribute. I couldn’t find a way to do it, and in fact there didn’t seem to be much support for working with object keys (I don’t know whether that’s accurate, or whether the problem is the docs, or me and the docs).
In jq what I wanted would be .apps[] | select(has("completer")). And not because I remember it — I read the manpage to write this comment.

niemeyer · February 25, 2018, 2:41pm

There’s indeed an interesting limitation in JMESPath in that regard that we will want to see solved. The general problem is the lack of support for handling a map as being a list of homogeneous objects where the map keys are their names.

The way I’d like to see it working in our implementation is to have first-class support for that kind of filtering, interpreting the map similarly to an array. For example, using jmes notation it would be something similar to:

apps[?completer != ""]

This works fine today in jmes, but apps needs to be an array. The result in this case is the value of apps itself, which is an array, but filtered with the expression.

In our implementation I’d like to see the equivalent expression working fine when apps is a map: generate the map, only with the key/value pairs for which the expression matches.
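As a rough illustration of those semantics, here is a Python sketch that filters a map and keeps only the matching key/value pairs. The sample apps and the helper name filter_map are made up, and treating a missing completer the same as an empty one is an assumption of this sketch rather than a decided behavior.

```python
# Hypothetical sample data; the app names and completer paths are made up.
doc = {
    "apps": {
        "foo": {"command": "bin/foo", "completer": "foo-complete.sh"},
        "bar": {"command": "bin/bar"},
        "baz": {"command": "bin/baz", "completer": ""},
    }
}


def filter_map(mapping, predicate):
    """Return a new map with only the entries whose value matches."""
    return {key: value for key, value in mapping.items() if predicate(value)}


# Assumption: a missing completer is treated like an empty one.
with_completer = filter_map(
    doc["apps"], lambda app: app.get("completer", "") != ""
)

print(with_completer)          # only the "foo" entry is kept
print(sorted(with_completer))  # ['foo']
```

Because the result is still a map, the app names remain available as its keys.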

I will update the list of issues above to keep it as a reference.

I don’t think that jq expression is actually what you want, though. Its result can be achieved even more easily in jmes. For comparison:

$ JSON='{"apps": {"foo": {"completer": true}}}'

$ echo $JSON | jq '.apps[] | select(has("completer"))'
{
  "completer": true
}

$ echo $JSON | jmes "apps.* | [?completer != '']"
[
    {
        "completer": true
    }
]

To achieve what you want I think you need with_entries so it can first produce the entire document to then access its keys. It feels unnecessarily complex:

$ echo $JSON | jq '.apps | with_entries(select(.value | has("completer"))) | keys'
[
  "foo"
]