Python classic snap confusion & frustration (core18, entry-point scripts, organize, stage, prime, override-build)


#1

While trying to fix the build.snapcraft.io build of the charm snap, I was getting some frustrating and inconsistent results, so I decided to pare it down to a minimal test case. I started with what’s in the broken branch of https://github.com/johnsca/python-snap-core18-test

Essentially, it’s just a minimal Python app with a console_scripts entry point called tst, which prints some test output and also calls another entry-point script, tst-sub. Building it went without a hitch, but when I ran it, I got the following:

Traceback (most recent call last):
  File "/snap/tst/x1/bin/tst-debug", line 6, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 3088, in <module>
    @_call_aside
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 3072, in _call_aside
    f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 3101, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 574, in _build_master
    ws.require(__requires__)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 892, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 778, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'tst==0.0.1' distribution was not found and is required by the application
success! (failed)
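
For anyone following along, the app boils down to something like the sketch below. This is not the code from the repo: the function names, the default command, and the exact output format are assumptions reconstructed from the output above.

```python
import subprocess
import sys


def call_sub(cmd):
    """Run another entry-point script and report whether it exited cleanly."""
    try:
        result = subprocess.run(cmd)
    except FileNotFoundError:
        # The script could not be found on PATH at all.
        print("not found: {}".format(cmd[0]), file=sys.stderr)
        return False
    return result.returncode == 0


def main(sub_cmd=("tst-sub",)):
    """Hypothetical body of the tst entry point."""
    ok = call_sub(list(sub_cmd))
    line = "success! ({})".format("ok" if ok else "failed")
    print(line)
    return line
```

The point of the sketch is only that the parent keeps running after the child fails, which is why the child's traceback shows up before the parent's own line of output.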

It seems that the Python environment is set up correctly for the initial entry-point (hence the “success!” bit at the bottom), but doesn’t carry through to the subprocess (hence the stack-trace). I’m rather unclear at this point why it would work that way, but I suppose it’s somewhat understandable given that it’s a classic snap. Inspecting the snap’s environment, it seems that we need to ensure that $SNAP/bin and $SNAP/usr/bin are prepended to $PATH. Even for classic snaps, this seems like it should be done by default, but it seems relatively easy to do with a wrapper:

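#!/bin/bash
# Put the snap's own bin directories first so child processes find them.
export PATH="$SNAP/bin:$SNAP/usr/bin:$SNAP/usr/local/bin:$PATH"

# Replace this shell with the wrapped command and its arguments.
exec "$@"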

I put that in my repo as helpers/snap-env and proceeded to try to figure out how to get it into the snap.
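
To illustrate why the wrapper helps, here is a small sketch of PATH lookup order. It assumes that the entry-point scripts find their interpreter through PATH (for example via an env-style shebang), which I haven't confirmed. The directories are temporary stand-ins for $SNAP/usr/bin and the host's /usr/bin, and the helper names are made up for the example.

```python
import os
import shutil
import stat
import tempfile


def make_fake_python(directory):
    """Create an executable placeholder named python3 in `directory`."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "python3")
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\nexit 0\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve_python(path_entries):
    """Return the python3 that a PATH made of `path_entries` would pick."""
    return shutil.which("python3", path=os.pathsep.join(path_entries))


root = tempfile.mkdtemp()
# Hypothetical stand-ins for $SNAP/usr/bin and the host's /usr/bin.
snap_bin = os.path.join(root, "snap", "usr", "bin")
host_bin = os.path.join(root, "host", "usr", "bin")
snap_python = make_fake_python(snap_bin)
host_python = make_fake_python(host_bin)

# Without the wrapper, only the host directory is on PATH.
without_wrapper = resolve_python([host_bin])
# With the wrapper, the snap directory is searched first.
with_wrapper = resolve_python([snap_bin, host_bin])
print(without_wrapper == host_python, with_wrapper == snap_python)
```

The first matching directory wins, so prepending the snap's directories is what switches the lookup from the host's interpreter to the snap's.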

I knew of the organize stanza, which is documented as: “In the key/value pair, the key represents the path of a file inside the part and the value represents how the file is going to be staged.” Seems straightforward enough, so I added this to my snapcraft.yaml:

    organize:
      helpers/snap-env: bin/snap-env

But that led to:

Failed to generate snap metadata: The specified command 'bin/snap-env $SNAP/bin/tst' defined in the app {'command': 'bin/snap-env $SNAP/bin/tst'} does not exist or is not executable.
Ensure that 'bin/snap-env $SNAP/bin/tst' is relative to the prime directory.

Thinking, ok maybe organize can only apply to already staged files, and reading the description of stage (“A list of files from to stage”), I figured I’d just add a stage stanza:

    stage:
      - helpers
    organize:
      helpers/snap-env: bin/snap-env

But that got similar results:

Failed to copy '/root/parts/tst/install/helpers': no such file or directory.
Check the path and try again.

It seems that for both stage and organize, “the part” actually means $SNAPCRAFT_PART_INSTALL. After some digging, I suppose this is mentioned in the docs, albeit somewhat indirectly, under the details of filesets. However, given that, I can’t seem to find a way to specify a file from the source repo to stage. So I resorted to an override-build instead of the stage:

    override-build: |
      snapcraftctl build
      cp -R helpers $SNAPCRAFT_PART_INSTALL
    organize:
      helpers/snap-env: bin/snap-env

I still ended up with a build error:

Failed to generate snap metadata: The specified command 'bin/tst' defined in the app {'command': 'bin/tst'} does not exist or is not executable.
Ensure that 'bin/tst' is relative to the prime directory.

How is it that adding the bin/snap-env made bin/tst go away? Debugging the build, it seems that everything that I expected was in $SNAPCRAFT_PART_INSTALL/bin, including tst, tst-sub, and snap-env. However, only snap-env made it into prime/bin. That doesn’t make any sense to me at all.

Trying to explicitly mention the other two in organize only got me this:

Failed to organize part 'tst': trying to organize file 'bin/tst' to 'bin/tst', but 'bin/tst' already exists

So, instead I tried adding them all to prime:

    override-build: |
      snapcraftctl build
      cp -R helpers $SNAPCRAFT_PART_INSTALL
    organize:
      helpers/snap-env: bin/snap-env
    prime:
      - bin/tst
      - bin/tst-sub
      - bin/snap-env

That built successfully. However, now even the initial entry-point fails:

Traceback (most recent call last):
  File "/snap/tst/x1/bin/tst", line 6, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 3088, in <module>
    @_call_aside
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 3072, in _call_aside
    f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 3101, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 574, in _build_master
    ws.require(__requires__)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 892, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 778, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'tst==0.0.1' distribution was not found and is required by the application

Inspecting the snap, it seems that adding organize essentially made it drop any default includes and only include the files explicitly mentioned. I couldn’t find mention of this anywhere in the docs, and it’s very confusing and unexpected. From some additional testing, it seems that stage and prime also trigger this behavior, but only with includes; excludes work as expected without breaking the default set of includes. (Although I wonder if they subtly break a default set of excludes?)
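
The include/exclude behavior I observed for stage and prime can be modeled roughly as below. This is a sketch of what I saw, not snapcraft's actual implementation; the function names and the site-packages path are made up for the example.

```python
from fnmatch import fnmatch


def _matches(path, patterns):
    """True if path equals a pattern, matches it as a glob, or lies under it."""
    for pattern in patterns:
        prefix = pattern.rstrip("/") + "/"
        if path == pattern or fnmatch(path, pattern) or path.startswith(prefix):
            return True
    return False


def apply_fileset(installed, fileset):
    """Model of the observed behavior, not snapcraft's implementation."""
    includes = [entry for entry in fileset if not entry.startswith("-")]
    excludes = [entry[1:] for entry in fileset if entry.startswith("-")]
    if not includes:
        # Only excludes (or nothing) given: everything is included by default.
        includes = ["*"]
    return [
        path for path in installed
        if _matches(path, includes) and not _matches(path, excludes)
    ]


installed = [
    "bin/tst",
    "bin/tst-sub",
    "bin/snap-env",
    "lib/python3.6/site-packages/tst/__init__.py",  # hypothetical path
]
only_scripts = apply_fileset(installed, ["bin/tst", "bin/tst-sub", "bin/snap-env"])
without_wrapper = apply_fileset(installed, ["-bin/snap-env"])
print(only_scripts)
print(without_wrapper)
```

In this model, listing the three scripts explicitly drops the Python package itself, which would explain the DistributionNotFound error, while an exclude-only list keeps everything else.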

I should also note that figuring that out from the stack-trace was quite non-obvious and it took me some time to understand what had happened.

In the end, I realized that I could do the work of organize in the override-build instead and eschew any use of organize, stage, or prime:

    override-build: |
      snapcraftctl build
      cp helpers/snap-env $SNAPCRAFT_PART_INSTALL/bin

This got me a snap that works as I expected, but the need for both the wrapper and the override-build seems like a wart that really should be addressed, and the behavior of organize, stage, and prime seems broken.

For reference, what I finally ended up with is in the working branch of the repo, with the changes required available as a PR.

I’d also like to point out that debugging this was quite time-consuming, because I had to run snapcraft clean between every build attempt, since even with a successful build and no changes to the code or snapcraft.yaml, running snapcraft a second time would inevitably result in:

ERROR: You must give at least one requirement to install (maybe you meant "pip install /root/parts/tst/python-packages"?)
Failed to run '/root/stage/usr/bin/python3 -m pip install --user --no-compile --no-index --find-links /root/parts/tst/python-packages --upgrade --no-deps': Exited with code 1.

I should also note that the above applies to core18. I recall seeing different behavior overall prior to adopting the base: core18 directive, but I haven’t had a chance to go back and test what the differences are. It would be nice if the changes were called out somewhere, but I couldn’t find a good summary.

Anyway, I hope this can help anyone else running into this sort of issue, and I welcome further discussion. I would quite like to see this made less confusing and easier to debug in the future.


#2

Personally, I would prefer getting rid of the organize keyword completely. There are a few bugs open against it that would require a complete, time-consuming refactor to fix, and considering how widely used it is, we have punted that to the bottom of the queue in favor of other bug fixes and features.

Now on to something that raises eyebrows:

export PATH=$SNAP/bin:$SNAP/usr/bin:$SNAP/usr/local/bin:$PATH

will make anything you invoke through the shell use the things in these paths. Is that the effect you want?

An easy illustration of this is asciinema. If it had this env, then when doing a recording and calling python3, you would most certainly be reaching /snap/asciinema/current/usr/bin/python3 and not /usr/bin/python3.

That is also the reason it is not the default behavior.
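
A minimal sketch of that inheritance, assuming nothing about snapd itself: the directory below is a made-up example path, and the child is just another Python process reporting the PATH it was handed.

```python
import os
import subprocess
import sys


def child_path(env):
    """Ask a child process which PATH it sees."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['PATH'])"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


# Hypothetical snap directory, standing in for $SNAP/usr/bin.
snap_dir = "/snap/example/current/usr/bin"

env = dict(os.environ)
env["PATH"] = snap_dir + os.pathsep + env.get("PATH", "")

# The child, and anything it spawns in turn, searches the snap directory first.
seen = child_path(env)
print(seen.split(os.pathsep)[0])
```

Every process started from the wrapped command gets the modified PATH, including an interactive shell the user launches from it.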