Snapcraft not completing `override-build` script

After upgrading my snap from core18 to core20 [1], the override-build step no longer completes. However, there is no error message, and snapcraft continues with staging and then proceeds to the next part [2]. The build script [3] executes the make command, but the script terminates before the make install command. Something does get installed, though: logging into the snapcraft VM shows files in /root/parts/erlang/install.

I don’t understand

  1. Why the build process is so different between core18 and core20
  2. Why the script is not proceeding past the make command although that command executed without an error
  3. Why some files are installed in SNAPCRAFT_PART_INSTALL although the make install command of the build script was not executed

Any help would be greatly appreciated.

Thanks!

References:

  1. https://github.com/nicolasbock/rabbitmq-server-snap/pull/52
  2. https://github.com/nicolasbock/rabbitmq-server-snap/runs/7982842565?check_suite_focus=true#step:3:5863
  3. https://github.com/nicolasbock/rabbitmq-server-snap/pull/52/files#diff-56759910381a014fecfd7556dd72ddd68c747d922a5b7df2044b9ce7c552f5f5R164

I don’t see any special reason why override-build wouldn’t complete on a core20 build compared to core18. If it gets past that step and proceeds to stage, the build scriptlet execution must have finished successfully. Maybe make install is running, but files are being installed to the wrong location?
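To rule out the wrong-location theory, the install step can be pointed at the part's install directory explicitly. This is only a sketch: the part name is hypothetical, and it assumes the project's Makefile honours the conventional DESTDIR variable, which the thread does not confirm.

```yaml
  example-part:          # hypothetical part name, not from the PR
    plugin: nil
    override-build: |
      set -eux
      ./configure --prefix=/usr
      make -j "${SNAPCRAFT_PARALLEL_BUILD_COUNT}"
      # DESTDIR redirects the install tree into the part's install directory
      make install DESTDIR="${SNAPCRAFT_PART_INSTALL}"
```

If files then appear under ${SNAPCRAFT_PART_INSTALL}/usr, the install step ran and the location was the issue; if nothing after make shows up in the log, the script is ending earlier.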

I agree, I can’t imagine a reason either :confused:

However, if I insert exit 1 in the script right after make, the script is not terminated with an error. If I insert an echo statement right after the make instruction, it is not printed in the logs.

Maybe make install is running, but files are being installed to the wrong location?

I am using set -x at the beginning of that script, so I would expect to see make install in the log if it were running.
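Besides set -x, an EXIT trap can report when the scriptlet ends and with which status, even if no later line is reached. The following is a minimal sketch run outside snapcraft; the temporary script and the false command are hypothetical stand-ins for the real scriptlet and a failing build step.

```shell
# Write a hypothetical scriptlet to a temporary file.
scriptlet=$(mktemp)
cat >"$scriptlet" <<'EOF'
# Report the exit status whenever the shell terminates, for any reason.
trap 'echo "scriptlet exited with status $?"' EXIT
set -eu
echo "before make"
false   # stand-in for a failing build step
echo "after make"
EOF

# Run it and capture everything it prints.
log=$(bash "$scriptlet" 2>&1 || true)
rm -f "$scriptlet"
echo "$log"
```

If the trap message appears with status 0 even though later lines never ran, the shell ended cleanly rather than being stopped by an error.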

For some reason I can’t identify, your override-build script changes haven’t been recognised during the GitHub Actions run; note that the echo at the top of your override script is ALSO not being shown…

The line

echo "Starting to build erlang at $(date)"

is executed [1]. Or were you thinking of another echo?

  1. https://github.com/nicolasbock/rabbitmq-server-snap/runs/8037483297?check_suite_focus=true#step:3:1670

I updated the PR and added an echo right after the make step. Let’s see if that is printed or not.

  1. https://github.com/nicolasbock/rabbitmq-server-snap/pull/52/files#diff-56759910381a014fecfd7556dd72ddd68c747d922a5b7df2044b9ce7c552f5f5R159

The line

./configure --prefix=/usr

is executed here [1], then

make -j ${SNAPCRAFT_PARALLEL_BUILD_COUNT}

is here [2], but

echo "Finished building erlang at $(date)"

is nowhere to be found. snapcraft continues with staging the erlang part [3] without having completed the override-build script.

  1. https://github.com/nicolasbock/rabbitmq-server-snap/pull/52/files#diff-56759910381a014fecfd7556dd72ddd68c747d922a5b7df2044b9ce7c552f5f5R156
  2. https://github.com/nicolasbock/rabbitmq-server-snap/runs/8039061273?check_suite_focus=true#step:3:2765
  3. https://github.com/nicolasbock/rabbitmq-server-snap/runs/8039061273?check_suite_focus=true#step:3:5554

I created https://bugs.launchpad.net/snapcraft/+bug/1987853

While I don’t understand why this fixed it, replacing the erlang part with this slightly modified version successfully compiles:

  erlang:
    plugin: autotools
    source: https://github.com/erlang/otp.git
    source-tag: OTP-24.3.4.2
    source-depth: 1
    build-packages:
      - autoconf
      - fop
      - libssl-dev
      - libncurses5-dev
      - libwxbase3.0-dev
      - libxml2-utils
      - xsltproc
    stage-packages:
      - openssl
      - ncurses-base
      - libncurses5
      - libtinfo5
    autotools-configure-parameters:
      - --prefix=/usr
    override-build: |
      echo "Starting to build erlang at $(date)"
      START_TIME=$(date +%s)

      snapcraftctl build

      sed --in-place \
        --expression 's:^\(\s*ROOTDIR\)="/usr/.*":\1=$(readlink --canonicalize $(dirname $0)/../lib/erlang):' \
        ${SNAPCRAFT_PART_INSTALL}/usr/bin/erl
      sed --in-place \
        --expression 's:^\(\s*ROOTDIR\)="/usr/.*":\1=$(readlink --canonicalize $(dirname $0)/../..):' \
        ${SNAPCRAFT_PART_INSTALL}/usr/lib/erlang/erts-*/bin/start
      sed --in-place \
        --expression 's:^\(\s*ROOTDIR\)="/usr/.*":\1=$(readlink --canonicalize $(dirname $0)/../..):' \
        ${SNAPCRAFT_PART_INSTALL}/usr/lib/erlang/erts-*/bin/erl
      sed --in-place \
        --expression 's:^\(\s*ROOTDIR\)="/usr/.*":\1=$(readlink --canonicalize $(dirname $0)/..):' \
        ${SNAPCRAFT_PART_INSTALL}/usr/lib/erlang/bin/start
      sed --in-place \
        --expression 's:^\(\s*ROOTDIR\)="/usr/.*":\1=$(readlink --canonicalize $(dirname $0)/..):' \
        ${SNAPCRAFT_PART_INSTALL}/usr/lib/erlang/bin/erl

      echo "Finished building erlang at $(date)"
      END_TIME=$(date +%s)
      echo "Total time: $((${END_TIME} - ${START_TIME})) seconds"
    stage:
      - etc
      - lib
      - usr

Edit: strangely that just moved the problem to the elixir part which, like the erlang part, also fails because the override-build stops at the make install line…

This replacement for the elixir part gets that compiling and the rest of the snap builds thereafter:

  elixir:
    after:
      - erlang
    plugin: make
    source: https://github.com/elixir-lang/elixir.git
    source-tag: v1.13.4
    source-depth: 1
    make-parameters:
      - PREFIX=/usr
    override-build: |
      echo "Starting to build elixir at $(date)"
      START_TIME=$(date +%s)

      snapcraftctl build

      echo "Finished building elixir at $(date)"
      END_TIME=$(date +%s)
      echo "Total time: $((${END_TIME} - ${START_TIME})) seconds"
    stage:
      - usr

I’m at a loss to understand what is happening with the manually coded override-build scripts calling make directly rather than letting snapcraft do that bit, but at least this will get you going for now…

I think something in the erlang and elixir builds is causing the build script to exit with no error code… I tried putting a pipelined echo at the end of the make calls that aren’t returning and still nada, which indicates the shell completely disappears while make is running - probably caused by a command that make is spawning within, and I’m guessing it’s an erlang/elixir compilation-specific issue.
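One general shell behaviour that produces exactly these symptoms is a child process consuming the shell's own standard input. This is offered as a hypothesis to test, not a confirmed diagnosis: it assumes the scriptlet is fed to bash on stdin, which this thread does not establish. The sketch below uses cat as a hypothetical stand-in for whatever make might spawn.

```shell
# A tiny script that is piped to bash on stdin. The middle command
# reads stdin, so it consumes the rest of the script text.
script='echo before
cat >/dev/null
echo after
'
swallowed=$(printf '%s' "$script" | bash)
echo "stdin shared with child:   $swallowed"

# Same script, but the child gets /dev/null as its stdin,
# so the remaining lines are left for bash to read.
script_fixed='echo before
cat </dev/null >/dev/null
echo after
'
kept=$(printf '%s' "$script_fixed" | bash)
echo "stdin detached from child: $kept"
```

In the first run bash reaches end of input after the middle command and exits with status 0, without an error and without running the last line. If the hypothesis applies, appending </dev/null to the make invocation in override-build would be a cheap experiment.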

Thank you @lucyllewy! This fixes it!

This issue is very mysterious, to be honest. But the build system of Erlang is quite opaque to me, and it’s very possible that it is doing something unorthodox that causes snapcraft to exit prematurely. This does look like a bug in snapcraft, but I am very happy to use your workaround :grinning_face_with_smiling_eyes:.

Do you mind if I credit you for these suggestions in the pull request on GitHub?

No, I don’t mind you giving me credit at all. I welcome it :slight_smile: Thank you!

I’m still hitting this error when migrating from core18 to either core20 or core22, whether using snapcraft versions 7.5.5 or 8.0.5. I updated the bug report with details.

Your first example, bash gradlew ..., spawns a subshell, so any error from gradlew will indeed be swallowed by that shell. Why are you doing this? Just calling gradlew should be enough…

In your second example you call cd build while you are already in the $CRAFT_PART_BUILD directory (this is a given when using override-build) … though this should probably exit properly with a “no such file or directory” error, not sure…

The scriptlet calls bash gradlew ... just to avoid having to mark the script executable with chmod +x gradlew; ./gradlew .... In any case, calling gradlew directly without the subshell didn’t solve the problem. In both cases, the build completes successfully without errors.
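Whether the extra bash process hides a failure can be checked outside snapcraft. A minimal sketch, using a hypothetical stand-in script rather than the real gradlew:

```shell
# Hypothetical stand-in for gradlew: a script that fails with status 3.
fake_gradlew=$(mktemp)
printf 'echo "building"\nexit 3\n' >"$fake_gradlew"

# Run it through an explicit bash process, as the scriptlet does,
# and record the status the caller observes.
status=0
bash "$fake_gradlew" >/dev/null || status=$?
rm -f "$fake_gradlew"

echo "status seen by the caller: $status"
```

The caller sees the script's own exit status, so an explicit bash invocation on its own should not hide a failing build. That is consistent with the observation that calling gradlew directly made no difference.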

The scriptlet changes the current directory to the build subdirectory of the build directory, /root/parts/jfx/build/build, where the build output files are stored in the further subdirectories sdk, jmods, and javadoc. Then it moves those directories to the $SNAPCRAFT_PART_INSTALL directory.
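The move step described above can be sketched as follows. The two temporary directories are hypothetical stand-ins for the part's build and install directories, and set -eu makes a missing build subdirectory fail loudly instead of silently.

```shell
# Hypothetical stand-ins for the part's build and install directories.
part_build=$(mktemp -d)
part_install=$(mktemp -d)
mkdir -p "$part_build/build/sdk" "$part_build/build/jmods" "$part_build/build/javadoc"

(
  set -eu
  cd "$part_build/build"              # aborts the subshell if the directory is missing
  mv sdk jmods javadoc "$part_install"/
)

moved=$(ls "$part_install" | sort | tr '\n' ' ')
echo "installed: $moved"
```

Adding an echo before and after the mv in the real scriptlet would show whether this step is reached at all on core20 or core22.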

But the problem remains that it never even gets the chance to move over the build output files when using the core20 or core22 bases. I’m not sure how to get more information out of snapcraft to debug this further myself, and it has been working fine for years with the core18 base.
