Core dump when built on incorrect libraries

I’m not sure it’s snap that dumps the core; normally you’d see a panic backtrace when such a thing happens. Can you add set -x to the getcert script to see exactly where it fails?

coredumpctl should be able to show you more about the coredumps.


That doesn’t look quite right …

The bracket in the dash script was a cut/paste error; the actual dash file is fine.

I’ve just rebuilt using snapcraft stable and I’m still getting the core dump.
I’ve added the -x into the dash file but I’m still getting zero output except the core dump message.

I have to say that it feels like the executable that snapcraft generates is causing the core dump.

I ran
snap try prime --devmode

Modified the getcert script so it just exits on line 2 and then ran:

and I’m still getting a core dump.

If I run coredumpctl it states:
No coredumps found.

So I’ve found the source of the segfault, but I don’t understand why it’s causing one.

snap try prime --devmode
then cd into the prime directory and run
this works fine.

getcert contents are:

set -x

If I however run

Then I get a seg fault.
Note: I changed the path to getcert in the wrapper so I could run it from the prime directory.

Changing the shell to bash gives essentially the same segfault.

I did a binary chop on the export statements in the command wrapper and found that removing the following export stopped the segfault from occurring:

export LD_LIBRARY_PATH="$SNAP/snap/core/current/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
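
One way to check for this kind of clash is to list which LD_LIBRARY_PATH entries carry their own libc.so.6, since a host shell such as /bin/sh started with that path set may pick up that libc instead of the host's own. The following is a minimal sketch only: libc_overrides is a hypothetical helper name, not part of snap or snapcraft, and it assumes the C library is named libc.so.6, as on glibc systems.

```shell
#!/bin/sh
# Hypothetical helper (not part of snap or snapcraft).
# Prints every entry of a colon-separated library search path that ships
# its own libc.so.6. An entry printed here means host binaries such as
# /bin/sh may load that libc instead of the host's own copy.
libc_overrides() {
    # $1: search path to inspect; defaults to the current LD_LIBRARY_PATH.
    old_ifs=$IFS
    IFS=:
    for dir in ${1-${LD_LIBRARY_PATH-}}; do
        # Skip empty entries, then report directories containing a libc.
        [ -n "$dir" ] && [ -e "$dir/libc.so.6" ] && echo "$dir"
    done
    IFS=$old_ifs
}

# Inspect the environment this script was started with.
libc_overrides
```

Running it with the wrapper's exports in effect prints each directory worth a closer look; no output means none of the entries carries a libc.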

So I installed coredumpctl and now get the following:

 coredumpctl gdb 16180
           PID: 16180 (getcert)
           UID: 1000 (bsutton)
           GID: 1000 (bsutton)
        Signal: 11 (SEGV)
     Timestamp: Tue 2018-02-20 09:23:23 AEDT (1min 36s ago)
  Command Line: /bin/sh /snap/pi-gation/x1/getcert
    Executable: /bin/dash
 Control Group: /user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service
          Unit: user@1000.service
     User Unit: gnome-terminal-server.service
         Slice: user-1000.slice
     Owner UID: 1000 (bsutton)
       Boot ID: b56b2531b9284efcb63881f54bcd0712
    Machine ID: 8c272f6f20b44745b3e85ab0af9503f9
      Hostname: slayer-3
       Storage: /var/lib/systemd/coredump/core.getcert.1000.b56b2531b9284efcb63881f54bcd0712.16180.1519079003000000.lz4
       Message: Process 16180 (getcert) of user 1000 dumped core.
                Stack trace of thread 16180:
                #0  0x00007f80a96a3bf8 n/a (/lib/x86_64-linux-gnu/
                #1  0x00007f80a94800db n/a (/lib/x86_64-linux-gnu/
                #2  0x00007f80a9495632 n/a (/lib/x86_64-linux-gnu/

However, even with the offending library removed from the wrapper, running getcert via snap still segfaults:

How are you building the snap? Are you using cleanbuild or building straight out on the host? If on the host, what is this host you are using?

If I do a cleanbuild it works fine - no segfault.

I’m running Ubuntu 17.10 using:

uname -a
Linux slayer-3 4.13.0-32-generic #35-Ubuntu SMP Thu Jan 25 09:13:46 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

I build via:
snapcraft prime
snap try prime --devmode

core - dump goes here.

I’ve just spun up a 16.04 host and performed a build on the host and the app works fine.

Strongly suggests a 17.10 problem.

Not really, given you can’t “just build” on 17.10 (this would get all the wrong dependencies).

Snap packages run on top of the core snap. The core snap is based on 16.04, so the libs and binaries you use in your snap need to be linked against the 16.04 set of binaries in the core snap.
If you do not build all your included dependencies from source, you have to use snapcraft cleanbuild on any non-16.04 system (which will automatically build in a 16.04 container). If you don’t, you get libs linked against the wrong OS environment …

If there is a bug, it is in the documentation not making that clear to you …
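
The rule described in this reply can be sketched as a small shell helper. This is an illustration under assumptions, not part of snapcraft: host_release and build_command are hypothetical names, and the script assumes /etc/os-release defines VERSION_ID, as it does on Ubuntu.

```shell
#!/bin/sh
# Hypothetical helpers (not part of snapcraft) that pick a build command
# based on the host release, following the advice in this thread.

host_release() {
    # $1: os-release file to read; defaults to /etc/os-release.
    file=${1:-/etc/os-release}
    [ -r "$file" ] || return 0
    # Source the file in a subshell so its variables do not leak out.
    ( . "$file" && printf '%s\n' "${VERSION_ID-}" )
}

build_command() {
    # $1: host release, e.g. "16.04" or "17.10".
    if [ "$1" = "16.04" ]; then
        # Host matches the core snap, so host libraries link correctly.
        echo "snapcraft prime"
    else
        # Any other host: build inside a clean 16.04 container.
        echo "snapcraft cleanbuild"
    fi
}

echo "Suggested build command: $(build_command "$(host_release)")"
```

On a host where the release cannot be determined, the sketch falls back to suggesting cleanbuild, which is the safer of the two choices.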

Then there is a huge hole in the doco, as my impression was that I could build on any OS that I can install snapcraft on.

This needs to be a whole lot clearer.

you can … but need to use cleanbuild (which you should do anyway for a ton of other reasons)

I tried cleanbuild, but the round-trip times were too long as you have to do a full build each time (as I understand it).

During development I want to do a snap try so I can test/experiment and then only build parts that have changed.

snap development is slow enough as it is (too many times you need to clean a part to get it to build again after a snapcraft.yaml change) without requiring a full build on each iteration.

Or am I missing some technique that makes using lxd faster?

yes, you do, like you do when building without cleanbuild to avoid tainted results … the point is that you can very easily mess up your build host and your snap when not using it …

doing development is probably fine to do without cleanbuild (if you fully trust all the involved build systems to not trash your host), but your last step before building an actual snap you want to give to others or use in production should always be a cleanbuild.

I understood the issues around host pollution and requiring a final cleanbuild.

The question was more about whether there is a way of using lxd while still being able to do incremental builds and live changes to prime.


There are a few points above that would be worth clarifying. I’ve put these in a topic about common questions for cross-distro building.

Also, below there are some more comments on specific points in the conversation above.

@ogra That’s a very unhelpful way to put it. People must necessarily be able to “just build” on Ubuntu 17.10 and anywhere else they want, Ubuntu or otherwise. Anything preventing that is a serious bug that must be fixed.

@ogra That’s an unfortunate side effect, and @bsutton is absolutely right in being displeased. Snapcraft has an amazing amount of caching precisely to make development comfortable. It makes no sense for the same project to both do that and then claim that’s not a good thing. It is a good thing, and we need to fix cleanbuild.

Speaking of caching:

@bsutton The caching behavior of snapcraft is somewhat unfriendly still. Since we discussed this two years ago, it has improved somewhat, but it’s still not as good as it needs to be, and that bug remains open.

When an application implements caching, it needs to be almost entirely transparent. Otherwise it trades the user’s boredom while redoing an operation for confusion and focused time spent working out why things are broken, which is a lot more expensive.

In practice, snapcraft needs to learn to invalidate its caching when things change, and redo the work automatically.

(thread cc @sergiusens @evan)

I never said this isn’t a bug :wink:

I did not say he is wrong to be displeased …

(after all, I’m just trying to help him get proper end results here …)

I know you are trying to help, and I’m trying to help you in doing so.

The comments were about the things you did say, and more specifically how it was said.

thanks for helping clear a few issues up.

option sounds like it might be the right way to go as it sounds like it delivers on the ‘implied’ snap promise of build anywhere, deploy anywhere.

I must say that as a newbie to snap it’s been a bit of a battle.
The documentation is rather sparse and spread out all over the place.

I should perhaps note (whilst trying not to upset anyone) that the core documentation needs to be edited by a native English speaker. There are a large number of grammatically obscure sentences.

If I can give the team one message.

Make the doco better.

The one saving grace is that the community has been really helpful with all my questions.

So thanks!

Thanks for the comments @bsutton, and thanks for your continued interest despite these unpolished details we still have.

About the documentation, we are already working on a major refactoring of the documentation:

Per details there, all of that new documentation can be edited here in the forum, so if you ever find these obscure sentences in one of the pages and feel inclined to spend a few seconds on it, you can click on the link at the bottom of the page and fix it straight away. Or even just comment in the respective topic if you’d rather have someone else doing the editing.