Classic snaps failing on Ubuntu 17.10

I’ve been attempting to install some classic snaps on a fully updated 17.10 system today.

I got this error with atom:

relocation error: /snap/atom/x1/lib/x86_64-linux-gnu/libpthread.so.0: symbol __libc_dl_error_tsd, version GLIBC_PRIVATE not defined in file libc.so.6 with link time reference

Teleconsole fails with the following:

terminated by signal SIGSEGV (Address boundary error)

Can anyone else confirm this issue with classic snaps?

snapcraft has a blacklist of packages to never include in the build … while glibc is listed on this blacklist, seemingly the libpthread.so.0 file still ends up in the snap.

(this seems to particularly affect snaps that have been built natively under artful instead of using “cleanbuild”, which you should never do, at least until base snaps for all releases exist)

Confirmed that teleconsole is broken on 17.10, but works on 16.04.

Teleconsole was built in a cleanbuild, on xenial, not on artful, and hasn’t changed in 9 months or more.

The Atom snap was built using snapcraft cleanbuild.

do either of teleconsole or atom ship libpthread.so.0 inside their snap? (there is definitely a mix of libc binaries going on here)

teleconsole is a single go binary. No libraries.

Hmm, that suggests that the core snap’s libc wrangles with the host’s libc here then. This smells like a bug in snapd’s library path handling when using classic mode … and not like a bug in the snap itself at all …

The problem looks like the opposite: classic snaps are struggling with the glibc from the outer system. They should be using the glibc from the core instead.

@sergiusens @elopio Who’s the best person to look into this?

The answer is me. @popey, is the teleconsole binary built with snapcraft or just “packaged” using it?

Answering my own question:

$ readelf -a /snap/teleconsole/current/teleconsole
...
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
...
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
...

So neither a custom loader (program interpreter) nor an rpath is set.
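For anyone who wants to repeat this check on another snap, here is a minimal sketch that boils the readelf output down to the two things that matter here. It assumes a POSIX shell with awk; summarize_elf_loading is a hypothetical helper name, not something shipped by snapd, snapcraft or binutils.

```shell
# Hypothetical helper: read `readelf -l -d` output on stdin and report the
# program interpreter plus any RPATH/RUNPATH entries.
summarize_elf_loading() {
    awk '
        /Requesting program interpreter:/ {
            line = $0
            sub(/.*interpreter: */, "", line)   # drop everything before the path
            sub(/\].*/, "", line)               # drop the closing bracket
            interp = line
        }
        /\((RPATH|RUNPATH)\)/ { found = 1; print "rpath entry:" $0 }
        END {
            print "interpreter: " (interp == "" ? "none" : interp)
            if (!found) print "rpath: none"
        }
    '
}

# Usage (needs binutils installed):
#   readelf -l -d /snap/teleconsole/current/teleconsole | summarize_elf_loading
```

A binary meant to load everything from core would be expected to show an interpreter and rpath pointing under /snap/core rather than the host defaults.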

The snap has no added libraries:

$ find /snap/teleconsole/current/
/snap/teleconsole/current/
/snap/teleconsole/current/command-teleconsole.wrapper
/snap/teleconsole/current/meta
/snap/teleconsole/current/meta/gui
/snap/teleconsole/current/meta/snap.yaml
/snap/teleconsole/current/teleconsole

Not sure what we can do on the snapcraft side here, @niemeyer do you have any ideas how to proceed?

@sergiusens I don’t think I understand the issue still. How’s that not the known issue of a classic snap not linking against the libraries in the core snap? Shouldn’t the wrapper have pointed towards the right libraries?

So @popey, this seems to have been built using one of the first releases of snapcraft that supported confinement: classic; what gave it away is that it has a wrapper telling it to use libraries from core. Given that you do not depend on any additional libraries aside from libc6, you should be fine just rebuilding: it would start using libc6 from the host, which is technically incorrect, but it works.

This solves it for the case of teleconsole, here’s some fun with shell:

 sergiusens  ~  snap run --shell teleconsole
 sergiusens  ~  export LD_LIBRARY_PATH="/snap/core/current/lib:/snap/core/current/usr/lib:/snap/core/current/lib/x86_64-linux-gnu:/snap/core/current/usr/lib/x86_64-linux-gnu"

^C
ls
Segmentation fault
ls
Segmentation fault
ls
Segmentation fault
ls
Segmentation fault
^C
^C
exit
 sergiusens  ~  snap run --shell teleconsole
 sergiusens  ~  ls
apps     Documents  examples.desktop  Pictures  snap    Templates  Videos
Desktop  Downloads  Music             Public    source  venv
 sergiusens  ~  $SNAP/teleconsole
Starting local SSH server on localhost...
Requesting a disposable SSH proxy on teleconsole.com for sergiusens...
Checking status of the SSH tunnel...

Your Teleconsole ID: 6af2c431113685e079fd12151a67cb74fa7abda0
WebUI for this session: https://teleconsole.com/s/6af2c431113685e079fd12151a67cb74fa7abda0
To stop broadcasting, exit current shell by typing 'exit' or closing the window.
  lindon  sergiusens  ~  logout
Connection to localhost:41699 closed from the remote side
You have ended your session broadcast and the SSH tunnel is closed.

It seems that “exporting” the library paths is what is causing issues (maybe related to some ld and libc issue I would need to look into deeper). But after having the library paths exported, everything sort of stops working.

For the case of teleconsole at least, it seems to have been built with the first release of snapcraft supporting confinement: classic. We discovered that exporting these variables was not a good choice and removed it quickly in the following release, making it a developer’s conscious choice (if built from source, the rpaths would be doing this job of library searching). LP: #1657504 was the trigger of the whole conversation that led to that change.

@sergiusens Wait… can we please clarify what the problem is, and not recommend solutions which are technically incorrect? It should not use the libraries from the system, otherwise it will break elsewhere.

Looking deeper so we can tell what the real problem is would be appreciated. :slight_smile:

Well, the only way to not have this problem is to build teleconsole from source which it hasn’t been.

@zyga-snapd can provide the list of things to do on the runtime side if rebuilding is not an option.

@sergiusens Once more, can we please not jump to conclusions? Moments ago you had no clear idea of what’s going on, and now you’re closing down on a single solution which involves rebuilding. If you know exactly what’s going on, can you please clarify? Otherwise, let’s please calm down and go after more details.

teleconsole in the store is indeed just a dump of an upstream tarball (yaml at http://bazaar.launchpad.net/~popey/+junk/teleconsole/view/head:/snap/snapcraft.yaml for reference).

I can re-do the yaml building from source, for sure.

Sorry about that, I was treating this a bit like chat :slight_smile:

My hypothesis is that this is related to a mix of libcs being used. To explore that, this is what I’ve seen happen:

When running the command exposed by the teleconsole snap on an artful system, this LD_LIBRARY_PATH gets set:

LD_LIBRARY_PATH="/snap/core/current/usr/lib:/snap/core/current/lib/x86_64-linux-gnu:/snap/core/current/usr/lib/x86_64-linux-gnu"

In that LD_LIBRARY_PATH we have one specific path which includes libc-related libraries: /snap/core/current/lib/x86_64-linux-gnu. Simulating the behavior of snap run, this run of ldd fails:

LD_LIBRARY_PATH="/snap/core/current/lib:/snap/core/current/usr/lib:/snap/core/current/lib/x86_64-linux-gnu:/snap/core/current/usr/lib/x86_64-linux-gnu" ldd /snap/teleconsole/current/teleconsole 
Segmentation fault

And this one works:

LD_LIBRARY_PATH="/snap/core/current/lib:/snap/core/current/usr/lib:/snap/core/current/usr/lib/x86_64-linux-gnu" ldd /snap/teleconsole/current/teleconsole
	linux-vdso.so.1 =>  (0x00007ffeaa1f2000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f14a27b8000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f14a23d8000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f14a29d7000)
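To see which entries of a given LD_LIBRARY_PATH bring their own copy of a library, the directories can be checked one by one. This is only a sketch for a POSIX shell; dirs_providing is a hypothetical helper name.

```shell
# Hypothetical helper: print every directory in a colon-separated list
# that contains the named library file.
dirs_providing() {
    list=$1
    lib=$2
    old_ifs=$IFS
    IFS=:                      # split the list on colons
    for dir in $list; do
        if [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir"
        fi
    done
    IFS=$old_ifs
}

# Usage:
#   dirs_providing "$LD_LIBRARY_PATH" libc.so.6
```

Any directory it prints is a candidate for the libc mix described here, since the loader will pick libraries up from there before falling back to the host defaults.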

This was not a problem in the past for classic snaps given that from 16.04 until now (17.10) libc has not changed. Note that I mention 16.04 and not 14.04 because I am quite certain that, due to the bug I mentioned earlier, it currently fails to run on 14.04, given that there is a different libc6.

Now, in that last snippet, where ldd works, I excluded one entry: the one that has all the libc libraries. That is what gives me confidence in my response.
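The same experiment can be repeated without editing the path by hand. A minimal sketch, again assuming a POSIX shell; strip_path_entry is a hypothetical helper name.

```shell
# Hypothetical helper: print a colon-separated path list with one entry removed.
strip_path_entry() {
    list=$1
    drop=$2
    out=
    old_ifs=$IFS
    IFS=:                      # split the list on colons
    for entry in $list; do
        if [ "$entry" != "$drop" ]; then
            out=${out:+$out:}$entry   # re-join the entries that are kept
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}

# Usage:
#   core=/snap/core/current
#   paths="$core/lib:$core/usr/lib:$core/lib/x86_64-linux-gnu:$core/usr/lib/x86_64-linux-gnu"
#   LD_LIBRARY_PATH=$(strip_path_entry "$paths" "$core/lib/x86_64-linux-gnu") \
#       ldd /snap/teleconsole/current/teleconsole
```

Dropping one entry at a time and rerunning ldd makes it easy to confirm which directory triggers the crash.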

Now, when I said technically incorrect, I really meant it: we want to use only ld and libraries from either core or the snap when loading an ELF file, but today there is only one way to do that on the snapcraft side, and that is by compiling the binaries that are intended to be called.

Just running snapcraft again on the teleconsole project should get rid of the command wrapper, as described in the bug I mentioned, and would make it use the system’s libc6. For the specific case of teleconsole it would work, as it only links against libc6 and there would be no mix of binaries/libraries. Yes, it is technically incorrect, and yes, this has been my grudge with classic confinement; I always recommend building from source for classic confinement.

that is not all you did! you also added a new entry for /snap/core/current/lib which was not there previously.

Just rebuilt the snap without compiling as suggested and yes, that fixed it.