EGL-using snaps on impish seem to be broken when using the Nvidia proprietary driver

laney · July 27, 2021, 5:00pm

I just noticed this on my impish host:

laney@raleigh> snap run flokk-contacts
/snap/flokk-contacts/11/flokk-contacts: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /var/lib/snapd/lib/gl/libEGL.so.1)

and the snap doesn’t run.

The problem seems to be that libEGL.so.1 is bind mounted into the snap’s namespace under /var/lib/snapd/lib/gl/, but not its dependencies. If you see:

laney@raleigh> apt show libegl1 | grep Depends
Depends: libc6 (>= 2.33), libglvnd0 (= 1.3.3-1), libegl-mesa0

It’s clear that we’ve violated this by bind mounting libEGL.so.1 alone.

Any ideas?

(BTW I’m using the proprietary Nvidia 470 drivers in case that matters.)

Saviq · July 27, 2021, 5:07pm

It does matter. Only Nvidia drivers are exposed this way into the snap namespace. The userspace for open source drivers comes along with your snap.

laney · July 27, 2021, 5:08pm

I’ll edit the subject to add that caveat, thanks!

ijohnson · July 27, 2021, 10:15pm

For the snap in question can you do this:

snap run --shell flokk-contacts -c 'ls -lah /var/lib/snapd/lib/gl'

This sounds like a new dependency, since I don’t see the dynamic library dependencies for libglvnd0 on Hirsute:

$ snap run --shell flokk-contacts -c 'ldd /var/lib/snapd/lib/gl/libEGL.so.1'
	linux-vdso.so.1 (0x00007ffe13171000)
	/snap/flokk-contacts/11/lib/bindtextdomain.so (0x00007f62224e2000)
	libGLdispatch.so.0 => /var/lib/snapd/lib/gl/libGLdispatch.so.0 (0x00007f622283e000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f62222de000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6221eed000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f62226e5000)

If indeed there are new dependencies from the host we need to be bind mounting as well, we can certainly add those to the list, it would be good to know what the full list is

laney · July 28, 2021, 8:10am

Ya, here you go

total 4.0K
drwxrwxrwt 3 root laney 1.5K Jul 27 16:34 .
drwxr-xr-x 6 root root  4.0K Sep 26  2018 ..
lrwxrwxrwx 1 root laney   12 Jul 27 16:34 libcuda.so -> libcuda.so.1
lrwxrwxrwx 1 root laney   20 Jul 27 16:34 libcuda.so.1 -> libcuda.so.470.42.01
lrwxrwxrwx 1 root laney   67 Jul 27 16:34 libcuda.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libcuda.so.470.42.01
lrwxrwxrwx 1 root laney   26 Jul 27 16:34 libEGL_nvidia.so.0 -> libEGL_nvidia.so.470.42.01
lrwxrwxrwx 1 root laney   73 Jul 27 16:34 libEGL_nvidia.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.470.42.01
lrwxrwxrwx 1 root laney   11 Jul 27 16:34 libEGL.so -> libEGL.so.1
lrwxrwxrwx 1 root laney   15 Jul 27 16:34 libEGL.so.1 -> libEGL.so.1.1.0
lrwxrwxrwx 1 root laney   62 Jul 27 16:34 libEGL.so.1.1.0 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libEGL.so.1.1.0
lrwxrwxrwx 1 root laney   18 Jul 27 16:34 libGLdispatch.so -> libGLdispatch.so.0
lrwxrwxrwx 1 root laney   22 Jul 27 16:34 libGLdispatch.so.0 -> libGLdispatch.so.0.0.0
lrwxrwxrwx 1 root laney   69 Jul 27 16:34 libGLdispatch.so.0.0.0 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libGLdispatch.so.0.0.0
lrwxrwxrwx 1 root laney   32 Jul 27 16:34 libGLESv1_CM_nvidia.so.1 -> libGLESv1_CM_nvidia.so.470.42.01
lrwxrwxrwx 1 root laney   79 Jul 27 16:34 libGLESv1_CM_nvidia.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.470.42.01
lrwxrwxrwx 1 root laney   17 Jul 27 16:34 libGLESv1_CM.so -> libGLESv1_CM.so.1
lrwxrwxrwx 1 root laney   21 Jul 27 16:34 libGLESv1_CM.so.1 -> libGLESv1_CM.so.1.2.0
lrwxrwxrwx 1 root laney   68 Jul 27 16:34 libGLESv1_CM.so.1.2.0 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libGLESv1_CM.so.1.2.0
lrwxrwxrwx 1 root laney   29 Jul 27 16:34 libGLESv2_nvidia.so.2 -> libGLESv2_nvidia.so.470.42.01
lrwxrwxrwx 1 root laney   76 Jul 27 16:34 libGLESv2_nvidia.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.470.42.01
lrwxrwxrwx 1 root laney   14 Jul 27 16:34 libGLESv2.so -> libGLESv2.so.2
lrwxrwxrwx 1 root laney   18 Jul 27 16:34 libGLESv2.so.2 -> libGLESv2.so.2.1.0
lrwxrwxrwx 1 root laney   65 Jul 27 16:34 libGLESv2.so.2.1.0 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libGLESv2.so.2.1.0
lrwxrwxrwx 1 root laney   10 Jul 27 16:34 libGL.so -> libGL.so.1
lrwxrwxrwx 1 root laney   14 Jul 27 16:34 libGL.so.1 -> libGL.so.1.7.0
lrwxrwxrwx 1 root laney   61 Jul 27 16:34 libGL.so.1.7.0 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libGL.so.1.7.0
lrwxrwxrwx 1 root laney   15 Jul 27 16:34 libGLU.so -> libGLU.so.1.3.1
lrwxrwxrwx 1 root laney   15 Jul 27 16:34 libGLU.so.1 -> libGLU.so.1.3.1
lrwxrwxrwx 1 root laney   62 Jul 27 16:34 libGLU.so.1.3.1 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libGLU.so.1.3.1
lrwxrwxrwx 1 root laney   16 Jul 27 16:34 libGLX_indirect.so.0 -> libGLX_mesa.so.0
lrwxrwxrwx 1 root laney   26 Jul 27 16:34 libGLX_nvidia.so.0 -> libGLX_nvidia.so.470.42.01
lrwxrwxrwx 1 root laney   73 Jul 27 16:34 libGLX_nvidia.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.470.42.01
lrwxrwxrwx 1 root laney   11 Jul 27 16:34 libGLX.so -> libGLX.so.0
lrwxrwxrwx 1 root laney   15 Jul 27 16:34 libGLX.so.0 -> libGLX.so.0.0.0
lrwxrwxrwx 1 root laney   62 Jul 27 16:34 libGLX.so.0.0.0 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libGLX.so.0.0.0
lrwxrwxrwx 1 root laney   15 Jul 27 16:34 libnvcuvid.so -> libnvcuvid.so.1
lrwxrwxrwx 1 root laney   23 Jul 27 16:34 libnvcuvid.so.1 -> libnvcuvid.so.470.42.01
lrwxrwxrwx 1 root laney   70 Jul 27 16:34 libnvcuvid.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvcuvid.so.470.42.01
lrwxrwxrwx 1 root laney   18 Jul 27 16:34 libnvidia-cfg.so -> libnvidia-cfg.so.1
lrwxrwxrwx 1 root laney   26 Jul 27 16:34 libnvidia-cfg.so.1 -> libnvidia-cfg.so.470.42.01
lrwxrwxrwx 1 root laney   73 Jul 27 16:34 libnvidia-cfg.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.470.42.01
lrwxrwxrwx 1 root laney   78 Jul 27 16:34 libnvidia-compiler.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.470.42.01
lrwxrwxrwx 1 root laney   77 Jul 27 16:34 libnvidia-eglcore.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.470.42.01
lrwxrwxrwx 1 root laney   21 Jul 27 16:34 libnvidia-encode.so -> libnvidia-encode.so.1
lrwxrwxrwx 1 root laney   29 Jul 27 16:34 libnvidia-encode.so.1 -> libnvidia-encode.so.470.42.01
lrwxrwxrwx 1 root laney   76 Jul 27 16:34 libnvidia-encode.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.470.42.01
lrwxrwxrwx 1 root laney   18 Jul 27 16:34 libnvidia-fbc.so -> libnvidia-fbc.so.1
lrwxrwxrwx 1 root laney   26 Jul 27 16:34 libnvidia-fbc.so.1 -> libnvidia-fbc.so.470.42.01
lrwxrwxrwx 1 root laney   73 Jul 27 16:34 libnvidia-fbc.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.470.42.01
lrwxrwxrwx 1 root laney   76 Jul 27 16:34 libnvidia-glcore.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.470.42.01
lrwxrwxrwx 1 root laney   74 Jul 27 16:34 libnvidia-glsi.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.470.42.01
lrwxrwxrwx 1 root laney   79 Jul 27 16:34 libnvidia-glvkspirv.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.470.42.01
lrwxrwxrwx 1 root laney   18 Jul 27 16:34 libnvidia-ifr.so -> libnvidia-ifr.so.1
lrwxrwxrwx 1 root laney   26 Jul 27 16:34 libnvidia-ifr.so.1 -> libnvidia-ifr.so.470.42.01
lrwxrwxrwx 1 root laney   73 Jul 27 16:34 libnvidia-ifr.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.470.42.01
lrwxrwxrwx 1 root laney   17 Jul 27 16:34 libnvidia-ml.so -> libnvidia-ml.so.1
lrwxrwxrwx 1 root laney   25 Jul 27 16:34 libnvidia-ml.so.1 -> libnvidia-ml.so.470.42.01
lrwxrwxrwx 1 root laney   72 Jul 27 16:34 libnvidia-ml.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.470.42.01
lrwxrwxrwx 1 root laney   29 Jul 27 16:34 libnvidia-opencl.so.1 -> libnvidia-opencl.so.470.42.01
lrwxrwxrwx 1 root laney   76 Jul 27 16:34 libnvidia-opencl.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.470.42.01
lrwxrwxrwx 1 root laney   26 Jul 27 16:34 libnvidia-opticalflow.so -> libnvidia-opticalflow.so.1
lrwxrwxrwx 1 root laney   34 Jul 27 16:34 libnvidia-opticalflow.so.1 -> libnvidia-opticalflow.so.470.42.01
lrwxrwxrwx 1 root laney   81 Jul 27 16:34 libnvidia-opticalflow.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.470.42.01
lrwxrwxrwx 1 root laney   29 Jul 27 16:34 libnvidia-ptxjitcompiler.so -> libnvidia-ptxjitcompiler.so.1
lrwxrwxrwx 1 root laney   37 Jul 27 16:34 libnvidia-ptxjitcompiler.so.1 -> libnvidia-ptxjitcompiler.so.470.42.01
lrwxrwxrwx 1 root laney   84 Jul 27 16:34 libnvidia-ptxjitcompiler.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.470.42.01
lrwxrwxrwx 1 root laney   76 Jul 27 16:34 libnvidia-rtcore.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.470.42.01
lrwxrwxrwx 1 root laney   73 Jul 27 16:34 libnvidia-tls.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvidia-tls.so.470.42.01
lrwxrwxrwx 1 root laney   23 Jul 27 16:34 libnvoptix.so.1 -> libnvoptix.so.470.42.01
lrwxrwxrwx 1 root laney   70 Jul 27 16:34 libnvoptix.so.470.42.01 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libnvoptix.so.470.42.01
lrwxrwxrwx 1 root laney   14 Jul 27 16:34 libOpenGL.so -> libOpenGL.so.0
lrwxrwxrwx 1 root laney   18 Jul 27 16:34 libOpenGL.so.0 -> libOpenGL.so.0.0.0
lrwxrwxrwx 1 root laney   65 Jul 27 16:34 libOpenGL.so.0.0.0 -> /var/lib/snapd/hostfs/usr/lib/x86_64-linux-gnu/libOpenGL.so.0.0.0
drwxr-xr-x 2 root root   100 Jul 27 16:34 vdpau

/var/lib/snapd/lib/gl/libEGL.so.1: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /var/lib/snapd/lib/gl/libEGL.so.1)
	linux-vdso.so.1 (0x00007ffd26127000)
	/snap/flokk-contacts/11/lib/bindtextdomain.so (0x00007f8ed18b7000)
	libGLdispatch.so.0 => /var/lib/snapd/lib/gl/libGLdispatch.so.0 (0x00007f8ed1c17000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f8ed16b3000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8ed12c2000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f8ed1aba000)

it’s the libc itself which we’d need to bring over for this particular case. Is that safe?

ijohnson · July 28, 2021, 2:09pm

Ah yes sorry I had missed that detail. For libc6 itself, no we definitely do not want to bring in libc6 from the host into the mount namespace.

Thinking back about our solution to this problem, I’m actually a bit surprised we didn’t run into this problem sooner, since I’m not sure that we can continue to bind mount drivers/graphics libraries from the host into the namespace if said drivers/libraries will have dependencies which cannot be satisfied by the base snap or the snap itself. I will discuss with the team, but this is quite unfortunate…

ijohnson · July 28, 2021, 4:24pm

@laney can you check if the nvidia libs from your host that are being bind mounted into the host also have this dependency on the newer libc6? In a dev impish container it looks like libEGL.so only has a symbol dependency on fstat64 from the newer libc6:

root@794c36ffdc15:/# objdump -T /usr/lib/x86_64-linux-gnu/libEGL.so.1.1.0 | grep GLIBC_2.33
0000000000000000      DF *UND*	0000000000000000  GLIBC_2.33  fstat64
root@794c36ffdc15:/# objdump -T /usr/lib/x86_64-linux-gnu/libEGL.so.1.1.0 | grep -Po 'GLIBC_\K[0-9.]+' | sort --version-sort | uniq
2.2.5
2.3
2.3.4
2.4
2.7
2.14
2.33
root@794c36ffdc15:/# objdump -T /usr/lib/x86_64-linux-gnu/libEGL.so.1.1.0 | grep GLIBC_2.33
0000000000000000      DF *UND*	0000000000000000  GLIBC_2.33  fstat64

whereas the version I have on Hirsute has only 2.14 as the most recent libc6 dependency:

$ objdump -T /usr/lib/x86_64-linux-gnu/libEGL.so.1.1.0 | grep -Po 'GLIBC_\K[0-9.]+' | sort --version-sort | uniq
2.2.5
2.3
2.3.4
2.4
2.7
2.14

laney · July 28, 2021, 5:03pm

Hi, yes @ijohnson, I can confirm exactly the same output as you. I’d guess that it picks up this dependency when build against >= 2.33. I rebuilt the impish version on focal and it got Depends: libc6 (>= 2.14) instead of 2.33 there, which seems to back that up.

I guess the general problem is that we’re bind mounting but not necessarily bringing over the deps (in an ldd sense) of the things we are bind mounting. It feels like this approach is kind of dodgy in the situation we’re in. We could maybe mitigate it by always building the stuff for the earliest supported series and then binary-copying upwards. But perhaps a re-evaluation is required and something where we bring in a copy of the same drivers from the store, built against the right toolchain, would be more robust?

ijohnson · July 28, 2021, 6:56pm

Yes I agree this approach is dodgy indeed in light of this problem, I wasn’t originally involved with the decision to do it this way, but I think the reason it was deemed okay to do this was that there was not any new dependencies introduced like this and so “it just worked”. That of course is not a justification alone for doing it, but it probably made more sense at the time than just not supporting it

ijohnson · July 28, 2021, 8:32pm

So I also had a look at the nvidia libraries for at driver version 460 and 470 (via installing libnvidia-gl-460 and then uninstalling that one and installing libnvidia-gl-470-server in my impish container), and all those libraries from NVIDIA that we repackage as debian packages (IIRC this is how it works) files seem to be okay, none of them have new libc6 dependencies. So it’s just the things we actually build in the archive, which suggests the sky may not be falling right at this minute.

460:

root@407cd95c6adc:/# for f in $(ls /usr/lib/x86_64-linux-gnu/lib*nvidia*); do echo "$f:"; objdump -T $f | grep -Po 'GLIBC_\K[0-9.]+' | sort --version-sort | uniq; done
/usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.0:
2.2.5
2.3.2
2.3.3
2.4
/usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.460.80:
2.2.5
2.3.2
2.3.3
2.4
/usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.1:
2.2.5
/usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.460.80:
2.2.5
/usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.2:
2.2.5
/usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.460.80:
2.2.5
/usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0:
2.2.5
/usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.460.80:
2.2.5
/usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.460.80:
2.2.5
/usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.460.80:
2.2.5
2.3
2.3.3
2.7
2.9
2.10
/usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.460.80:
2.2.5
2.3
2.3.3
2.7
2.9
2.10
/usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.460.80:
2.2.5
2.3
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.460.80:
2.2.5
2.3
2.3.2
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.1:
2.2.5
2.3
2.3.4
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.460.80:
2.2.5
2.3
2.3.4
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.460.80:
2.2.5
2.3
2.3.2
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-tls.so.460.80:
2.2.5

470:

for f in $(ls /usr/lib/x86_64-linux-gnu/lib*nvidia*); do echo "$f:"; objdump -T $f | grep -Po 'GLIBC_\K[0-9.]+' | sort --version-sort | uniq; done
/usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.0:
2.2.5
2.3.2
2.3.3
2.4
/usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.470.57.02:
2.2.5
2.3.2
2.3.3
2.4
/usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.1:
2.2.5
/usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.470.57.02:
2.2.5
/usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.2:
2.2.5
/usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.470.57.02:
2.2.5
/usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0:
2.2.5
/usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.470.57.02:
2.2.5
/usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.470.57.02:
2.2.5
2.3
2.3.2
2.3.3
2.4
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.470.57.02:
2.2.5
2.3
2.3.3
2.7
2.9
2.10
/usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.470.57.02:
2.2.5
2.3
2.3.3
2.7
2.9
2.10
/usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.470.57.02:
2.2.5
2.3
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.470.57.02:
2.2.5
2.3
2.3.2
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.1:
2.2.5
2.3
2.3.4
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.470.57.02:
2.2.5
2.3
2.3.4
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.470.57.02:
2.2.5
2.3
2.3.2
2.7
/usr/lib/x86_64-linux-gnu/libnvidia-tls.so.470.57.02:
2.2.5

Saviq · July 29, 2021, 11:08am

FWIW we worked out something that allowed us to separate the graphics userspace into a separate snap:

It is a basis for resolving the issue at hand, but Nvidia has one more complicating bit, and that is it’s bound to the kernel module. So a nvidia-core20 snap would need to carry all the supported versions of the userspace and mount the appropriate one, or for there to be tracks (¯\_(ツ)_/¯) for all the supported versions. Any case, Snapd has to learn how to deal with that.

jamesh · August 5, 2021, 2:27am

This was @zyga’s thoughts about this kind of problem from 2 years back:

I don’t think any of those ideas actually got implemented though.

For the Nvidia case in particular, we might be able to get by in the short term making the last compatible user space available to snaps built on top of 16.04 libraries. But the Nvidia drivers have historically had a close coupling between the kernel and user space portions: there’s no guarantee how long that would continue to work as the host system moves ever forward.

ppd · October 13, 2021, 1:22pm

Is there any work going on in this area? That’s a very pressing problem, isn’t it? Snaps with older bases cannot satisfy this dependency and just refuse to launch.

See e.g. How to set LD_LIBRARY_PATH properly?

tamas · October 19, 2021, 1:02pm

Having gained a better understanding after reading the above threads, I do concur: isn’t this a Very Big Problem? Sounds like this could prevent us from shipping an EGL-using snap entirely.

tamas · October 20, 2021, 11:23am

As a desperate measure, I added the graphics-core20 interface and bundled the EGL libs from the system and it seems to work. Probably by accident, but hopefully that helps someone.

Wimpress · October 20, 2021, 12:51pm

What is this? - EDIT Ignore this question. I see the reference above

My workaround is to coerce LD_LIBRARY_PATH in a wrapper that is last in the command-chain. The wrapper pushes ${SNAP}/usr/lib/${SNAP_LAUNCHER_ARCH_TRIPLET} to the front of LD_LIBRARY_PATH.

The OBS snap now starts on my NVIDIA systems but I am not sure how thin is the ice upon which I skate

tamas · October 20, 2021, 1:27pm

Yeah I think that ultimately has the same effect. I also have the same feeling, this feels like a potential trouble source.

Wimpress · October 20, 2021, 2:01pm

I’d like to thank @alan_g and other members of the Mir team for working on graphics-core20

I’ve added graphics-core20 to a local branch of the OBS Studio snap, the diff is included below as it might be useful to others.

One thing to note, is I had to expose the environment variable coercion in an existing wrapper I use to launch OBS, as they didn’t take affect when added to environment: stanzas in the snapcraft.yaml. The wrapper, is also the last script in the command-chain. Here’s the diff:

diff --git a/snap/local/obs-wrapper b/snap/local/obs-wrapper
index 06ceb43..9936fc5 100755
--- a/snap/local/obs-wrapper
+++ b/snap/local/obs-wrapper
@@ -33,5 +33,11 @@ if [[ ${@} == *"usr/bin/obs"* ]]; then
   fi
 fi
 
+# Support for graphics-core20
+export LD_LIBRARY_PATH="${SNAP}/graphics/lib:${LD_LIBRARY_PATH}"
+export LIBGL_DRIVERS_PATH="${SNAP}/graphics/dri"
+export LIBVA_DRIVERS_PATH="${SNAP}/graphics/dri"
+export __EGL_VENDOR_LIBRARY_DIRS="${SNAP}/graphics/glvnd/egl_vendor.d"
+
 unset SESSION_MANAGER
 exec "${@}"
diff --git a/snap/snapcraft.yaml b/snap/snapcraft.yaml
index d017bac..10df27b 100644
--- a/snap/snapcraft.yaml
+++ b/snap/snapcraft.yaml
@@ -9,6 +9,12 @@ architectures:
 compression: lzo
 
 plugs:
+  # Support for graphics-core20
+  # https://discourse.ubuntu.com/t/the-graphics-core20-snap-interface/23000
+  graphics-core20:
+    interface: content
+    target: $SNAP/graphics
+    default-provider: mesa-core20
   # Support for common GTK themes
   # https://forum.snapcraft.io/t/how-to-use-the-system-gtk-theme-via-the-gtk-common-themes-snap/6235
   gtk-3-themes:
@@ -43,8 +49,12 @@ layout:
     symlink: $SNAP/usr/lib/$SNAPCRAFT_ARCH_TRIPLET/libvulkan_radeon.so
   /usr/share/alsa:
     symlink: $SNAP/usr/share/alsa
-  /usr/share/libdrm/amdgpu.ids:
-    symlink: $SNAP/usr/share/libdrm/amdgpu.ids
+  # Used by mesa-core20 for app specific workarounds
+  /usr/share/drirc.d:
+    bind: $SNAP/graphics/drirc.d
+    # Needed by mesa-core20 on AMD GPUs
+  /usr/share/libdrm:
+    bind: $SNAP/graphics/libdrm
   /usr/share/obs:
     symlink: $SNAP/usr/share/obs
   /usr/share/X11:
@@ -1159,6 +1169,8 @@ parts:
 
   cleanup:
     plugin: nil
+    build-snaps:
+      - mesa-core20
     after:
       - aom
       - cef
@@ -1180,11 +1192,14 @@ parts:
         usr/share/GConf \
         usr/share/apport \
         usr/share/bug \
+        usr/share/drirc.d \
         usr/share/fonts \
+        usr/share/glvnd \
         usr/share/icons/Adwaita \
         usr/share/icons/Humanity* \
         usr/share/icons/LoginIcons \
         usr/share/icons/ubuntu-mono-* \
+        usr/share/libdrm \
         usr/share/lintian \
         usr/share/man \
         usr/share/pkgconfig; do
@@ -1195,3 +1210,8 @@ parts:
       rm -rf ${SNAPCRAFT_PRIME}/usr/share/doc/*/examples || true
       rm ${SNAPCRAFT_PRIME}/usr/share/doc/*/README* 2>/dev/null || true
       find ${SNAPCRAFT_PRIME}/usr -type d -empty -delete || true
+
+      # graphics-core20 cleanup
+      cd /snap/mesa-core20/current/egl/lib
+      find . -type f,l -exec rm -f $SNAPCRAFT_PRIME/usr/lib/${SNAPCRAFT_ARCH_TRIPLET}/{} \;
+      rm -fr "$SNAPCRAFT_PRIME/usr/lib/${SNAPCRAFT_ARCH_TRIPLET}/dri"

I was also able to get the OBS Studio snap running simply pushing ${SNAP}/usr/lib/${SNAP_LAUNCHER_ARCH_TRIPLET} to the front of LD_LIBRARY_PATH. Here’s the diff.

diff --git a/snap/local/obs-wrapper b/snap/local/obs-wrapper
index 06ceb43..da2e38d 100755
--- a/snap/local/obs-wrapper
+++ b/snap/local/obs-wrapper
@@ -33,5 +33,7 @@ if [[ ${@} == *"usr/bin/obs"* ]]; then
   fi
 fi
 
+export LD_LIBRARY_PATH="${SNAP}/usr/lib/${SNAP_LAUNCHER_ARCH_TRIPLET}:${LD_LIBRARY_PATH}"
+
 unset SESSION_MANAGER
 exec "${@}"

My question for @alan_g @ijohnson and @jamesh is which of the above approaches is the most robust?

alan_g · October 20, 2021, 2:59pm

I’m not sure what the graphics-core20 interface has to do with the Nvidia/Impish problems. Admittedly, libEGL.so is involved in both but that’s no different to e.g. having it included in the snap. (Which it may well be already.)

I assume some other script in the command-chain or snapd is making the change. AIIUI snapd prepends the host Nvidia driver path so that binaries from there are found first. But unfortunately, with a core20 based snap these are incompatible with the base core20 libc.

Wimpress:

+  # Support for graphics-core20
+  # https://discourse.ubuntu.com/t/the-graphics-core20-snap-interface/23000
+  graphics-core20:
+    interface: content
+    target: $SNAP/graphics
+    default-provider: mesa-core20

Adding Mesa drivers might work for some cases, but I doubt that, for example, hardware decoding of video will be working.

None of this sounds particularly robust, but if it works for you, then great!

I’ve no deep knowledge of when and how snapd injects host GL binaries into the environment, but wouldn’t stripping them from LD_LIBRARY_PATH/LIBGL_DRIVERS_PATH/LIBVA_DRIVERS_PATH/__EGL_VENDOR_LIBRARY_DIRS in your wrapper script be simpler and as effective?

ijohnson · October 20, 2021, 3:25pm

I don’t have strong opinions here except to say that snapd shouldn’t do anything with LD_LIBRARY_PATH for apps at runtime except that we clean the value from the host before executing snap-confine, so i.e. doing LD_LIBRARY_PATH=foo snap run foobar, foobar when executed will not see the LD_LIBRARY_PATH value the same way it will for something like DISPLAY etc

$ LD_LIBRARY_PATH=foo snap run --shell hello-world -c 'echo $LD_LIBRARY_PATH'

$ LD_LIBRARY_PATH2=foo snap run --shell hello-world -c 'echo $LD_LIBRARY_PATH2'
foo