Small update. Per @niemeyer’s advice to try a simpler approach, I set up a xenial chroot, dropped in the libglvnd I built earlier, and copied over the nvidia drivers from the Arch packages. I was able to reproduce the segfault without much trouble. The backtrace:
(gdb) bt
#0 0x00007ffff7559fb4 in pthread_mutex_lock (mutex=0x7ffff71e5180 <dispatchLock>) at forward.c:192
#1 0x00007ffff6f5eddb in mt_mutex_lock (mutex=0x7ffff71e5180 <dispatchLock>) at glvnd_pthread.c:317
#2 0x00007ffff6f23f77 in LockDispatch () at GLdispatch.c:144
#3 0x00007ffff6f24115 in __glDispatchNewVendorID () at GLdispatch.c:198
#4 0x00007ffff7212607 in __glXLookupVendorByName (vendorName=0x618ad0 "nvidia") at libglxmapping.c:442
#5 0x00007ffff7213811 in __glXLookupVendorByScreen (dpy=0x60aab0, screen=0) at libglxmapping.c:574
#6 0x00007ffff7213966 in __glXGetDynDispatch (dpy=0x60aab0, screen=0) at libglxmapping.c:608
#7 0x00007ffff7209563 in glXChooseVisual (dpy=0x60aab0, screen=0, attrib_list=0x609200) at libglx.c:215
#8 0x00007ffff7b89d58 in glXChooseVisual (dpy=0x60aab0, screen=0, attribList=0x609200) at g_libglglxwrapper.c:183
#9 0x0000000000401741 in ?? ()
#10 0x00007ffff7465830 in __libc_start_main (main=0x401630, argc=1, argv=0x7fffffffe608, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe5f8) at ../csu/libc-start.c:291
#11 0x0000000000401ea9 in ?? ()
The upside is that at least I can install the usual debugging tools now and try to dig deeper.
Turns out nvidia ships a couple of libraries that may fiddle with TLS, or at least that’s what the name libnvidia-tls.so* suggests. There are two copies of these libraries (at least on Arch), one under /usr/lib and another under /usr/lib/tls.
The libraries under tls have different checksums than the ones one level up. Since the location should not matter to ld.so, and we prepend the whole /var/lib/snapd/lib/gl path, I had ignored those files. But copying over the /usr/lib/tls versions magically fixed the problem: no more segfaults, glxinfo works, and so does ohmygiraffe.
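For anyone who wants to check whether their distribution ships two different copies as well, here is a minimal sketch that compares same-named files in the two directories by SHA-256. The helper names (sha256_of, differing_libs) are made up for illustration, and the paths and glob pattern are just the ones mentioned above; other distributions may lay things out differently.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so large libraries are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_libs(base_dir, tls_dir, pattern="libnvidia-tls.so*"):
    """Return sorted names present in both directories whose contents differ."""
    base, tls = Path(base_dir), Path(tls_dir)
    differing = []
    for candidate in sorted(tls.glob(pattern)):
        twin = base / candidate.name
        # Skip names that exist on one side only.
        if not (candidate.is_file() and twin.is_file()):
            continue
        if sha256_of(candidate) != sha256_of(twin):
            differing.append(candidate.name)
    return differing


if __name__ == "__main__":
    # Assumed layout: the Arch paths from the post; adjust as needed.
    for name in differing_libs("/usr/lib", "/usr/lib/tls"):
        print(f"{name}: /usr/lib and /usr/lib/tls copies differ")
```

Any name it prints exists in both places with different contents, which is the situation described above. It says nothing about which copy ld.so actually picks at runtime.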
I believe I’m experiencing the same problem on openSUSE Tumbleweed: spotify fails to start because it cannot create the GL context.
[roman:~] % snap run spotify
/home/roman/Downloads was removed, reassigning DOWNLOAD to homedir
Gtk-Message: Failed to load module "canberra-gtk-module"
ATTENTION: default value of option force_s3tc_enable overridden by environment.
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
[1] 14195 trace trap (core dumped) snap run spotify
[roman:~] 133 % [0615/223908.654302:ERROR:gl_context_glx.cc(227)] Couldn't make context current with X drawable.
[0615/223908.654322:ERROR:gpu_info_collector.cc(62)] gl::GLContext::MakeCurrent() failed
I did snap refresh --beta core, but that didn’t help.
Finally found some time to install TW. I’m using the latest package available from this repo https://build.opensuse.org/package/show/home:zyga:branches:system:snappy/snapd which is 2.33.1-13.1 at the moment. I’m using the same version of the nvidia driver as you are. So far I have seen no issues: spotify (1.0.80.474.gef6b503e-7, rev 16), ohmygiraffe (1.1.0a, rev 3), and my GL debugging snap all work fine.