NVIDIA acceleration on Chrome and Firefox

Spent some time debugging this and I’m close to being stuck.

What I did (IOW diary of a madman):

(gdb) bt
#0  0x00007ffff7559fb4 in pthread_mutex_lock () from target:/lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff6f5eddb in mt_mutex_lock (mutex=0x7ffff71e5180 <dispatchLock>) at glvnd_pthread.c:317
#2  0x00007ffff6f23f77 in LockDispatch () at GLdispatch.c:144
#3  0x00007ffff6f24115 in __glDispatchNewVendorID () at GLdispatch.c:198
#4  0x00007ffff7212607 in __glXLookupVendorByName (vendorName=0x60d160 "nvidia") at libglxmapping.c:442
#5  0x00007ffff7213811 in __glXLookupVendorByScreen (dpy=0x60aab0, screen=0) at libglxmapping.c:574
#6  0x00007ffff7213966 in __glXGetDynDispatch (dpy=0x60aab0, screen=0) at libglxmapping.c:608
#7  0x00007ffff7209563 in glXChooseVisual (dpy=0x60aab0, screen=0, attrib_list=0x609200) at libglx.c:215
#8  0x00007ffff7b89d58 in glXChooseVisual (dpy=0x60aab0, screen=0, attribList=0x609200) at g_libglglxwrapper.c:183
#9  0x0000000000401741 in ?? ()
#10 0x00007ffff7465830 in __libc_start_main () from target:/lib/x86_64-linux-gnu/libc.so.6
#11 0x0000000000401ea9 in ?? ()
  • reordered __glDispatchNewVendorID to be called right after memcpy
  • got stack smashing and SIGABRT higher up the stack; brilliant idea: rebuild with the stack protector and see if the problem shows up earlier
  • rebuilt libglvnd with CFLAGS='-O0 -ggdb -fstack-protector -fstack-protector-all'
  • dump
  • got stack smashing again
  • gdb again, ended up with this:
B+ │0x7ffff72117bb <__glXLookupVendorByName+19>     mov    %fs:0x28,%rax                                              │
   │0x7ffff72117c4 <__glXLookupVendorByName+28>     mov    %rax,-0x18(%rbp)                                           │
                                                canary set ^^^
  >│0x7ffff72117c8 <__glXLookupVendorByName+32>     xor    %eax,%eax

(gdb) print/x $rax
$2 = 0xe8e2f5458058dc00
(gdb) print/x *(uint64_t*)($rbp -0x18)
$4 = 0xe8e2f5458058dc00
(gdb) print/x *(uint64_t*)(0x7fffffffde28)
$5 = 0xe8e2f5458058dc00
(gdb) print/x $rax
$7 = 0xe8e2f5458058dc00  <--- %fs:0x28
# try to catch return
(gdb) b libglxmapping.c:509
Breakpoint 4 at 0x7ffff72135fc: file libglxmapping.c, line 509.
(gdb) c
Continuing.

Thread 1 "glxinfo" hit Breakpoint 4, __glXLookupVendorByName (vendorName=0x60d160 "nvidia") at libglxmapping.c:509

  >│0x7ffff721365e <__glXLookupVendorByName+7862>   mov    -0x18(%rbp),%rbx                                           │
   │0x7ffff7213662 <__glXLookupVendorByName+7866>   xor    %fs:0x28,%rbx                                              │
   │0x7ffff721366b <__glXLookupVendorByName+7875>   je     0x7ffff7213672 <__glXLookupVendorByName+7882>              │
                                       canary check ^^^
   │0x7ffff721366d <__glXLookupVendorByName+7877>   callq  0x7ffff7208cf0 <__stack_chk_fail@plt>

(gdb) print/x *(uint64_t*)(0x7fffffffde28)  <-- canary unchanged?
$8 = 0xe8e2f5458058dc00
(gdb) stepi
(gdb) print/x $rbx
$9 = 0xe8e2f5458058dc00
(gdb) stepi
(gdb) print/x $rbx
$10 = 0xe8e28aba75a88cc0 <-- after xor %fs:0x28,%rbx; so %fs:0x28 must now hold 0x7ffff5f050c0?

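From what I understand, %fs is the base for TLS on x86-64, and the reference canary is read from %fs:0x28. Since the copy on the stack is unchanged but the check still fails, either that TLS slot was overwritten or the %fs base itself changed between function entry and return. The problem may then go deeper into libc and pthread.
Would appreciate any advice on what to try next.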
