Mir snap not detecting new input devices

Hi folks,
The bug: the mir-kiosk snap fails to detect new input devices. Devices that are already plugged in when the snap starts are detected; only devices added afterwards are not.

I can reproduce on Ubuntu Core and Classic. I’m using Classic for ease of debugging.

Steps to repro:

  1. sudo snap install mir-kiosk --edge --devmode
    Mir will auto-start. Note: Mir will try to use VT1, which may collide with GDM. The workaround is to edit /snap/mir-kiosk/current/bin/run-miral and specify a different VT.

  2. Plug in a new USB mouse and wiggle it. Mir’s mouse cursor should move, but it doesn’t.

Facts I’ve learned
A. After digging into Mir/libinput/udev/evdev, I’ve learned that udev is detecting the new device and libinput is getting evdev to open the new input device node (/dev/input/event*), but that open fails. That’s the reason Mir fails to use the new mouse.

To see this, run sudo gdb -p $(pidof miral-kiosk) after Mir has started up, and break on the open syscall:

 > catch syscall open
 > c

If you plug in a device, this will catch on trying to open the input device node for the newly added USB mouse. Here’s a cleaned up backtrace with symbols: http://pastebin.ubuntu.com/25918597/, with the path it is attempting to open printed on frame 4. Enter finish to have that method complete, and p $rax to print the return value. p errno to get the error code:

(gdb) fin
Run till exit from #0  0x00007f564fe9802d in open64 () at ../sysdeps/unix/syscall-template.S:84

Thread 4 "Mir/Input Reade" hit Catchpoint 1 (returned from syscall open), 0x00007f564fe9802d in open64 () at    ../sysdeps/unix/syscall-template.S:84
84	in ../sysdeps/unix/syscall-template.S
(gdb) p $rax
$1 = -1         // the FD returned is invalid
(gdb) p errno
$2 = 2          // ENOENT, aka "No such file or directory."

I don’t understand why, as I can see that file just fine from the host:

$ ls -l /dev/input/event5
crw-rw---- 1 root input 13, 69 Nov  8 16:07 /dev/input/event5

B. To compare with a working Mir, this command will work:

sudo LD_LIBRARY_PATH=/snap/mir-kiosk/current/usr/lib/x86_64-linux-gnu:/snap/mir-libs/current/mirlibs0/usr/lib/x86_64-linux-gnu/:/snap/mir-kiosk/current/usr/lib/x86_64-linux-gnu/mesa-egl MIR_SERVER_PLATFORM_PATH=/snap/mir-kiosk/current/usr/lib/x86_64-linux-gnu/mir/server-platform/ /snap/mir-kiosk/current/usr/bin/miral-kiosk

Doing the same test as above, you should see that the open syscall succeeds and the new mouse is recognised by Mir. (Of course the two runs are not an identical comparison, as this one isn’t using the core libs from the core snap, but the locally available ones.)

C. Neither dmesg nor sudo /snap/bin/snappy-debug.security scanlog is giving me anything, but as I’m using devmode, I didn’t expect much help there.

D. The mir interface does allow access to /dev/input/event[0-9]

E. Using strace, I do not see Mir doing the open syscall at all. Yet GDB does see it - presumably it intercepts at the glibc level, but still, what??

So I’m once again confused why Mir is unable to open the device node for the newly added USB device. I’ve tried adding the hardware-observer plug to no effect.

Suggestions/ideas would be appreciated.

@zyga-snapd asked for the output of /sys/fs/cgroup/devices/snap.mir-kiosk.mir-kiosk/devices.list

I do not see it changing when I plug/unplug my USB mouse, if that matters.

I didn’t know cgroups could confine devices like this; this is a new avenue of investigation. Thanks!

The major/minor of the USB mouse is 13:69. I checked that with

udevadm info -rq name /sys/dev/char/13:69

which returns the correct thing: /dev/input/event5. So it appears the device whitelist isn’t being updated.
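
The check described above (is a given major:minor present in the snap's devices.list?) can be scripted. This is only a minimal sketch, assuming the cgroup v1 devices.list format of one "TYPE MAJOR:MINOR ACCESS" entry per line, where "*" stands for any number and type "a" covers all devices. The function name cgroup_lists_device is hypothetical and not part of snapd.

```shell
# cgroup_lists_device LIST_FILE TYPE MAJOR MINOR
# Exits 0 if LIST_FILE has an entry covering the device, 1 otherwise.
# Assumption: cgroup v1 devices.list lines look like "c 13:69 rwm",
# with "*" allowed for MAJOR or MINOR and "a" as a catch-all type.
cgroup_lists_device() {
    list_file=$1 want_type=$2 want_major=$3 want_minor=$4
    while read -r type majmin access; do
        major=${majmin%%:*}   # text before the colon
        minor=${majmin##*:}   # text after the colon
        [ "$type" = "$want_type" ] || [ "$type" = "a" ] || continue
        [ "$major" = "$want_major" ] || [ "$major" = "*" ] || continue
        [ "$minor" = "$want_minor" ] || [ "$minor" = "*" ] || continue
        return 0
    done < "$list_file"
    return 1
}
```

For the mouse above, one would call it as cgroup_lists_device /sys/fs/cgroup/devices/snap.mir-kiosk.mir-kiosk/devices.list c 13 69 before and after plugging in the device.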

Here is the udevadm log on insertion and removal of USB mouse:

Thanks for these details. @zyga-snapd is looking into this right now.

I took this over from @zyga-snapd.

I can confirm this issue by hot-plugging a virtio mouse into a VM:

# hotplug a device
$ virsh qemu-monitor-command snappy-16-amd64 '{"execute":"device_add","arguments":{"driver":"virtio-mouse-pci","id":"input1","bus":"pci.0","addr":"0xa"}}'

# hotunplug a device
$ virsh qemu-monitor-command snappy-16-amd64 '{"execute":"device_del","arguments":{"id":"input1"}}'

Adding the device shows it comes up in udevadm monitor:

$ udevadm monitor
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent

KERNEL[3067.716884] add      /devices/pci0000:00/0000:00:0a.0 (pci)
UDEV  [3067.717008] add      /devices/pci0000:00/0000:00:0a.0 (pci)
KERNEL[3067.736452] add      /devices/pci0000:00/0000:00:0a.0/virtio4 (virtio)
UDEV  [3067.737617] add      /devices/pci0000:00/0000:00:0a.0/virtio4 (virtio)
KERNEL[3067.739057] add      /devices/pci0000:00/0000:00:0a.0/virtio4/input/input20 (input)
UDEV  [3067.741710] add      /devices/pci0000:00/0000:00:0a.0/virtio4/input/input20 (input)
KERNEL[3067.742489] add      /devices/pci0000:00/0000:00:0a.0/virtio4/input/input20/mouse2 (input)
UDEV  [3067.743512] add      /devices/pci0000:00/0000:00:0a.0/virtio4/input/input20/mouse2 (input)
KERNEL[3067.743596] add      /devices/pci0000:00/0000:00:0a.0/virtio4/input/input20/event4 (input)
UDEV  [3067.745421] add      /devices/pci0000:00/0000:00:0a.0/virtio4/input/input20/event4 (input)

The two devices are correctly udev tagged:

$ udevadm info /sys/devices/pci0000:00/0000:00:0a.0/virtio4/input/input20/mouse2
P: /devices/pci0000:00/0000:00:0a.0/virtio4/input/input20/mouse2
N: input/mouse2
S: input/by-path/virtio-pci-0000:00:0a.0-mouse
E: DEVLINKS=/dev/input/by-path/virtio-pci-0000:00:0a.0-mouse
E: DEVNAME=/dev/input/mouse2
E: DEVPATH=/devices/pci0000:00/0000:00:0a.0/virtio4/input/input20/mouse2
E: ID_PATH=virtio-pci-0000:00:0a.0
E: ID_PATH_TAG=virtio-pci-0000_00_0a_0
E: ID_SERIAL=noserial
E: TAGS=:snap_mir-kiosk_mir-kiosk:
E: net.ifnames=0

$ udevadm info /sys/devices/pci0000:00/0000:00:0a.0/virtio4/input/input20/event4
P: /devices/pci0000:00/0000:00:0a.0/virtio4/input/input20/event4
N: input/event4
S: input/by-path/virtio-pci-0000:00:0a.0-event-mouse
E: DEVLINKS=/dev/input/by-path/virtio-pci-0000:00:0a.0-event-mouse
E: DEVNAME=/dev/input/event4
E: DEVPATH=/devices/pci0000:00/0000:00:0a.0/virtio4/input/input20/event4
E: ID_PATH=virtio-pci-0000:00:0a.0
E: ID_PATH_TAG=virtio-pci-0000_00_0a_0
E: ID_SERIAL=noserial
E: TAGS=:snap_mir-kiosk_mir-kiosk:
E: net.ifnames=0

but the devices are not added to the device cgroup:

$ grep 'c 13:34 rwm' /sys/fs/cgroup/devices/snap.mir-kiosk.mir-kiosk/devices.list 
$ grep 'c 13:68 rwm' /sys/fs/cgroup/devices/snap.mir-kiosk.mir-kiosk/devices.list

restarting the snap makes them show up:

$ grep 'c 13:34 rwm' /sys/fs/cgroup/devices/snap.mir-kiosk.mir-kiosk/devices.list 
c 13:34 rwm
$ grep 'c 13:68 rwm' /sys/fs/cgroup/devices/snap.mir-kiosk.mir-kiosk/devices.list 
c 13:68 rwm
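
For reference, the 13:34 and 13:68 pairs grepped for above follow from the input subsystem's static numbering under major 13: mouseN nodes start at minor 32 and eventN nodes at minor 64. A small sketch of that arithmetic, assuming only the legacy static range (higher-numbered event nodes get dynamically allocated minors and are not covered); the function name input_minor is made up for illustration.

```shell
# input_minor NODE_NAME
# Prints the expected minor (under major 13) for a statically numbered
# input node such as "event4" or "mouse2".
# Assumption: static layout only, mouse0..mouse30 -> 32+N, event0..event31 -> 64+N.
input_minor() {
    case $1 in
        mouse[0-9]|mouse[1-3][0-9]) base=32 max=30 index=${1#mouse} ;;
        event[0-9]|event[1-3][0-9]) base=64 max=31 index=${1#event} ;;
        *) echo "unsupported node: $1" >&2; return 1 ;;
    esac
    if [ "$index" -gt "$max" ]; then
        echo "outside the static range: $1" >&2
        return 1
    fi
    echo $((base + index))
}
```

So input_minor mouse2 and input_minor event4 give the two minors used in the grep commands, and input_minor event5 gives the minor of the USB mouse from the original report.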

The problem seems to be with /lib/udev/snappy-app-dev and/or /lib/udev/rules.d/80-snappy-assign.rules. I’m continuing to look into this.

I’ve discovered that this rule isn’t working correctly:

$ cat /lib/udev/rules.d/80-snappy-assign.rules 
# add/remove snap package access to assigned devices
TAG=="snap_*", RUN+="/lib/udev/snappy-app-dev $env{ACTION} $env{TAG} $devpath $major:$minor"

The TAG=="snap_*" match doesn’t seem to fire, and if I change it to TAG=="snap_mir-kiosk_mir-kiosk" then $env{TAG} is empty.

I can say that if I add this rule:

TAG=="snap_mir-kiosk_mir-kiosk", RUN+="/lib/udev/snappy-app-dev $env{ACTION} snap_mir-kiosk_mir-kiosk $devpath $major:$minor"

then it works. This is something that we could fix in snapd by removing the /lib/udev/rules.d/80-snappy-assign.rules file altogether and adjusting the udev backend to add the above. I’m not sure this is the best approach, so will investigate before sending up a PR.
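
To make the idea concrete: the backend would emit one exact-match rule per snap tag, with the tag written literally into the RUN command instead of going through $env{TAG}. The following is only an illustrative sketch of generating such a line; snap_udev_rule is a hypothetical name, and this is not how snapd's udev backend is actually implemented.

```shell
# snap_udev_rule TAG
# Prints a udev rule that matches TAG exactly and passes it literally
# to the helper, so nothing depends on globbing or on $env{TAG}.
# Assumption: tags look like "snap_<snap>_<app>" and use only [A-Za-z0-9_-].
snap_udev_rule() {
    case $1 in
        *[!A-Za-z0-9_-]*) echo "invalid characters in tag: $1" >&2; return 1 ;;
        snap_?*) ;;
        *) echo "not a snap tag: $1" >&2; return 1 ;;
    esac
    # Single quotes keep $env{ACTION}, $devpath, $major and $minor literal
    # for udev to expand; only the two %s slots are filled in here.
    printf 'TAG=="%s", RUN+="/lib/udev/snappy-app-dev $env{ACTION} %s $devpath $major:$minor"\n' "$1" "$1"
}
```

Calling snap_udev_rule snap_mir-kiosk_mir-kiosk prints the same rule as the hand-written one above.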

@jdstrand FWIW, if the change in the backend would make all similar cases work, it certainly sounds fine. It’s actually preferable to have this in the backend, where most of the logic for this already sits, than in shell scripts lost through the filesystem.

The shell script would still be needed to update the cgroup; what we would drop is the generic udev rule. It’s long been desired to get rid of the shell script though, so perhaps as part of this work I could rewrite it in C. I don’t think this is 2.29 material anyway.

Mostly I want to understand why this broke, to verify that it will work on 14.04 through 17.10/18.04, and to do some performance measurements. I suspect this will be easier on udev, since globs with queries can slow things down as I understand it; that said, a glob that’s fast but broken doesn’t do us much good :)

It doesn’t seem much of an advantage in this specific case to have a tiny helper in C instead of shell. It would be an advantage if we could kill the helper altogether in favor of integration with the backend.

In either case, thanks for looking into this. Great to have your experienced eyes here.

Now that I think of it, I think it could be a tiny Go program. We aren’t doing anything fancy here, just processing some input and writing to a file.

The advantage of moving away from the shell is that shells are quite sensitive to environment variables, quoting, etc., and snappy-app-dev is called by both udev (controlled input) and snap-confine (uncontrolled). While not strictly required since we’ve reviewed the invocations carefully (this is why we haven’t done this with priority), converting to a binary would be a hardening measure. We can also have nicer debugging, etc.
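
As a rough picture of what "processing some input and writing to a file" involves, here is a sketch of the validation such a helper needs, whatever language it ends up in. It only prints the file and entry it would write instead of touching the cgroup. Everything here is an assumption made for illustration: the name cgroup_update_plan, the tag-to-cgroup mapping (underscores to dots, as in snap.mir-kiosk.mir-kiosk), the add/change versus remove handling, and treating every device as a character device. It is not the real snappy-app-dev.

```shell
# cgroup_update_plan ACTION TAG MAJOR:MINOR
# Prints "FILE ENTRY" for the cgroup v1 devices controller, or fails on
# any argument that does not match the strict patterns below.
# Assumptions: char devices only; cgroup name is TAG with "_" -> ".".
cgroup_update_plan() {
    action=$1 tag=$2 majmin=$3
    case $action in
        add|change) file=devices.allow ;;
        remove)     file=devices.deny ;;
        *) echo "unexpected action: $action" >&2; return 1 ;;
    esac
    case $tag in
        *[!A-Za-z0-9_-]*) echo "invalid tag: $tag" >&2; return 1 ;;
        snap_?*) ;;
        *) echo "not a snap tag: $tag" >&2; return 1 ;;
    esac
    case $majmin in
        *[!0-9:]*|*:*:*|:*|*:|"") echo "bad major:minor: $majmin" >&2; return 1 ;;
        *:*) ;;
        *) echo "bad major:minor: $majmin" >&2; return 1 ;;
    esac
    cgroup=$(printf '%s' "$tag" | tr '_' '.')
    printf '/sys/fs/cgroup/devices/%s/%s c %s rwm\n' "$cgroup" "$file" "$majmin"
}
```

Rejecting anything outside a narrow character set up front is the kind of check that is easy to get subtly wrong with shell quoting, which is the argument for a compiled helper.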

any progress on this?

It’s assigned to me. It got preempted by some other work/holiday but is very high in the queue.

any progress on this?

It’s on the roadmap for 2.30.

@greyback and @alan_g - I’ve submitted https://github.com/snapcore/snapd/pull/4374 for this and local VM testing with mir-kiosk shows the device being added to/removed from the device cgroup on hotplug/hotunplug. It would be great if someone from the mir team could also test.

I wasn’t able to locate the offending commit, but did verify that systemd 204 (trusty, though vivid should operate the same way since we designed this on vivid) could use TAG=="snap_foo_*", RUN+="/usr/bin/logger test" and fire off logger whenever something tagged matching the glob came up or down, but on systemd 229 (xenial), it doesn’t work. The technique I proposed was tested to work on 204, 229 and 234.

Done, worked for me! Thank you
