GPU Support Proposal


#1

Hello

I’m working towards a demo/prototype that will give us more information about how to proceed with the OpenGL/OpenCL/Vulkan/VDPAU/CUDA runtimes, so that snap applications can benefit from drivers released after the snap itself was built.

Currently snapd contains a bit of code providing special support for the NVIDIA user space libraries supplied by the host. The system is fragile and complex, and it only works with NVIDIA. We’d like to expand that into something more generic while making it simpler to support at the same time.

I’ve been conducting some research over the past few days and I’d like to share my straw man plans.

GPU Support Proposal

  • snapd team makes two prototype snaps, snapd-nvidia-418 and snapd-core18-mesa, for the amd64 architecture
    • later on: additional NVIDIA driver versions
    • later on: support for i386 architecture
    • later on: support for core16
    • later on: support for Wayland
    • later on: support for Radeon Pro drivers via snapd-core18-amdpro
  • prototypes provide gpu-support slot with the following directories:
    • opengl-runtime - this ships libGL.so and dependencies
    • opencl-runtime - TBD
    • vdpau-runtime - TBD
    • vulkan-runtime - TBD
    • cuda-runtime (nvidia-specific) - TBD
    • libraries common across this set should be placed in a dedicated directory
  • snapd probes presence of loaded nvidia kernel driver
    • no need to handle PCI IDs or ask the store for anything
    • the host system provides the kernel driver and udev rules
    • snapd installs either snapd-nvidia-418 or snapd-core18-mesa
  • cooperating application snaps provide gpu-support plug
    • snaps using classic confinement are not supported yet
    • when disconnected, bundled libraries are used, like today
    • when disconnected, NVIDIA from the host works like today
    • when connected, startup logic orders the LD_LIBRARY_PATH entries from the gpu-support connection ahead of $SNAP_LIBRARY_PATH and the internal bundled libraries
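
The probe and path-ordering steps above could be sketched roughly as follows. This is illustrative Python only (snapd itself is written in Go), not existing snapd code: the function names, the choice of /proc/modules as the probe, and the example paths are all assumptions made for the sketch.

```python
def nvidia_driver_loaded(proc_modules="/proc/modules"):
    """Return True if a kernel module named exactly 'nvidia' is loaded.

    Assumption: reading /proc/modules is enough; no PCI IDs are consulted.
    """
    try:
        with open(proc_modules) as f:
            # The first whitespace-separated field of each line is the module name.
            return any(line.split()[0] == "nvidia" for line in f if line.strip())
    except OSError:
        return False


def pick_provider(nvidia_loaded):
    """Choose which prototype snap would provide the gpu-support slot."""
    return "snapd-nvidia-418" if nvidia_loaded else "snapd-core18-mesa"


def build_library_path(gpu_dirs, snap_library_path, bundled_dirs, connected):
    """Order library directories as described in the proposal.

    connected=True : gpu-support dirs, then $SNAP_LIBRARY_PATH, then bundled libs
    connected=False: $SNAP_LIBRARY_PATH, then bundled libs (today's behaviour)
    """
    parts = []
    if connected:
        parts.extend(gpu_dirs)
    parts.extend(p for p in snap_library_path.split(":") if p)
    parts.extend(bundled_dirs)

    # Drop duplicates while keeping the first (highest priority) occurrence.
    seen = set()
    ordered = []
    for p in parts:
        if p not in seen:
            seen.add(p)
            ordered.append(p)
    return ":".join(ordered)
```

For example, calling build_library_path(["/gpu/opengl-runtime"], "/var/lib/snapd/lib/gl", ["/snap/app/x1/usr/lib"], connected=True) puts the hypothetical /gpu/opengl-runtime directory first, so the dynamic linker would find the newer libGL.so before the bundled copy.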

Supporting research:


  • Hello World CUDA Analysis
  • Nvidia CUDA on Ubuntu Core


#2

I understand it’s just a demo, but could we find a way that would not require declaring a new plug in snap.yaml?


#3

Perhaps, yes, but I wanted to focus on something that is explicitly opt-in before we venture further.

We could define the plug automatically from snapd on all the snaps that have one of the other plugs (e.g. OpenGL). We could then use the new wrapper replacement (forgive me, the name of the feature eludes me now), where snapd would inject the GPU-aware LD_LIBRARY_PATH changes automatically.

The downside of that approach is that it happens without the snap developer’s cooperation and without a way for users to opt out. Perhaps that’s okay, but I wanted to avoid it for now.
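
To make the idea concrete, here is a minimal sketch of what "define the plug automatically" might look like, operating on an already-parsed snap.yaml held as a plain dictionary. It is illustrative Python only: the gpu-support interface name comes from the proposal, while the helper names and the choice of opengl as the only trigger are assumptions.

```python
import copy

GPU_PLUG = "gpu-support"          # interface name taken from the proposal
TRIGGER_INTERFACES = {"opengl"}   # assumption: only opengl triggers the implicit plug


def plug_interface(name, spec):
    """Resolve the interface of a snap.yaml plug entry.

    A plug may be written as `name:` (interface equals the name),
    `name: interface` or `name: {interface: ...}`.
    """
    if isinstance(spec, dict):
        return spec.get("interface", name)
    if isinstance(spec, str):
        return spec
    return name


def with_implicit_gpu_plug(snap_meta):
    """Return a copy of snap_meta with a gpu-support plug added when warranted."""
    meta = copy.deepcopy(snap_meta)
    plugs = meta.get("plugs") or {}
    interfaces = {plug_interface(n, s) for n, s in plugs.items()}

    # Plugs listed only under an app are implicit plugs named after the interface.
    for app in (meta.get("apps") or {}).values():
        for name in app.get("plugs") or []:
            if name not in plugs:
                interfaces.add(name)

    if interfaces & TRIGGER_INTERFACES and GPU_PLUG not in interfaces:
        plugs[GPU_PLUG] = None
        meta["plugs"] = plugs
    return meta
```

The input is never modified, which mirrors the point that the snap developer's own snap.yaml stays untouched and the extra plug exists only in snapd's view of the snap.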


#4

Perhaps i386 is really popular, but I personally think seeing this supported on the Jetson Nano (which is arm64) would be more useful and interesting than i386. Just my 2 cents.


#5

The Nvidia binary driver distribution does not support ARM AFAIK. Perhaps there’s a separate bundle (and I’d love to get the Jetson Nano but it is not available yet) but this needs to be investigated with an actual device in hand.


#6

Actually, I’m not sure this is useful: with the Nano you’re really just using CUDA, not the Nvidia driver itself (think headless IoT systems), so I don’t know how valuable it is to make the Nvidia driver available to snaps running on the Nano. The one example I can think of where the driver itself would be useful is digital signage, where you use the GPU on the Nano to drive a 4K display or something similar.


#7

The main motivation for this is to give application snaps high longevity by allowing them to work on hardware made later than the application itself. A common example would be injecting an updated Mesa into the execution path of an old game so that it can run on future Intel GPUs.


#8

+1 here. Note that the CUDA kernel drivers for the Jetson devices are fully open source and included as part of the kernels provided by Nvidia, although unsurprisingly the userspace libraries are closed. With the latest changes I added to the OpenGL interface, CUDA applications can run on it; however, this bug interferes when using UC18:

https://bugs.launchpad.net/snapd/+bug/1821023

The Jetson Nano is already available (and so cheap!) so it would probably be worth getting one 🙂. We already have a Core image for it too.

I would also like to mention that the Nano is most probably going to become the RPi of AI/ML…


#9

I will get the Nano as soon as it is available in Poland and use it to ensure that CUDA works great out of the box with snapd.

EDIT: Ordered now. I should have it in two weeks, so after Lyon.


#10

On the GL side, I just want to mention that we’re bundling the Mesa userspace libraries with both “application” snaps and “server” snaps (both confined and classic).

If you need an example on which to base experiments, I suggest the mir-test-tools snap: https://github.com/MirServer/mir-test-tools. It bundles a variety of clients with a test server.