Snaps and OpenCL

I am having trouble with OpenCL apps.
The snap versions of darktable and LibreOffice (and probably others) do not find the OpenCL libraries, but the deb versions of the same applications work fine with OpenCL.

Is there something I should do to make them be able to use OpenCL?

I am using Ubuntu 18.10 with the latest Intel OpenCL Neo library installed.

Strict snaps typically have to ship their libs inside the snap package. To get OpenCL support, the libs would have to be added to the respective packages. I don’t think snaps can see your installed version of the latest Intel OpenCL Neo library.

All right, I understand.

But surely there should be an interface for this, shouldn’t there? I mean, OpenCL libraries are machine-dependent and more similar to device drivers than to any other library an application may depend on.

To me (maybe I am missing something obvious, though), asking individual apps to ship OpenCL libraries is akin to asking them to ship Mesa and Nvidia’s OpenGL libraries…

Thanks for reporting the problem. The situation is what @ogra described. The snap should ship with OpenCL libraries inside, especially if those are open source libraries. The situation is similar to Mesa in this case. From what I was told, the kernel interfaces used by Mesa are fairly stable.

The story with Nvidia is a tad different. Since it’s a binary blob, there’s no control over how the userspace driver interacts with the kernel counterpart, and we go through some hoops to make it work.

That being said, I think none of the interfaces currently allows OpenCL to work properly. I’ll look into this with my debugging snap, will try to adjust the code to make it work and report back.

On a side note: if the snap ends up shipping Mesa OpenCL ICD files, those will end up inside $SNAP. Unfortunately, the ICD loader is rather strict about expecting the files under /etc/OpenCL/vendors, so I would expect the snap to use snap layouts to work around that.


from:

  1. if $OCL_ICD_VENDORS is a directory path, then this path replaces the “/etc/OpenCL/vendors” path in the standard behavior: the loader will use the .icd files in this directory;

so a simple:

apps:
  foobar:
    command: barfoo
    environment:
      OCL_ICD_VENDORS: $SNAP/etc/OpenCL/vendors
      ...
...

should be sufficient …


Yeah, that’s one solution, though it only works with the non-reference OpenCL ICD loader library. The reference one does not support that. On the other hand, most distros seem to default to the non-reference one anyway.

Managed to identify and fix some issues. First things first, the PR allows the snap to access /etc/OpenCL/vendors and picks up nvidia’s OpenCL implementation libraries:
https://github.com/snapcore/snapd/pull/6160
It has already landed and should be available in edge. It will also be available in the next 2.36 patch release.

I’ve updated my debugging snap and added clinfo (OpenCL’s equivalent of glxinfo) and pushed the changes right here:


The changes are a bit hacky, but I tried to make both the implementations from the snap (mesa & beignet) and the host (nvidia in my case) available. We found some bugs with layouts in the process. Anyway, I ended up using OCL_ICD_VENDORS as @ogra suggested, but there’s an additional step that combines ICD files from /etc/OpenCL/vendors (which comes from the host) and $SNAP/etc/OpenCL/vendors into a single hierarchy. Unfortunately, I had to fall back to using the opengl interface hook to do this.
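
The combining step could look roughly like the sketch below. This is not the code from the debugging snap; merge_icd_dirs is a hypothetical helper, and the target directory under $SNAP_USER_DATA is an assumption chosen only because it is writable from inside a snap. Source directories should be absolute paths so the symlinks resolve.

```shell
#!/bin/sh
# merge_icd_dirs DEST SRC...: symlink every .icd file found in the SRC
# directories into DEST. On a file-name clash, the later SRC wins.
merge_icd_dirs() {
    dest="$1"
    shift
    mkdir -p "$dest"
    for src in "$@"; do
        for icd in "$src"/*.icd; do
            # Skip the unexpanded glob when a directory has no .icd files.
            [ -f "$icd" ] || continue
            ln -sf "$icd" "$dest/$(basename "$icd")"
        done
    done
}

# Hypothetical use from a snap launcher script (paths are assumptions):
#   merge_icd_dirs "$SNAP_USER_DATA/opencl-vendors" \
#       /etc/OpenCL/vendors "$SNAP/etc/OpenCL/vendors"
#   export OCL_ICD_VENDORS="$SNAP_USER_DATA/opencl-vendors"
```

As noted earlier in the thread, OCL_ICD_VENDORS is only honoured by the non-reference ICD loader, so this approach depends on which loader the snap ships.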

A note about ICD files: for OpenCL, those contain just the name (or a relative or an absolute path) of the library for a particular vendor. Nvidia and Mesa use plain names (e.g. libnvidia-opencl.so.1), but the Beignet library uses a hard-coded path. I added a workaround to overwrite it, but it still doesn’t work as expected, as the actual lib tries to dlopen() another one using a hard-coded path.
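
To see which library each ICD file points at, something like the following sketch can help. list_icd_libs is a hypothetical helper name; it assumes each .icd file holds the library name or path on its first line, as described above.

```shell
#!/bin/sh
# list_icd_libs [DIR]: print each .icd file in DIR together with the
# library name or path it contains. DIR defaults to /etc/OpenCL/vendors.
list_icd_libs() {
    dir="${1:-/etc/OpenCL/vendors}"
    for icd in "$dir"/*.icd; do
        # Skip the unexpanded glob when the directory has no .icd files.
        [ -f "$icd" ] || continue
        lib=$(head -n 1 "$icd")
        printf '%s -> %s\n' "$(basename "$icd")" "$lib"
    done
}
```

Running list_icd_libs with no argument inspects the host directory; passing a directory such as "$SNAP/etc/OpenCL/vendors" inspects the files shipped in a snap. An entry showing an absolute path is the kind that needs the overwrite workaround.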

The updated snap was pushed to the store, you should be able to snap install --edge graphics-debug-tools-bboozzoo


Thank you @ogra and @mborzecki for working on this issue!

I installed graphics-debug-tools-snap on my system and ran it. You’ll find the output at the end of this comment. (Should I also change to the edge channel in snapd?)

It seems to (try to?) load beignet, even though I don’t have it installed on my system. I replaced beignet with Intel’s Neo OpenCL driver (https://github.com/intel/compute-runtime). It gives an error on the first line though, so I don’t know if OpenCL is functional.

Please let me know if there is something I could try to help you iron out this issue.

$ graphics-debug-tools-bboozzoo.clinfo
unable to load /usr/lib/x86_64-linux-gnu/beignet//libgbeinterp.so which is part of the driver, please check!
Number of platforms 2
Platform Name Intel Gen OCL Driver
Platform Vendor Intel
Platform Version OpenCL 2.0 beignet 1.3
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing
Platform Extensions function suffix Intel

Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 18.0.5
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA

Platform Name Intel Gen OCL Driver
Number of devices 1
Device Name Intel® HD Graphics Kabylake ULT GT2
Device Vendor Intel
Device Vendor ID 0x8086
Device Version OpenCL 2.0 beignet 1.3
Driver Version 1.3
Device OpenCL C Version OpenCL C 2.0 beignet 1.3
Device Type GPU
Device Profile EMBEDDED_PROFILE
Device Available Yes
Compiler Available No
Linker Available Yes
Max compute units 24
Max clock frequency 1000MHz
Device Partition (core)
Max number of sub-devices 1
Supported partition types None, None, None
Max work item dimensions 3
Max work item sizes 512x512x512
Max work group size 512
Preferred / native vector sizes
char 16 / 8
short 8 / 8
int 4 / 4
long 2 / 2
half 0 / 8 (cl_khr_fp16)
float 4 / 4
double 0 / 2 (n/a)
Half-precision Floating-point support (cl_khr_fp16)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 32, Little-Endian
Global memory size 4039114752 (3.762GiB)
Error Correction support No
Max memory allocation 3029336064 (2.821GiB)
Unified memory for Host and Device Yes
Shared Virtual Memory (SVM) capabilities (core)
Coarse-grained buffer sharing Yes
Fine-grained buffer sharing No
Fine-grained system sharing No
Atomics No
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Preferred alignment for atomics
SVM 0 bytes
Global 0 bytes
Local 0 bytes
Max size for global variable 65536 (64KiB)
Preferred total size of global vars 65536 (64KiB)
Global Memory cache type Read/Write
Global Memory cache size 8192 (8KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 65536 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 4096 bytes
Pitch alignment for 2D image buffers 1 pixels
Max 2D image size 8192x8192 pixels
Max 3D image size 8192x8192x2048 pixels
Max number of read image args 128
Max number of write image args 8
Max number of read/write image args 8
Max number of pipe args 16
Max active pipe reservations 1
Max pipe packet size 1024
Local memory type Local
Local memory size 65536 (64KiB)
Max number of constant args 8
Max constant buffer size 134217728 (128MiB)
Max size of kernel argument 1024
Queue properties (on host)
Out-of-order execution No
Profiling Yes
Queue properties (on device)
Out-of-order execution Yes
Profiling Yes
Preferred size 16384 (16KiB)
Max size 262144 (256KiB)
Max queues on device 1
Max events on device 1024
Prefer user sync for interop Yes
Profiling timer resolution 80ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels Yes
SPIR versions 1.2
printf() buffer size 1048576 (1024KiB)
Built-in kernels __cl_copy_region_align4;__cl_copy_region_align16;__cl_cpy_region_unalign_same_offset;__cl_copy_region_unalign_dst_offset;__cl_copy_region_unalign_src_offset;__cl_copy_buffer_rect;__cl_copy_image_1d_to_1d;__cl_copy_image_2d_to_2d;__cl_copy_image_3d_to_2d;__cl_copy_image_2d_to_3d;__cl_copy_image_3d_to_3d;__cl_copy_image_2d_to_buffer;__cl_copy_image_3d_to_buffer;__cl_copy_buffer_to_image_2d;__cl_copy_buffer_to_image_3d;__cl_fill_region_unalign;__cl_fill_region_align2;__cl_fill_region_align4;__cl_fill_region_align8_2;__cl_fill_region_align8_4;__cl_fill_region_align8_8;__cl_fill_region_align8_16;__cl_fill_region_align128;__cl_fill_image_1d;__cl_fill_image_1d_array;__cl_fill_image_2d;__cl_fill_image_2d_array;__cl_fill_image_3d;
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing cl_khr_fp16

Platform Name Clover
Number of devices 0

NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, …) Intel Gen OCL Driver
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, …) Success [Intel]
clCreateContext(NULL, …) [default] Success [Intel]
clCreateContext(NULL, …) [other] <error: no devices in non-default plaforms>
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Intel Gen OCL Driver
Device Name Intel® HD Graphics Kabylake ULT GT2
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Intel Gen OCL Driver
Device Name Intel® HD Graphics Kabylake ULT GT2
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Intel Gen OCL Driver
Device Name Intel® HD Graphics Kabylake ULT GT2

ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.11
ICD loader Profile OpenCL 2.1

This is expected. I didn’t have time to try to rebuild it reasonably.

From the log, there are two platforms, one Mesa and the other based on Beignet; both come from the snap.

AFAICT OpenCL should be functional. The NULL platform behavior block tells you that the generic Intel OCL Driver (beignet) will be picked by default.

That being said, the snap has to deliver OpenCL libs itself. In this case, either darktable or LibreOffice snaps should be adjusted. For starters, it’s not even clear if the apps in those snaps are built with OpenCL support and have the necessary OpenCL ICD loader library installed.
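
A quick way to check the second point is to look for an ICD loader library inside the installed snap. The helper below is a hypothetical sketch; it only checks for a file named libOpenCL.so*, which says nothing about whether the application itself was built with OpenCL support.

```shell
#!/bin/sh
# has_icd_loader ROOT: succeed if any file matching libOpenCL.so* exists
# somewhere under ROOT.
has_icd_loader() {
    root="$1"
    [ -n "$(find "$root" -name 'libOpenCL.so*' 2>/dev/null | head -n 1)" ]
}

# Hypothetical use; /snap/darktable/current is the usual mount point of an
# installed snap named darktable, but the location may differ per distro:
#   has_icd_loader /snap/darktable/current && echo "ICD loader found"
```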

So, if I understand correctly, a snap will only use the OpenCL drivers it is shipped with.

In this case, is there no way to make snaps use Intel’s Neo driver (which is much better than beignet) without each app bundling it?

Do OpenGL apps work the same way? Can I only use the Mesa libs bundled with the snap, even if I update Mesa on my PC?

Sorry to bump this again, but I just installed darktable 3 via snap and it is another app where OpenCL is not supported when installed via snap. (To be clear, OpenCL works fine if I install darktable via .deb.)

OpenCL does make a noticeable difference in darktable so it is a big bummer, and a dealbreaker for me.

Edit: issue reported to the developer of the app: https://github.com/kyrofa/darktable-snap/issues/17, however it seems to be a limitation of snap itself.

I’d love to support it in darktable, but I have yet to see clear guidance as to what I should do to enable opencl for every GPU.


If I were to add an nVidia card into my box that would mean I’d have all three GFX vendors covered in the one system. Is there a way I can use such a beast to easily test out each vendor’s OpenCL support in snaps to come up with general guidance without having to keep shutting down and getting the screwdriver out?


I managed to get OpenCL working against the open source AMD graphics drivers (MESA) with the following additions to snapcraft.yaml:

layout:
  /usr/include/clc:
    bind: $SNAP/usr/include/clc
  /usr/lib/clc:
    bind: $SNAP/usr/lib/clc
  /usr/lib/$SNAPCRAFT_ARCH_TRIPLET/gallium-pipe:
    bind: $SNAP/usr/lib/$SNAPCRAFT_ARCH_TRIPLET/gallium-pipe

parts:
  opencl:
    plugin: nil
    stage-packages:
      - mesa-opencl-icd
      - ocl-icd-libopencl1

This is great. What version of OpenCL does it provide though? My understanding was that the OpenCL version provided by Mesa was far behind that which is provided by AMDGPU-PRO.

Hopefully a solution for up-to-date OpenCL can be found eventually, since this would be a major stalling point for snapping apps like DaVinci Resolve - which really is the ideal candidate for a Snap as it’s a monolithic proprietary application which currently offers very poor support on anything that isn’t an outdated version of CentOS.

It reports the following from clinfo:

clinfo output
Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 19.2.8
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD POLARIS11 (DRM 3.35.0, 5.4.0-21-generic, LLVM 9.0.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 19.2.8
  Driver Version                                  19.2.8
  Device OpenCL C Version                         OpenCL C 1.1 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Max compute units                               16
  Max clock frequency                             1326MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              64
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 8 / 8        (cl_khr_fp16)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              4294967296 (4GiB)
  Error Correction support                        No
  Max memory allocation                           3435973836 (3.2GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       32768 bits (4096 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max number of constant args                     16
  Max constant buffer size                        2147483647 (2GiB)
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_fp16

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Clover
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [MESA]
  clCreateContext(NULL, ...) [default]            Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD POLARIS11 (DRM 3.35.0, 5.4.0-21-generic, LLVM 9.0.0)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD POLARIS11 (DRM 3.35.0, 5.4.0-21-generic, LLVM 9.0.0)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD POLARIS11 (DRM 3.35.0, 5.4.0-21-generic, LLVM 9.0.0)

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1

So that’s OpenCL 1.1 by the looks…
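
For anyone comparing drivers, the version lines can be pulled out of clinfo output with a small filter. platform_versions is a hypothetical helper name; the sketch assumes the "Platform Version" label appears as in the output above.

```shell
#!/bin/sh
# platform_versions: read clinfo output on stdin and print the value of
# every "Platform Version" line, one per platform.
platform_versions() {
    awk '/^ *Platform Version/ { sub(/^ *Platform Version +/, ""); print }'
}

# Example, using the debugging snap mentioned earlier in the thread:
#   graphics-debug-tools-bboozzoo.clinfo | platform_versions
```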


Hi all, I’m trying to run my OpenCL snap app (installed using --devmode) with the proprietary AMDGPU-Pro driver.

  1. I expected it to work immediately (with --devmode, much like a traditional .deb package). But it doesn’t.
  2. I can make it work. Just running:
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/var/lib/snapd/hostfs/opt/amdgpu-pro/lib/x86_64-linux-gnu/:/var/lib/snapd/hostfs/opt/amdgpu/lib/x86_64-linux-gnu/
  3. So, I created inside snapcraft.yaml the workaround seen below (but this is not allowed):
layout:
  /opt/amdgpu-pro:
    bind: /var/lib/snapd/hostfs/opt/amdgpu-pro
  /opt/amdgpu:
    bind: /var/lib/snapd/hostfs/opt/amdgpu

Do you guys have any advice? I mean, a regular user can understand that to access OpenCL using proprietary drivers he/she needs to skip the sandboxing for a while. But even that is not working.
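
Until there is a proper answer, the manual export from step 2 could be wrapped in a launcher script, as in the sketch below. append_host_lib_dirs is a hypothetical helper; it relies on the host filesystem being visible under /var/lib/snapd/hostfs, which is what step 2 reports for --devmode, and it is not expected to work under strict confinement.

```shell
#!/bin/sh
# append_host_lib_dirs DIR...: append each DIR that exists to
# LD_LIBRARY_PATH, without leaving an empty entry when it starts out empty
# (an empty entry is treated as the current directory).
append_host_lib_dirs() {
    for d in "$@"; do
        [ -d "$d" ] || continue
        LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$d"
    done
    export LD_LIBRARY_PATH
}

# Hypothetical use with the paths from step 2:
#   append_host_lib_dirs \
#       /var/lib/snapd/hostfs/opt/amdgpu-pro/lib/x86_64-linux-gnu \
#       /var/lib/snapd/hostfs/opt/amdgpu/lib/x86_64-linux-gnu
```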

Just to be clear, HPC using OpenCL requires proprietary drivers. I saw you are working on NVIDIA support. AMD does not deserve much attention, but a workaround is a possibility.

Please, move to a new topic if that makes sense.