Request for Classic confinement: djlbench

I tried adding the personal-files plug, but I still run into this exception:

java.lang.UnsatisfiedLinkError: /home/ubuntu/.djl.ai/pytorch/1.8.1-cpu-linux-x86_64/0.12.0-cpu-libdjl_torch.so: /home/ubuntu/.djl.ai/pytorch/1.8.1-cpu-linux-x86_64/0.12.0-cpu-libdjl_torch.so: failed to map segment from shared object
	at java.lang.ClassLoader$NativeLibrary.load0(Native Method) ~[?:?]
	at java.lang.ClassLoader$NativeLibrary.load(ClassLoader.java:2442) ~[?:?]
	at java.lang.ClassLoader$NativeLibrary.loadLibrary(ClassLoader.java:2498) ~[?:?]
	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:2694) ~[?:?]
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:2627) ~[?:?]
	at java.lang.Runtime.load0(Runtime.java:768) ~[?:?]
	at java.lang.System.load(System.java:1837) ~[?:?]

Here is the output from snappy-debug:

= AppArmor =
Time: Jun 29 20:43:24
Log: apparmor="DENIED" operation="open" profile="snap.djlbench.djlbench" name="/proc/11648/mountinfo" pid=11648 comm="java" requested_mask="r" denied_mask="r" fsuid=1000 ouid=1000
File: /proc/11648/mountinfo (read)
Suggestions:
* adjust program to not access '@{PROC}/@{pid}/mountinfo'
* add 'mount-observe' to 'plugs'

= AppArmor =
Time: Jun 29 20:43:24
Log: apparmor="DENIED" operation="open" profile="snap.djlbench.djlbench" name="/proc/11648/coredump_filter" pid=11648 comm="java" requested_mask="wr" denied_mask="wr" fsuid=1000 ouid=1000
File: /proc/11648/coredump_filter (write)
Suggestion:
* adjust program to not access '@{PROC}/@{pid}/coredump_filter'

= AppArmor =
Time: Jun 29 20:43:25
Log: apparmor="DENIED" operation="open" profile="snap.djlbench.djlbench" name="/proc/11648/mountinfo" pid=11648 comm="java" requested_mask="r" denied_mask="r" fsuid=1000 ouid=1000
File: /proc/11648/mountinfo (read)
Suggestions:
* adjust program to not access '@{PROC}/@{pid}/mountinfo'
* add 'mount-observe' to 'plugs'

= AppArmor =
Time: Jun 29 20:43:25
Log: apparmor="DENIED" operation="file_mmap" profile="snap.djlbench.djlbench" name="/home/ubuntu/.cache/JNA/temp/jna15433788716430075344.tmp" pid=11648 comm="java" requested_mask="m" denied_mask="m" fsuid=1000 ouid=1000
File: /home/ubuntu/.cache/JNA/temp/jna15433788716430075344.tmp (mmap)
Suggestion:
* add 'personal-files (see https://forum.snapcraft.io/t/the-personal-files-interface for acceptance criteria)' to 'plugs'

= AppArmor =
Time: Jun 29 20:43:25
Log: apparmor="DENIED" operation="file_mmap" profile="snap.djlbench.djlbench" name="/home/ubuntu/.djl.ai/pytorch/1.8.1-cpu-linux-x86_64/0.12.0-cpu-libdjl_torch.so" pid=11648 comm="java" requested_mask="m" denied_mask="m" fsuid=1000 ouid=1000
File: /home/ubuntu/.djl.ai/pytorch/1.8.1-cpu-linux-x86_64/0.12.0-cpu-libdjl_torch.so (mmap)
Suggestion:
* add 'personal-files (see https://forum.snapcraft.io/t/the-personal-files-interface for acceptance criteria)' to 'plugs'
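For reference, here is roughly the personal-files plug I tried (a sketch; the plug name and paths here are illustrative, not our exact snapcraft.yaml):

```yaml
# snapcraft.yaml (excerpt) - illustrative personal-files plug declaration
plugs:
  dot-djl-ai:
    interface: personal-files
    write:
      - $HOME/.djl.ai
apps:
  djlbench:
    command: bin/djlbench
    plugs:
      - dot-djl-ai
```

Even with this plug added, the file_mmap denials above persisted.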

We don’t bundle the CUDA library; we rely on the system-installed CUDA.
We detect CUDA at runtime by trying to load the libcudart.so file from the system.
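As a minimal sketch (not our actual implementation; the class and method names here are hypothetical), the runtime detection conceptually looks like this:

```java
public class CudaDetect {
    /**
     * Hypothetical sketch: probe for a system CUDA runtime by attempting to
     * load libcudart.so from the JVM's library search path.
     */
    static boolean hasCuda() {
        try {
            // Resolves libcudart.so on Linux via java.library.path
            System.loadLibrary("cudart");
            return true;
        } catch (UnsatisfiedLinkError e) {
            // No system CUDA found: fall back to CPU benchmarking
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("CUDA runtime found: " + hasCuda());
    }
}
```

Under strict confinement, the mount namespace hides the host libraries, so this probe would simply take the `UnsatisfiedLinkError` branch.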

If we are using strict mode, doesn’t that mean the detection will fail silently?

Strict snaps live in a mount namespace; they don’t usually see the host’s system libraries, and it’s generally an antipattern to rely on them, so looking for cudart in the normal locations will certainly fail. Whether it fails silently or not I wouldn’t know; that depends on your own code.

You could possibly use the system-backup or system-files interfaces to try to access the host libraries explicitly, but by shipping cudart.so yourself you end up not having to rely on the host libraries at all. Assuming the user already has NVIDIA drivers installed that support the cudart.so you ship, they wouldn’t have to install CUDA system-wide at all.

We won’t be able to bundle the CUDA library in our tool.

  1. The CUDA and cuDNN libraries are huge, while our library jar is only a few MB.
  2. We support a wide range of CUDA versions; it doesn’t make sense to bundle all of them.
  3. CUDA is optional in our tool; if a user only wants to benchmark on CPU, they don’t need CUDA at all.
  4. We don’t know which version of CUDA the user wants to run the benchmark with. We assume the user-installed version is the one to use; this way we don’t need to prompt the user to select a CUDA version.

I can agree with the size complaint: even with the compression snap provides, CUDA does bloat downloads. On the other hand, it’s Canonical paying the bill ;). There were previous discussions about resolving this through content snaps to help with deduplication, but as far as I’m aware this hasn’t gone anywhere yet.

What stands out to me most there is cuDNN: as far as I understand NVIDIA’s current licensing (I’m not a lawyer), cuDNN cannot be redistributed, which rules out shipping these libraries inside the snap. Legally, the user needs to acquire it themselves.

I would still suggest looking into the system-files/system-backup interfaces. Since these can expose the host system, you could potentially use the host CUDA libs in a strict snap. You’d need to add them to $LD_LIBRARY_PATH (and also keep in mind the gnome extension, which handles some other environment issues, but you could replace the gnome extension with something smaller if preferred).
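As a rough illustration of that idea (the plug name and paths are assumptions, and /usr/local/cuda is just a common install location):

```yaml
# snapcraft.yaml (excerpt) - hypothetical system-files plug exposing host CUDA
plugs:
  host-cuda:
    interface: system-files
    read:
      - /usr/local/cuda/lib64
apps:
  djlbench:
    command: bin/djlbench
    environment:
      LD_LIBRARY_PATH: /usr/local/cuda/lib64:$LD_LIBRARY_PATH
    plugs:
      - host-cuda
```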

So my hopefully helpful opinion for the reviewers here (keep in mind I’m not a reviewer) is that the requirement for cuDNN in particular is a pretty severe limitation in the strict model, and at a minimum, system-backup or system-files would be needed for the snap to be functional for users who want that functionality.

Thanks for the advice.

The more I look into this, the more I believe we have to use classic mode:

  1. We are a wrapper on top of other deep-learning frameworks (PyTorch, TensorFlow, MXNet, …); it would be very hard for us to understand how each framework accesses the system, and therefore very hard to list all the library paths the tool will need across different Linux distributions.
  2. We allow users to load custom shared libraries at runtime (custom operators for different hardware accelerators, like the AWS Inferentia chip). We don’t really know where those libraries will be installed.
  3. As a performance benchmark tool, we monitor CPU and memory utilization; it’s not clear to me whether that will hit any permission issues.

@alexmurray

In the “Process for reviewing classic confinement snaps” document, one of the listed criteria that might require classic is:

running arbitrary command (esp if user-configurable such as a developer tool to organize dev environments)

We have many use cases that allow users to load arbitrary .so files:

  1. Load an external libtorch.so/libmxnet.so/libtensorflow.so etc. to benchmark a user’s custom build of a deep-learning framework.
  2. Load an external hardware accelerator driver, like the AWS Inferentia chip or the AWS Elastic Inference Accelerator.
  3. Load extra shared libraries to support custom operators used by the model. Both PyTorch and MXNet support custom operators.
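Conceptually, loading such a user-supplied library boils down to something like this sketch (hypothetical helper, not our actual code):

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class CustomOpLoader {
    /**
     * Hypothetical sketch: load a user-supplied shared library from an
     * arbitrary absolute path, e.g. a custom operator or accelerator driver.
     */
    static boolean loadCustomOp(String path) {
        if (!Files.exists(Path.of(path))) {
            return false; // library not present on this machine
        }
        try {
            // System.load requires an absolute path to the .so file
            System.load(path);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false; // present but not loadable (wrong ABI, denied, etc.)
        }
    }
}
```

Since the user chooses the path at runtime, there is no fixed set of locations we could enumerate in a strict snap’s plugs.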

Would you please approve the classic request based on the above use cases?

@pedronis, can you please analyze this request?

Thanks!

On further review, I think djlbench more closely fits within the debug tools category for classic confinement, i.e. it is used by developers to benchmark their machine learning models, which is pretty close to the activity of debugging. It also requires access to arbitrary files/libraries already on the system. As such, I think it meets the criteria for classic confinement.

@advocacy can you please perform publisher vetting?

I agree with this analysis. In the future we do want to work on ways for snaps to transparently access this kind of library from the host or from other snaps, at which point the situation could be reconsidered.

@deepjavalibrary Is there an official domain/page for the library I could check please?

Here is the information

Our website: https://djl.ai
Our main GitHub repo: https://github.com/deepjavalibrary/djl
The folder for djl-bench: https://github.com/deepjavalibrary/djl/tree/master/extensions/benchmark

Thanks for the info. Can I also ask you to PM me the official contact email for djl.ai, please?

Our email: djl-dev@amazon.com

+1 from me, I’ve verified the publisher.

Thanks Igor, this is now live.

@deepjavalibrary Note that existing users of the strictly confined version of your snap will not automatically update to the classically confined version; they would need to refresh manually, passing the --classic flag to confirm they are OK with the snap being granted this new permission.
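Concretely, that manual refresh would be:

```shell
# Acknowledge classic confinement when switching to the new revision
sudo snap refresh djlbench --classic
```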

@alexmurray
Thanks for approving this.

I tried to upload a new version, and it got rejected.

What should I do next?

This was rejected due to the combination of classic confinement and plugs: classic snaps can access everything, so there is no need to declare any plugs in your snap now. Please remove them and it should pass the automated review.

It’s working now. Thanks a lot.