Classic Confinement Request for "gigi2"

ttbek · November 15, 2018, 3:03pm

We are attempting to publish gigi2, a program for genotype imputation. The main source code is here: https://cse-git.qcri.org/eullah/GIGI2 , the snap source code is here: https://cse-git.qcri.org/kkunji/GIGI2_Snap , and the snap is here: https://snapcraft.io/gigi2 . As it is scientific software we cannot be sure where it will be used, and restrictions on where it writes its output seem unreasonable. It may often need to write to NFS, Lustre, or other shared file systems, as the most common use will likely be on large clusters. Writing to a user’s home directory will often not be feasible, as home is often not a terribly large partition and the outputs can be very large. Yes, there is an option for running on external devices, but asking users to learn the plug/socket system and perform those additional commands is on par with just asking them to compile it themselves… in which case there is little point in packaging it, as users won’t gain in ease of use. It’s bad enough that we’ll need to ask them to use the --classic option while installing if it is granted. Additionally, even with the plugs and sockets I don’t think they would be able to write everywhere they may want to, unless I’m misunderstanding how this works… As I understand it there are ways to allow writing in home or on external media, but I don’t think there is an interface for general user writeable locations. If I’m wrong about that then I could use a bit of guidance on how that works.

For author vetting purposes, I am the maintainer of a few R CRAN packages: ftp://cran.r-project.org/pub/R/web/checks/check_results_kkunji_at_hbku.edu.qa.html , and a contributing author (but not maintainer) to several more. Yeah… I know I need to get around to a few updates there… but that rbind deprecation isn’t from my package but from a package mine depends on T.T I am a member of the Data Analytics group at Qatar Computing Research Institute, a part of Hamad Bin Khalifa University, member of Qatar Foundation.

I do not, and should not, control where a gigi2 user is able to write their output. In my experience that has been very much the domain of file system and domain controller permissions and it’s a bit strange this way. Even for an isolated docker container I can mount an almost arbitrary volume… Is there a simple one liner I could give users to let them write where they want while using strict confinement? Scientists can be pretty irate when the more standard mechanisms get in their way, not being able to make the snap process painless on their end probably means abandoning bothering with a snap package. I’m open to doing some more work on my side to make their side smooth though.

I could put it as edge or beta and ask them to use dev mode potentially, but that’s not really representing it properly.

ttbek · November 15, 2018, 4:19pm

To clarify a bit more, the original GIGI was coded by Charles Y.K. Cheung: http://faculty.washington.edu/wijsman/progdists/gigi/software/GIGI/GIGI.html Ehsan Ullah (mostly) and myself (a bit) did a pretty heavy rewrite, including making it threaded and introducing more efficient data structures. We have a publication under submission for GIGI2. I was also the author of GIGI-Quick (https://academic.oup.com/bioinformatics/article-abstract/34/9/1591/4756093), which had some similar goals (speeding up imputation with GIGI) but took a less elegant shell scripting approach, though it is one that could still be used in conjunction with GIGI2 to run across multiple nodes. So in this case the source (myself and Ehsan Ullah) is as well vetted as we can be on our end; we aren’t asking to package some random 3rd party code as classic. You can find us on the Data Analytics website for QCRI (though our titles are out of date): http://da.qcri.org/teams/

jdstrand · November 16, 2018, 7:48pm

This would defeat the purpose of strict confinement. The design of snaps is that they are isolated from each other and the system. You mentioned that with docker you can bind mount things into the container; you can do that with snaps too if desired…

@niemeyer and @pedronis - AIUI (@ttbek, please correct me), the publisher is asking for classic in anticipation that the snap’s users won’t want or understand confinement. Traditionally we’ve not granted classic for this type of request. Can one/or both of you weigh in?

ttbek · November 17, 2018, 12:36pm

@jdstrand I think they will understand it if they’re willing to put in the time (… they’re mostly well educated statistical geneticists after all), but I doubt that they want it. My goal with packaging it was to make the process easier rather than harder for them though and have it available in a more OS agnostic repository (which is why I didn’t go for more of a .deb or .rpm, etc…) that is properly tracked by a package manager.

Can you explain more about how to bind then, from the user’s perspective, if I can give them an easy line or two to do it then that’s probably good enough for me. Reading the docs I’m not really sure how to do this myself. I see this… Layouts: re-mapping snap directories but I’m not sure if it is finished and it seems that layouts are pre-specified locations from the developers side, where what I’m looking for is more of the opposite, arbitrary locations chosen by the user. When I say arbitrary, I just mean wherever their user could normally write to. The Docker way …works, but isn’t exactly elegant: https://hub.docker.com/r/kkunji/gigi2/ The reason we bother with it is in case it needs to be run on a larger scale, with Kubernetes on AWS or Azure for instance.

… oh wow, it works to bind mount some other random directory into home so long as you have a normal home. That should cover most use cases for users on their own (single user machines) though it is sort of unnecessarily cumbersome.

sudo snap install opera
mkdir -p /home/username/mount_test
sudo mount --bind /some/random/location /home/username/mount_test
Start opera and save stuff to /home/username/mount_test, exit opera
sudo umount /home/username/mount_test
And the saved stuff is still at /some/random/location
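
The bind-mount steps could be wrapped in a small helper along these lines. This is only a sketch: the function names and the DRY_RUN switch are made up for illustration, the paths are placeholders, and the real mount and umount calls still need root. With DRY_RUN left at its default of 1 the commands are printed instead of executed.

```shell
# Print the command when DRY_RUN=1 (the default here), otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Make TARGET visible under MOUNT_POINT (a directory inside the user's home),
# so a strictly confined snap with the home plug can reach it.
bind_into_home() {
    local target="$1" mount_point="$2"
    run mkdir -p "$mount_point"
    run sudo mount --bind "$target" "$mount_point"
}

# Remove the bind mount again; files written through it stay in TARGET.
unbind_from_home() {
    run sudo umount "$1"
}
```

For example, `bind_into_home /some/random/location /home/username/mount_test` prints the two commands to review, and `DRY_RUN=0 bind_into_home ...` would actually run them.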

Still we have corner cases: what if I need to write to another user’s subdirectory in an NFS based home for which they have modified file permissions so that I could normally write there (e.g. 775 and we’re in the same group)? This is a shared system, so we don’t have root, which is still fine for installing your own snaps, but bind-mounts require root. There are per user quotas on home size, which is the motivation for writing to a colleague’s home.
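
To make "could normally write there" concrete, the check below is roughly what the standard Unix permission model answers for the calling user. It is a minimal sketch with a made-up function name. It says nothing about what a strictly confined snap is allowed to do, since confinement is applied on top of these permissions.

```shell
# Succeeds if DIR exists and the current user may create files in it
# according to ordinary file permissions (owner, group, other).
can_write_dir() {
    [ -d "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}
```

Run outside the snap, `can_write_dir /path/to/colleague/dir && echo writable` would report the group-writable directory as usable even though the confined program may still be denied access to it.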

I see Snaps and NFS /home … but that seems like an outline for internal development, it isn’t at all clear to me what I can make use of now along those lines.

I’m not sure how it would defeat the purpose of strict confinement to allow a user to specifically override where the snap can write to for a single instance of the snap. Maybe there could be some kind of user override paradigm that prompts for permission from the user per use, much like what sudo does when a user tries to run something that needs root. At least to me, it makes sense that a user should be able to allow an app to do anything the user is allowed to do. Not that the app should be able to go off and do it without confirmation by default. Is there not a way to generalize something like the home or removable-media + network interfaces and to prompt asking for permission instead of just failing? I guess I don’t think of an application doing as it was told and writing to somewhere, e.g. /var/log when /var/log was specified for a log file output, as “interfering” with the host system. If it did that without having been explicitly told to by the user, then I would agree with you.

Another case, we have a machine where I have root, but a number of other permissions are managed by a Domain Controller that I do not administer. I could install snapd, and then another user could install the GIGI2 snap… but our homes are on external NFS, e.g. /export/abc/ABCD/home/username I’m not sure how they could write data to their home in this case?

I know that NFS homes aren’t common in general, but they are very common on large shared clusters for scientific research that most users of my program likely do make use of.

Maybe some of these interfaces do something I don’t fully understand, frankly the documentation is pretty sparse. E.g.:

These are dangerously close to being tautologies, “storage-framework-service allows operating as, or interacting with, the Storage Framework.” … yeah… so what IS the “Storage Framework.”
“fuse-support allows access to FUSE filesystems.” Great, so what exactly do you include under “FUSE filesystems”? Are all FUSE mounts supported? Trying to google for their relation with snap is also pretty useless. All I find while looking for the FUSE one is a few forum posts about how it perhaps does not work for SSHFS. The only way to find the actual information seems to be to search in the forum: Fuse-support plug and mount points in $SNAP_COMMON folder. And it seems like there are plenty of caveats to find.

If FUSE still works the same way, then it sounded like an option for me, but for the same reasons as here: Request for classic confinement: parsec It ultimately doesn’t make sense for me, needs root and no unmounting, etc… In that case you guys did grant classic. That case was not identical of course, if I read correctly, then the source of the data they are exposing is the crucial difference, I think they were looking to mount data from inside the snap to make it available whereas I’m looking for the opposite, to mount a location from outside the snap inside it to make it writeable.

I may be making some arguments from ignorance here, I apologize if I am.

ttbek · November 17, 2018, 1:32pm

@jdstrand Again, not that they won’t understand it, but they won’t like it, and it will be a much more serious inconvenience for them than in this case which was approved: Permission denied while attaching files (may require classic confinement)

I’ll quote you,

And yes, at least 99% of our users would expect to be able to write output to any normally user writeable location, i.e. any location to which the standard Unix file permissions would indicate that they could write to. So if the bar is user expectations…

If you want it to be even more analogous to uploads, well, they’re not going to like having to move their parameter and input files around to certain locations for the program to read them either. Sometimes they’re even files that they shouldn’t be moving around, some of this pedigree genome data is considered very sensitive and confidential. They may not be allowed to copy it over to their home, if it even fits there. And again, on these shared systems they may not have root.

If they don’t get that and there isn’t a way for them to do it with another command or two, then I have little doubt that they compile from source instead or use the webserver we provide (at our expense T.T lol): https://imputation.qcri.org/index.html

I don’t believe there is a reasonable way to use xdg-desktop-portal in my case, it is basically a bare C++ command line app (please correct me if that is wrong and the portal thing supports my use case). The users of this snap would often be running on shared HPC systems without any GUI to speak of.

The other expectation in that thread was expressed by Popeye:

We are also the upstream source in our case.

I guess the criteria may have changed…, but that was just in June of this year, a handful of months ago.

jdstrand · November 19, 2018, 10:48pm

Sharing arbitrary files in various user’s home directories, wherever they might live is not currently supported by snapd itself. Applications could utilize Portals from the desktop interface to access arbitrary files, but the application must support that. If your application uses a gtk3/qt5 file dialog, it shouldn’t be too much effort to make this work. Longer term we plan to have a prompting mechanism via snapd (a PoC is in the works but a production feature won’t land in the next couple/few snapd releases).

jdstrand · November 19, 2018, 11:02pm

For clarity, I wasn’t asking for you to defend your position per se, I just wanted you to correct me if I misrepresented you. Your additional comments do add color on how the application is expected to function in the field however, which is useful. @niemeyer and/or @pedronis can comment if this is a candidate for classic.

ttbek · November 20, 2018, 11:48am

Hmm, how about this: I’ll change the stable version to strict confinement and add the home and removable-media plugs. Then I will release a devmode confinement version on the beta channel, and I will instruct users to use one or the other based on whether they need to write to other places. I think it will still be less complicated and finicky than all the commands for the docker version.

Does that seem reasonable, or is there an objection to asking more or less general users to use devmode this way?

Let me know, I need to take care of some other things but I will be moving ahead and putting this issue behind me after the next 8-10 days. Whether that means doing the above, or asking users to install from a snap distributed only from me in devmode rather than through the store, etc… The store perhaps gives more user confidence for someone that just stumbles upon it, but most of our users will know of us through our research papers and won’t really object to an “unsafe” snap as it is from a relatively trusted source. They’ll run it even if I hand them an executable, the concerns on my side are more with making sure it works properly and smoothly on every *nix+Windows (within reason) they may be using than that they would be afraid to run my code.

About the portals, yeah, my app is totally cmdline driven. Potentially we could add a simple gtk or qt interface in the future, but that is extremely low priority work for us and wouldn’t make a difference for many of our users as these systems often don’t have a desktop environment installed. What mechanism does the portal actually use to do this? Nevermind, I found some DBus API documentation. Ok, so a cmdline frontend should also be possible to make. What isn’t quite clear to me is if the exposed file could also be written to or not. Also, I’m not sure that sending that much data over DBus is a good idea (at least some people on SO think it isn’t https://stackoverflow.com/questions/6220704/passing-a-large-data-structure-over-dbus). I mean, you’re not just passing the handle, but the entire file, right? No, I’m wrong on that, "The selected files will be made accessible to the application via the document portal, and the returned URI will point into the document portal fuse filesystem in /run/user/$UID/doc/. " Does that mean the file gets copied to /run/user/$UID/doc/ ? Because that’s fine for selecting small input files, but that would totally defeat the point of being able to choose a different location for our large output files if they had to be written there first and are moved transparently. That would almost certainly give some serious problems.

jdstrand · November 20, 2018, 3:57pm

Well, it wouldn’t be considered normal usage but it is your snap so you can do what you like. If we go this route, you don’t need a separate snap in the beta channel, you just have your snap in strict mode and users can install with ‘sudo snap install --devmode your-snap’ if they want devmode. Is this the route you plan to take?

@jamesh - would you mind commenting on the specifics of how the document portal makes the file available to the snap?

jamesh · November 21, 2018, 10:31am

As they exist right now, I don’t think the document portal will help this app. The primary means for granting access to files is via the out of process graphical file chooser, which doesn’t really mesh well with command line tools.

With that said, it would be possible to use the document portal to build a solution for this kind of app. It’d have to look something like this:

have the snap app expose the structure of its command line arguments: Which ones represent file names? Do they need read or write permission?
have snap run process the command line arguments according to those rules.
ask the document portal to grant access to those files, and rewrite the command line arguments with the appropriate doc portal paths.
somehow revoke access to the files again when the application exits (we don’t currently have a way to do this, since there is no supervisor process for the confined app).
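
The first three steps of that outline might look roughly like the sketch below. Everything in it is hypothetical: grant_path is a stand-in that only computes a path and does not talk to any portal, PORTAL_ROOT is a placeholder for wherever granted files would appear, the flag names are invented rather than GIGI2's real options, and the read/write distinction and the revoke step are left out.

```shell
# Placeholder for the location where granted files would be exposed.
PORTAL_ROOT="${PORTAL_ROOT:-/run/user/$(id -u)/doc}"

# Hypothetical stand-in for "ask the document portal to grant access":
# it only maps a host path to a path under PORTAL_ROOT.
grant_path() {
    printf '%s/%s\n' "$PORTAL_ROOT" "$(basename "$1")"
}

# rewrite_args "FLAG1 FLAG2 ..." ARGS...
# The first parameter lists the options whose next argument is a file name.
# Prints the rewritten argument list, one argument per line.
rewrite_args() {
    local file_flags=" $1 " expect_path=0 arg
    shift
    for arg in "$@"; do
        if [ "$expect_path" -eq 1 ]; then
            grant_path "$arg"          # file argument: swap in the portal path
            expect_path=0
        else
            printf '%s\n' "$arg"       # anything else passes through unchanged
            case "$file_flags" in
                *" $arg "*) expect_path=1 ;;
            esac
        fi
    done
}
```

Called as `rewrite_args "--in --out" --threads 4 --in /data/a.ped --out /scratch/r.txt`, it leaves `--threads 4` alone and replaces the two file paths.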

This is all quite hand-wavey, so doesn’t really constitute a solution today. It should certainly be possible though.

niemeyer · November 21, 2018, 1:14pm

Unfortunately right now we don’t have support for reads and writes in arbitrary locations while respecting the confinement needs. We also don’t use that alone as rationale for having classic, as in that case most things would be unconfined, which removes an important benefit and goal of this system. As alternatives, we currently offer interfaces for access into the home directory, and also offer a removable-media interface which allows writing into “/media” and “/mnt”, which are the traditional locations for external mountpoints. Both of these are clear and explicit for someone using the given snap, and the permissions may be rejected by the user if desired.

Would that be reasonable for the time being? We’re already researching the ability to more explicitly assign certain locations towards snaps, but that’s not yet ready.

The --devmode flag is oriented towards development. You can encourage people to use it of course, but the system will assume the goal of development, and will both warn users about that and change the behavior of the system towards that goal too.

ttbek · November 25, 2018, 3:55pm

Hey guys, I thought that I read that devmode was only supported for beta and edge channels, but as you say it does seem to work on the main channel just fine. I guess that is only when actually specifying devmode confinement but you can install strict confinement apps in devmode anyway. https://askubuntu.com/questions/783945/what-is-devmode-for-snaps

I will remove the classic request (Edit: Ah, you guys closed it for me ^_^) and add the home and removable-media interfaces, and then explain how to use those in our readme and that they should install using devmode if they need to write to other places.
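
As a rough illustration of what the readme guidance could boil down to, the helper below picks an install suggestion from the intended output directory. It is a sketch under several assumptions: the function name is made up, the home plug is assumed to be connected automatically, only the “/media” and “/mnt” prefixes mentioned above are treated as removable-media locations, and paths are compared as text without resolving symlinks.

```shell
# suggest_install OUTPUT_DIR
# Prints the install command a user would need for that output location.
suggest_install() {
    local out_dir="$1" home_dir="${HOME%/}"
    case "$out_dir" in
        "$home_dir"|"$home_dir"/*)
            # Covered by the home interface under strict confinement.
            echo "sudo snap install gigi2" ;;
        /media/*|/mnt/*)
            # Covered by removable-media, which has to be connected by hand.
            echo "sudo snap install gigi2 && sudo snap connect gigi2:removable-media" ;;
        *)
            # Anywhere else: fall back to devmode.
            echo "sudo snap install --devmode gigi2" ;;
    esac
}
```

For instance, `suggest_install /mnt/lustre/run1` prints the removable-media variant, while a path such as /export/abc/ABCD/scratch falls through to the devmode suggestion.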

I think we’re just about good to go. Thanks to everyone for their patience in taking the time to hash all of this out.

One (I hope final) other question though: as you say, --devmode will behave a bit differently (I assume mainly in things like logging some additional debug info). Would you anticipate that it would in any way have a significant detrimental effect on the performance of the application? I mean, when a C++ app opens a file for writing somewhere a strict app couldn’t, that will be logged once, right? Not something like once for every line written or every time the buffer is flushed to disk?

Finally, “Developers should not encourage end-users to install snaps with the ‘--devmode’ switch,” is written here: https://snapcraft.io/blog/demystifying-snap-confinement … I’ll be doing pretty much exactly that, but hopefully we’ll have a better way in the future. Looking forward to it. I’ll keep an eye out for developments regarding cmd based portals etc…