Request for Classic confinement: Slurm

@egeeirl and @jamesbeedy - sorry for the delay, this is a new use case that needed to be investigated and thought through.

@pedronis - this seems to be a new use case for classic. Some discussion happened in: Can a confined Snap run as a different uid and or guid?.

IME, slurm, is an orchestration tool for HPC. AIUI, slurm can and does utilize the snapd_daemon for certain actions, but in certain environments the slurmd process itself needs to run (arbitrary) commands as the effective uid/gid user in order access resources for the user (@jamesbeedy referred to this as the ‘active directory user’ which seem to tied to slurm’s concept of ‘partitions’ (see the overview url)). Other use cases are to run commands as other users for accounting purposes (AIUI, the current design of slurm is such that it performs resource accounting by tracking the uid that the process ran as as opposed to something like having users login in with an account and executing all commands as the same user and performing tracking via use of the account).

The slurm website states “Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters”. So, like flowbot, this is similar to juju. The closest use case in our processes for classic is ‘management snaps’ which we’ve identified as an unsupported use case.

In my limited understanding of slurm, we could perhaps require that slurm be modified to run anything that should be non-root as snap_daemon with slurm then being further modified to perform accounting with this in mind, but IME that would change how users expect to use slurm (ie, they’d have to ‘login’ in some manner). Even if that change was made, slurm’s functionality wrt partitions would be limited.

While orchestration tools like slurm (and juju (and potentially the recent flowbot request) do ‘manage’ systems, these systems are not for managing fleets of devices/laptops/servers/etc which have login users and are instead about scale out computing on systems primarily without login users. In that light, I think we should consider the ‘orchestration snaps’ use case as something different from ‘management snaps’.

@pedronis - thoughts?