Request for Classic confinement: Slurm

@pedronis

  • I imagine it is normally supposed to take over a machine, with no unrelated workloads usually running on it

    Correct.

  • It wouldn’t run quietly in a corner taking control only from time to time

    Correct.

Slurm is responsible for the accounting and scheduling of the resources needed to process queued jobs (HPC workloads) and would use nearly all system resources at all times.

(EDIT) Correction/Expansion:
slurmctld is responsible for the scheduling and accounting of machine resources for the nodes that belong to the Slurm cluster. It does so by communicating with the slurmd daemons to acquire their resource usage, and with slurmdbd to commit resource usage and cluster metrics. slurmdbd is responsible for transacting with the database. slurmd is the compute daemon; it executes the slurmstepd process, which is the program responsible for performing the work of the job/running the computation.

In a Slurm deployment, each component (slurmd, slurmctld, slurmdbd) runs on its own server. There are generally one or two slurmctld instances (an active controller and a backup controller), one or two slurmdbd instances (active and backup), and N slurmd nodes. The slurmd daemon, and subsequently the slurmstepd processes it spawns, are what use the resources of a node to carry out the computation.
Hopefully this helps clear things up.
(END EDIT)
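
To make the component split concrete, here is a minimal, illustrative slurm.conf fragment (hostnames, node names, and sizes are placeholders, not taken from any real deployment). It shows an active and a backup slurmctld, accounting routed through a slurmdbd host, and a set of compute nodes running slurmd:

```
# Illustrative slurm.conf fragment; all hostnames and sizes are placeholders.
ClusterName=example
SlurmctldHost=ctl-01          # active controller (slurmctld)
SlurmctldHost=ctl-02          # backup controller

# Accounting is committed through slurmdbd running on its own host.
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=dbd-01

# Compute nodes run slurmd; slurmd launches slurmstepd to execute job steps.
NodeName=compute[01-64] CPUs=32 RealMemory=128000 State=UNKNOWN
PartitionName=batch Nodes=compute[01-64] Default=YES MaxTime=INFINITE State=UP
```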

  • In particular, except for developer machines, we wouldn’t expect this to run on a desktop?

    Correct.

  • We don’t expect casual users to install this at all?

    Correct, unless they want to run a real HPC workload on their local box in dev mode (achieved by setting snap.mode=all; see the example below).
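
For reference, assuming the snap exposes this through snapd's standard configuration interface (the option name snap.mode comes from the post above; the exact invocation is an assumption), dev mode on a single box would be enabled with something like:

```
# Assumption: the slurm snap reads a snap.mode option via snapd's configuration system.
sudo snap set slurm snap.mode=all
```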

The reason it can’t be confined is that the slurmstepd process needs to run as the effective uid of the user running the workload.

Imagine you have thousands of users trying to run workloads on a large cluster where all of the users are members of an Active Directory realm. User home directories, scratch space, and long-term storage are all supplied as mounted network filesystems to all nodes in the cluster. In this way an Active Directory user can have private and shared filesystem space on every node in the cluster. These same users run HPC workloads that need access to their user space on the Active-Directory-controlled filesystem(s). For this to be possible, the compute daemon process of Slurm, slurmstepd, needs to execute under the effective uid of the Active Directory user, so that Slurm can account for the resources used by the job and, more importantly, so that slurmstepd can access files owned by that user on the network filesystems.

Slurm calls seteuid() and setegid() to drop the slurmstepd compute process’s privileges to the effective uid and gid of the user executing the job, so that the process can access resources owned by that user.

The Slurm source contains code that drops privileges in exactly this way.
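
As a minimal, self-contained C sketch of that pattern (illustrative only, not the actual Slurm source; the names drop_to_user, job_uid, and job_gid are invented for the example), the privilege drop looks roughly like this:

```c
/*
 * Minimal illustrative sketch of dropping privileges to the job's user.
 * NOT the actual Slurm source. Must be started as root for the
 * initgroups()/setegid()/seteuid() calls to succeed.
 */
#define _DEFAULT_SOURCE
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int drop_to_user(uid_t job_uid, gid_t job_gid)
{
    struct passwd *pw = getpwuid(job_uid);

    if (pw == NULL)
        return -1;

    /* Set the supplementary groups of the target user while still root. */
    if (initgroups(pw->pw_name, job_gid) < 0)
        return -1;

    /* Change the group first; after seteuid() we could no longer do so. */
    if (setegid(job_gid) < 0)
        return -1;

    if (seteuid(job_uid) < 0)
        return -1;

    return 0;
}

int main(void)
{
    /* Example: run the rest of this job step as uid/gid 1000. */
    if (drop_to_user(1000, 1000) < 0) {
        perror("drop_to_user");
        return EXIT_FAILURE;
    }

    printf("running as euid=%u egid=%u\n",
           (unsigned)geteuid(), (unsigned)getegid());
    return EXIT_SUCCESS;
}
```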

Right, but that is a current limitation of the sandbox. It is plausible that it could be extended in ways that would allow your snap to work under strict confinement. One of the considerations for classic that we ask ourselves is whether something could ever be made classic, and @pedronis and I believe the answer to be ‘yes’.

@jamesbeedy - thank you again for the additional information. I believe this is quite close to a decision now.

I believe this is a good summary. As @jamesbeedy mentioned, the answers to all of these are ‘correct’ (yes). Based on this distillation, IME, this is a new supported use case for classic. If you agree, can you confirm so we can proceed with the request?

Yes, it seems a new supported case (not sure it will be easy to distill for re-application, though), with a possible path to confinement at some point, so I agree.

The requirements for classic are understood. @advocacy, can you please perform the vetting?

Vetting done. +1 from advocacy.

Granting use of classic. This is now live.

@jdstrand @popey @pedronis @alexmurray @egeeirl thank you!

Thanks, I’ll write something up for you to review.