We are trying to release our slurm snap to the snap store as a classically confined snap. Both slurm and its supporting components use the setsid command in one way or another, which as far as we can tell isn’t supported by strict confinement. We want to move forward with serving our snap from the snap store, and this is currently a blocker for us. Can we have a conversation about this and/or take whatever action is needed to get this snap approved for release?
Ah, at first I was confused since I thought you were referring to access to the setsid system call, which is already allowed for all snaps as it is in the default policy: https://github.com/snapcore/snapd/blob/2.44.5/interfaces/seccomp/template.go#L407
However, if you require an external command like /usr/bin/setsid, you can simply bundle it inside your snap using the stage-packages directive (in this case specifying the util-linux package, as this provides /usr/bin/setsid), since strictly confined snaps cannot execute binaries from the host machine.
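For anyone following along, here is a minimal sketch of what the stage-packages suggestion could look like in snapcraft.yaml. The part name slurm is hypothetical and the rest of the part definition is omitted; only the stage-packages entry comes from the advice above.

```yaml
parts:
  slurm:                # hypothetical part name, use whatever your part is called
    # ... plugin, source, etc. for the part go here ...
    stage-packages:
      - util-linux      # ships /usr/bin/setsid, so it ends up inside the snap
```

With this in place the snap runs its own bundled copy of setsid rather than trying to reach the host's binary.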
This is exactly what I needed to do. Thanks @alexmurray
@alexmurray we have followed up on this and it got us a bit further, but now we have an issue where the code we are snapping calls setegid and throws a chain of errors for us.
Trying to execute the srun command in our snap I get the following:
$ slurm.srun -pdebug -n1 -l hostname -vvvvvv
srun: error: task 0 launch failed: Slurmd could not set UID or GID
Looking in the slurm logs I see:
[2020-05-06T03:55:39.809] [2.0] error: setegid: Operation not permitted
Snappy debug shows:
= Seccomp =
Time: May 6 03:33:30
Log: auid=4294967295 uid=0 gid=0 ses=4294967295 pid=4452 comm="slurmstepd" exe="/snap/slurm/x1/sbin/slurmstepd" sig=0 arch=c000003e 119(setresgid) compat=0 ip=0x7fd2d6d5bd9d code=0x50000
* adjust program to not use 'setresgid' until per-snap user/groups are supported (https://launchpad.net/bugs/1446748)
Looking further into the same mgr.c I see this, which leads me to believe that executing as the snap_daemon user might be a way around it. Am I on the right track in thinking that running the process as the snap_daemon user could be a path forward?
Do you have advice on how we might be able to get through this?
FYI, you may either use system-usernames to drop to the snap_daemon user, patch to not drop, or LD_PRELOAD to make it a no-op.
@jdstrand Thanks for the feedback. If you don’t mind, how do I exec my commands and daemons as the snap_daemon user? Will simply setting
Force my processes to execute as
As a follow-up, this is how we are currently implementing it - https://github.com/omnivector-solutions/snap-slurm/blob/strict_testing/snap/snapcraft.yaml#L14
However, this does not appear to provide us with the desired outcome; our daemons are still always running and executing as root.
This allows the snap to drop privileges to the snap_daemon user/group, but it does not actually force that. Your application will still be run as a root daemon, but it is now allowed to transition to the snap_daemon user. So the app, or perhaps some wrapper script, would still need to call setgroups()/setgid()/setuid() etc. to drop privileges from root to snap_daemon. See https://snapcraft.io/docs/system-usernames for more info and some discussion about securely dropping privileges.
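For reference, the declaration being discussed is normally a top-level entry in snapcraft.yaml along the lines of the sketch below. This is the generic form from the system-usernames documentation, not necessarily what the linked snap-slurm file contains.

```yaml
# Top-level key in snapcraft.yaml (not nested under a part or an app).
# It only permits the snap to switch to snap_daemon; it does not make
# daemons start as that user. They still start as root.
system-usernames:
  snap_daemon: shared
```

The actual switch from root to snap_daemon still has to happen in the application or in a wrapper, as described above.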
As a simple example, I found that @sergiusens created https://github.com/sergiusens/user-daemon, which might be useful to look at (although ignore the comment about requiring snapd from edge, since snap_daemon has been supported since snapd 2.41, which is stable).
As per SLURM auto-connect for network-control [Was: SLURM Snap (transfer ownership)] this appears to have changed to a request for auto-connect of
Request for classic confinement continued here: Request for Classic confinement: Slurm