Auto-connecting the process-control interface for the 0ad snap

This is a request to auto-connect the process-control interface for the 0ad snap. Without that connection, the game won’t start, so it’s a must-have (the game makes use of setpriority and sched_setaffinity).

I am not upstream for 0ad, but my attempt to get them interested in the snap hasn’t been met with much interest so far. I am willing to transfer ownership of the snap should that change though.

I am actively working on that snap package (just got the map editor working fully confined) and I intend to publish new versions when upstream does new releases.

Please let me know if more information is required to make that request acceptable.

2 Likes

It sounds fine to me. @jdstrand?

process-control allows the snap to control the processes of everything on the system and this seems odd for a game to require to function. Does 0ad still require ‘process-control’ with the newer seccomp argument filtering that allows sched_setaffinity(0, ...) and setpriority(PRIO_PROCESS, 0, 0-19)? If it still does need process-control, can you provide details on how these system calls are being used and/or adjust the program to work within the existing policy?

2 Likes

Thanks for the review Jamie. Looks like sched_setaffinity won’t require process-control with the newer seccomp argument filtering. However setpriority remains a blocker.

I’ve dived into the code to try and understand what in the game is trying to give a high priority to a thread, and it appears it’s the JS engine (spidermonkey) at initialization time, when it creates n helper threads. It calls PR_CreateThread (from libnspr4) and this is what appears to call setpriority with a negative priority, although I cannot see immediately where (the call has PR_PRIORITY_NORMAL as an argument, which should translate to 0).

I’ll continue investigating to see whether the game can be patched to avoid requesting a negative priority, and if so if that doesn’t affect performance too visibly. If anyone has experience with spidermonkey/libnspr4, help welcome!

I suspect it isn’t going to affect anything at all since only root can set the priority to negative numbers. Note that @tyhicks is pretty close to landing the EPERM instead of KILL code for seccomp which may make this all just go away (the application might erroneously try to setpriority to a negative value, but it won’t be killed).

2 Likes

I identified the call to setpriority that fails: https://hg.mozilla.org/projects/nspr/file/tip/pr/src/pthreads/ptthread.c#l146.

The problem is not with the value of the prio parameter, it’s the who parameter (≠ 0).

If I understand correctly your comment about the work @tyhicks is doing, the call to setpriority will fail but the code path will continue to be executed? That would solve my issue I think. When is this scheduled to land? Where can I track it?

Looking at the code, they don’t check the return code of setpriority so it will just continue, yes.

Yeah, when which is PRIO_PROCESS then who is any pid when it isn’t 0. Allowing something other than 0 here would mean the application could change the priority for processes that aren’t the application’s. The application may be able to be modified to change where the setpriority is called so that who can be 0. In lieu of that, when the seccomp EPERM work lands, the game would still work, but you could still plugs process-control if you wanted and have user’s connect it as desired.

-1 for granting the auto-connection. The game shouldn’t need to modify the priority (and a lot more since process-control grants more than that) of all processes. The application could be modified to use setpriority in a sandbox-compatible way and soon the unmodifed application will no longer crash on start because of upcoming changes to seccomp.

Agreed. I’m still curious: where can I track those upcoming changes to seccomp? Is there a bug report/blueprint I can subscribe to? Are they going to be SRU’ed?

These changes are kind of difficult to track because they’re being discussed in various threads on LKML, a libseccomp PR on github, etc. I assume that you just want to know when they land via an SRU and this bug is the best place for that:

https://launchpad.net/bugs/1567597

1 Like

@jdstrand Can we unblock this interface given the context and the potential for abuse at stake? It’s a game, so won’t be used in unrelated production servers which might be affected by potential priority shifts. Worst case still sounds much better than not using a confined snap.

Once other issues are sorted and we can take that interface away, we can rediscuss.

1 Like

Thanks Tyler, I’ll be tracking that bug.

@niemeyer - The snap in the store is currently in strict mode. I gave suggestions that should be easily doable to modify the code to work without process-control. Considering that there are no blockers to that work, the requester agreed with me, the snap works now when users connect the interface, and the situation will only get better soon, I voted against auto-connection.

What’s the time frame expected to have the snap working out of the box?

The cost of teaching people to connect that interface to play a game is not worth the benefit of having a game not being able to manipulate priorities of processes on a gaming machine, IMO. If we’re solving the problem in the next few days, then sure, let’s wait. If it’s a more theoretical “changing the code can fix the problem”, can we please have this working out of the box sooner?

1 Like

@tyhicks: any progress on those upcoming changes to seccomp?

I was able to have a hangout with Kees Cook (Seccomp maintainer) late last week and we came up with an idea that may allow our use case to be supported without muddying up the Seccomp kernel API too badly. I’m going to be working on patches that implement that design starting this week.

Note that this last remaining bit of work is only for a userspace application to tell the kernel what seccomp actions should be logged. The rest of the seccomp kernel changes are already implemented. This last remaining bit of work lets snap-confine set up a sandbox that logs RET_ERRNO actions while launching a Chrome browser or Electron app which sets up its own sandbox that extensively uses RET_ERRNO actions without logging them (this would quickly flood the logs).

2 Likes

@jdstrand @tyhicks Bringing this topic back to life: given the solution evolution here, and that we’re unlikely to see that patch being applied to every other distribution timely, and that as described above the potential for abuse is minimal, can we please connect this interface automatically so the game works out of the box?

1 Like

I still maintain that the snap should be made to work in a sandbox-compatible way that doesn’t require process-control, and that process-control grants a lot more than what this snap needs.

However the seccomp work has taken a long time to upstream and I trust the publisher, so a reluctant +1.

That is two +1s so granting the snap declaration. It is in effect now.

1 Like

Thank you Jamie.

I verified that this works.

Making the snap work in a sandbox-compatible way would require patching libnspr4 which is a dependency of 0ad, not part of the app’s codebase itself. Not impossible, but quite a hassle.
I’ll keep an eye on the progress of the seccomp work, and as soon as it’s available I will remove the process-control interface from the app’s list of plugs.

FYI, with https://github.com/snapcore/snapd/pull/3998 installed (which changes KILL to EPERM), I disconnected the process-control interface and out popped the following EPERM denial:

Mar 06 08:35:45 sec-xenial-amd64 kernel: audit: type=1326 audit(1520346945.126:391): auid=4294967295 uid=1000 gid=1000 ses=4294967295 pid=11364 comm="pyrogenesis" exe="/snap/0ad/1/binaries/system/pyrogenesis" sig=0 arch=c000003e syscall=141 compat=0 ip=0x7fac57879d27 code=0x50000

and the application kept running. This makes sense because the default template allows setpriority for 0-19 and the kernel enforces via DAC/capabilities that non-root cannot use < 0, so the application tries to setpriority just in case it can, but doesn’t fail cause it normally can’t.

Put another way, this will fix setpriority for 0ad (and likely all non-root processes that try it) so that they won’t need process-control at all any more, let alone auto-connected.