Pulseaudio recording

jdstrand · July 12, 2018, 4:15pm

I looked into the status of pulseaudio and mediating access to audio recording and submitted https://github.com/snapcore/snapd/pull/5504. To quote the PR:

"
When the pulseaudio interface was first written, it was planned that it would only support playback and that recording would come later. This was reflected in the interface documentation at Interface management until it was corrected today.

On Ubuntu Touch, pulseaudio was modified to include patches that would not allow record but these patches were never upstreamed. Classic Ubuntu, its flavors and derivatives (since Ubuntu 16.10) include a patch to deny audio recording unconditionally when the connecting application is a snap (https://launchpad.net/bugs/1583057), but this patch is broken (https://launchpad.net/bugs/1781428).

Upstream prefers to implement access controls instead of the patch included in
Ubuntu:

and patches have been submitted, but still not upstreamed:

Fedora 26 and later apparently have these patches with portals now supporting
pulseaudio:

https://github.com/matthiasclasen/xdg-desktop-portal/tree/device

Until snapd/pulseaudio can mediate access to recording everywhere, we should be
clear on what may be allowed.
"

So, the status is essentially that upstream does not include the necessary patches to mediate recording and Fedora and Ubuntu are doing something different (or at least Ubuntu tries to). Ubuntu’s unconditional deny patch should not be disabled at this time since strict mode applications like chromium, firefox, etc would no longer be able to record audio.

IMO, snapd should support something like:

plugs:
  pulseaudio:
    allow-record: true|false

where playback continues to be auto-connected and specifying allow-record would trigger manual review.

@willcooke (cc @niemeyer) - is this something that the desktop team could look into and put on the Roadmap to drive it to conclusion? This is IMO an important confinement gap that needs to be addressed. Perhaps @jamesh has some insight with his recent work on portals?

niemeyer · July 12, 2018, 4:56pm

There are two important problems with the suggested approach at the end:

It doesn’t give a choice for the user to allow playing but not recording, which is the most common case.
There’s no hint when connecting the interface if one is allowing recording rather than just playback.

These are serious enough that it doesn’t look viable to have a single interface for both.

jdstrand · July 12, 2018, 5:06pm

@niemeyer - I would be fine with ‘pulseaudio-record’ as a separate interface. That said, we would probably want to think about how to transition people. Today, ‘pulseaudio’ is both recording/playback so maybe instead we create pulseaudio-playback that is auto-connectable, then leave ‘pulseaudio’ as is and (eventually) make it manually connected.

IME, the real problem is that pulseaudio upstream still doesn’t support anything like this and someone needs to drive it. Flatpak on Fedora seem to have some sort of mediation in place. I suggest someone looks at that since that is what upstream pulseaudio is more likely to take.

niemeyer · July 12, 2018, 5:11pm

We can probably phase out this interface without any breakages. Let’s talk about it next week.

Is Ubuntu able to mediate right now, separating recording from playback?

willcooke · July 12, 2018, 5:18pm

We can certainly look in to this, once the details are worked out. @kenvandine and I will be in Montreal next week so let’s talk more then. PulseAudio is (eventually) going to be replaced by PipeWire (https://pipewire.org/) so we should make sure we contribute to that as well.

jdstrand · July 12, 2018, 5:26pm

Not atm. There were patches to pulseaudio for Touch that made it use the trust-store, but this isn’t installed anywhere any more. The patches that are in Ubuntu 16.10+ are broken and only unconditionally denied recording (they are broken in that the build disables ‘snappy support’ even though --enable-snappy is given to configure). IMO, we can’t simply turn those on without regressing people.

What we need is to design how we really want this to work. I suspect the flatpak approach is somehow using bind mounts for the /run/user/1000/pulse/native socket (someone would need to investigate) where the pulseaudio server somehow can differentiate between bind mounts that allow playback and allow playback/recording, and if we wanted, snapd could accommodate something similar via snap-confine/snap-update-ns and a patched pulseaudio. Alternatively, a differently-patched pulseaudio could reach out to snapd (userd?) to ask if recording is allowed (this is closer to the Ubuntu Touch/trust-store approach). There are perhaps other approaches.

It sounds like Fedora took the non-upstream access control hooks patches and applied them to their pulseaudio. Ubuntu and other distros could do the same and adjust to have a snapd backend for deciding when to allow recording. Reading the freedesktop spec, it looks like there are a lot of mediation points that we might want to consider in the design (even if we only implement playback/record at first).

I’m not a pulseaudio or portals expert though, so I think it is very important we get someone from the desktop team involved to participate in the design.

jdstrand · July 12, 2018, 5:34pm

At first blush and not having investigated, this seems like the most likely path forward, or to at least investigate first. If upstream is reluctant to take the patches, then Fedora and Ubuntu carrying the same patches might help getting them or a variation of them upstream. That would then allow a snapd backend to be more easily pulled upstream (since it is simply something that would be using the existing access control hooks).

An alternative approach would be to proxy pulseaudio in some manner. I’m not sure how feasible this is in terms of performance or audio syncing.

jamesh · July 16, 2018, 4:51am

All else being equal, I tend to prefer solutions that allow users to make just-in-time access decisions for access to resources we don’t want to enable by default. That gives the user context to the decision, and avoids prompting the user if they never use the part of the application that requires access.

As far as Fedora goes, I see that they did patch their pulseaudio a year ago:

https://src.fedoraproject.org/rpms/pulseaudio/c/b0ed4f1fdad9293612f5d070e094106df7a2c425?branch=master

… but the changes were dropped 5 months later when they upgraded their PulseAudio:

https://src.fedoraproject.org/rpms/pulseaudio/c/db7177be890de2677a4abf18678ca8ffd3cd4252?branch=master

The referenced bug report says they planned to port the patches forward, but it doesn’t seem that has happened. There’s no evidence of support in upstream PulseAudio or in Fedora’s current set of patches, so I don’t think we can treat it as a solved problem there.

With that said, the patch set seems like a reasonable way of handling the problem when a single socket is used to access all playback and recording devices on the system. If the patch set is resurrected, it seems like it wouldn’t be too difficult to adapt for snaps. It would likely require some changes on both the PulseAudio and xdg-desktop-portal sides though:

the PA patches perform a cgroup check on the peer pid to see if it looks like a flatpak confined app prior to asking xdg-desktop-portal whether the pid should have access.
on the xdg-desktop-portal side, I don’t think the existing patches support determining snap application ID from process ID: only security label. I’m going to be working to improving the snap support in xdg-desktop-portal, so I’ll keep this in mind.

jamesh · July 16, 2018, 5:00am

Looking at the Pipewire source, it is making use of the same xdg-desktop-portal API as the (dropped) PulseAudio patches were:

https://github.com/PipeWire/pipewire/blob/master/src/modules/module-flatpak.c

Again, there is some flatpak specific code to check whether the client is likely to be sandboxed by Flatpak prior to the AccessDevice method call (this time checking for /.flatpak-info in the process’s mount namespace), so we would need changes there to support snaps.

Lin-Buo-Ren · July 16, 2018, 10:19am

Well the documentation did mention that the pulseaudio interface doesn’t allow audio recording all along so I believe the breakage is acceptable as the functionality is never guaranteed.

The feature that allows recording can be considered to be a big privacy hole that should be patched right away. The recording snaps probably should wait instead.

jdstrand · July 18, 2018, 1:41am

We can’t break existing snaps, but we will address the lack of audio recording mediation in some manner.

jamesh · August 7, 2018, 10:36am

I’ve been updating the PulseAudio snap policy module to help with this. My work is available here:

This changes the policy module from simply denying audio recording access to snaps to instead enumerate the snap’s plugs, and check to see if it has a connected plug for either the pulseaudio or audio-record interfaces.

The audio-record interface doesn’t exist yet, but this was based on @jdstrand’s idea to add new locked down audio-playback and audio-record interfaces. If we want to keep our options open to move to PipeWire in the future, these interfaces should probably not be considered to be PulseAudio specific. So we probably want to encourage snap developers to program to the ALSA API, and make it easy to build against a libasound configured as such.

jdstrand · August 7, 2018, 1:24pm

Thanks for this!

At the sprint, it was decided between @niemeyer, myself, @zyga-snapd, @willcooke and @kenvandine that we would indeed introduce the ‘audio-record’ and ‘audio-playback’ interfaces, and keep ‘pulseaudio’. Today, audio-record/audio-playback would contain policy for pulseaudio recording and playback, respectively (though see below), and we would add pipewire or anything else with mediated playback and record to these interfaces as appropriate going forward. The ‘pulseaudio’ interface would allow both playback and record for pulseaudio specifically, and we would deprecate its use going forward (ie, adjust templates to use ‘audio-playback’ instead of ‘pulseaudio’, etc). The description of your implemention is completely inline with these decisions where pulseaudio is doing all the mediation (so the security policy ends up being the same in all 3 interfaces).

As for ALSA, we already have an alsa interface and people are free to use it, but it of course has direct access to the audio devices and therefore requires manual connection. Using a mediated sound service (in this case, pulseaudio, but in the future pipewire or whatever) is key to security, in this case, audio-playback is auto-connected and audio-record is manually connected. I don’t think we should encourage people to use ALSA instead of pulseaudio (or pipewire) since it is uneeded for the common case of playbackc and simple recording. Advanced audio applications (eg, jackd, etc) can of course target ALSA.

I’ve not looked at your pulseaudio code, but it sounds like the next step is a PR to snapd that defines pulseaudio plugs policy in pulseaudio_common.go, then you modify pulseaudio.go to import that, and create simple audio_playback.go and audio_record.go that do the same. Please ping me if you need help with this or when you have a PR to review.

Thanks again!

jdstrand · August 13, 2018, 10:59pm

FYI, https://github.com/snapcore/snapd/pull/5644 audio-playback/audio-record, but it is still under review.

@jamesh - one thing that occurred to me, we didn’t call out in the above and I didn’t see in your patch to pulseaudio for cosmic (maybe I missed it; it was a quick look): classic snaps will have a security label but they won’t have any connected interfaces. Will your patch to pulseaudio handle classic snaps (eg, like skype, slack, etc) ok?

jamesh · August 14, 2018, 4:49am

I hadn’t considered the classic snap case, so I guess we’ll need to get another revision of the pulse out before backporting to bionic and xenial.

I had tried Skype, but on closer inspection it appears to be bypassing Pulse Audio and using the sound devices directly. I suspect it would have failed if it was using PA.

Just so we don’t miss anything out, here’s a description of what the policy module is currently doing:

Determine whether the client is a snapped application, and what the snap name is:
- If !aa_is_enabled(), assume it isn’t a snap.
- Call aa_gettaskcon(), and if that fails assume it isn’t a snap.
- If the label is "unconfined" or does not begin with "snap.", assume it isn’t a snap.
- Find the first dot in the label after "snap.". If there isn’t one, assume it isn’t a snap.
- At this point, we have a snap client whose name is everything after "snap." up until the next dot.
If the client is not a snap, allow recording.
If the client is a snap, asynchronously query snapd to find out whether the snap is allowed to record:
- Call the /v2/interfaces REST API.
- If the call fails, deny recording access.
- If it succeeds, iterate through the plugs array:
  - If snap matches our snap name, interface is one of "pulseaudio" or "audio-record", and the connections array is non-empty, then allow access.
- If there are no matching plugs, deny access.

So the proposed change now would be to add a call to /v2/snaps/${snap_name} before the interface check, and if confinement is classic allow access. Does that sound correct, or do we need any more changes?

jamesh · August 14, 2018, 5:00am

Just a follow up to this: I am not suggesting we encourage snaps to bypass the sound server. Rather I am suggesting we encourage them to program to the ALSA library API. On a traditional Ubuntu desktop, if a piece of software is written against that library and uses the default device then it will be using Pulse Audio via the libasound_module_pcm_pulse.so module.

Assuming PipeWire ends up providing an equivalent module, that same program could talk to that sound server without source code modification or even recompilation.

The open question is whether we could provide an ALSA configuration that would contain modules for Pulse and PipeWire, and select the sound server that was actually in use.

jdstrand · August 14, 2018, 1:03pm

Regarding skype> oh, that’s right, I forgot one of the reasons they are classic is they had trouble getting skype to use pulseaudio (and had trouble configuring alsa correctly to use pulse).

Re the proposed change> yes, before the interface check, if confinement is classic, then allow access.

Which then made me think of devmode> you could just check if ‘classic || devmode’ and allow it, but it might be nice to with devmode, log if not connected. I don’t think this is strictly required, but would be nice.

jdstrand · August 14, 2018, 1:04pm

Ah, yes, I forgot about how alsa can forward to pulseaudio. Sure, this sounds like a great recommendation.

lucyllewy · August 14, 2018, 1:19pm

that’s what my alsa remote parts are supposed to fix, but we’re waiting on some snapcraft changes to actually get it usable: https://github.com/diddledan/snapcraft-alsa

bug: https://bugs.launchpad.net/snapcraft/+bug/1766878

jamesh · August 16, 2018, 8:48am

I’ve been testing out an improvement to the snap policy module that makes the proposed changes:

With these changes, I’m able to record audio when running with a classic snap’s AppArmor profile, while the -0ubuntu2 package prevents it. I think this covers all the cases we needed, so I guess it is time to start work on backporting.