Guidance on snap interface to load a device-tree overlay on RPi Core18

alexmurray · February 22, 2019, 1:42pm

The Iridium SLB9670 TPM2 is an evaluation board designed for the Raspberry Pi 2/3. The driver for it is already in the bionic/raspi2 (aka core18) kernel tree. However, it also requires a device-tree overlay which is not present in the core18 kernel tree, AND even if it were, a change would be needed to the /boot/uboot/config.txt bootloader configuration to enable it to be loaded.

When this overlay is loaded, the SPI interface is claimed by the TPM2 SPI driver, so even if the TPM2 device is not present, this stops the SPI interface being used by other drivers. It is therefore not suitable to enable this for all users of Ubuntu Core on the Raspberry Pi.

However, device tree overlays can be dynamically loaded on-the-fly, so I have developed a snap application which ships with the required device tree overlay and some simple scripting to dynamically load this, along with the required kernel module tpm_tis_spi as a snap one-shot daemon.

We already provide the kernel-module-control interface to allow snaps to load modules.

The problem I am facing is that there is no snap interface which exposes the required configfs path within sysfs to allow the device tree overlay to be loaded - so currently AppArmor is blocking access to this, and as a result the snap must be a devmode snap.

To enable this to be strictly confined, either this path needs to be added to an existing interface (perhaps kernel-module-control?) or a new interface (device-tree-control?) is needed to support this use-case. The other option I had considered, but discarded, was an interface which allowed editing /boot/uboot/config.txt to list the new overlay AND writing the shipped device-tree overlay to /boot/uboot/overlays. However, this is likely to be clumsy (rewriting conf files) and gives the snap too much authority to cause other problems by changing the general bootloader configuration.

jamesh · February 22, 2019, 1:55pm

Could you use the system-files interface? Something like the following:

plugs:
  system-files:
    write:
    - /sys/kernel/config/whatever

Use of this interface requires a store assertion to grant access, but this is probably easier than building a brand new interface type.

alexmurray · February 22, 2019, 2:00pm

Ok will give it a go.

FYI the semantics are a mkdir /sys/kernel/config/device-tree/overlays/$FOO at which point the kernel automatically creates a node /sys/kernel/config/device-tree/overlays/$FOO/dtbo which we then need to write the device-tree blob into - so will system-files support this workflow?
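For illustration, a minimal sketch of that workflow as a shell function. The function name `load_overlay`, the overridable `OVERLAYS_DIR` variable and the overlay name in the usage comment are hypothetical; only the configfs path and the mkdir-then-write-`dtbo` sequence come from the description above.

```shell
#!/bin/sh
# Sketch: load a compiled overlay blob through the configfs interface.
# OVERLAYS_DIR can be overridden (hypothetical knob) so the logic can be
# dry-run against an ordinary directory.
OVERLAYS_DIR="${OVERLAYS_DIR:-/sys/kernel/config/device-tree/overlays}"

load_overlay() {
    name="$1"   # directory name to create under overlays/ ($FOO above)
    blob="$2"   # path to the compiled overlay blob shipped in the snap

    [ -r "$blob" ] || { echo "cannot read $blob" >&2; return 1; }

    # Creating the directory makes the kernel populate it with a 'dtbo' node.
    mkdir "$OVERLAYS_DIR/$name" || return 1

    # Writing the blob into that node is what applies the overlay.
    cat "$blob" > "$OVERLAYS_DIR/$name/dtbo"
}

# Example (hypothetical names):
#   load_overlay tpm-slb9670 "$SNAP/overlays/tpm-slb9670.dtbo"
```

Both the mkdir and the write happen under /sys/kernel/config/device-tree/overlays, so any confinement rule has to cover directory creation as well as writes to the generated node.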

jamesh · February 22, 2019, 2:09pm

A plug defined like:

system-files:
  write:
    - /sys/kernel/config/device-tree/overlays

Should add the following snippet to your AppArmor profile:

/sys/kernel/config/device-tree/overlays{,/,/**} rwkl,

Perhaps the quickest way to test this would be to add the line to your snap’s policy in /var/lib/snapd/apparmor/profiles, and then reload the policy with sudo apparmor_parser -r <filename>.

Play around with the path to see what works, and then try using that in the plug.
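One way to script that experiment is sketched below. The helper name `add_rule_to_profile` and the profile file name in the usage comment are assumptions (check what is actually present in the profiles directory); the rule and the `apparmor_parser -r` reload are the ones given above. snapd regenerates these profiles, so a hand edit like this is only good for testing.

```shell
#!/bin/sh
# Sketch: insert one AppArmor rule just before the final closing brace
# of a profile file, so it lands inside the profile body.
add_rule_to_profile() {
    profile="$1"   # path to the profile file
    rule="$2"      # full rule text, including the trailing comma

    tmp="$(mktemp)" || return 1
    awk -v rule="$rule" '
        { lines[NR] = $0 }
        /^[}][ \t]*$/ { last = NR }      # remember the last line that is just "}"
        END {
            for (i = 1; i <= NR; i++) {
                if (i == last) print "  " rule
                print lines[i]
            }
        }' "$profile" > "$tmp" && cat "$tmp" > "$profile"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example (hypothetical profile name), run as root:
#   p=/var/lib/snapd/apparmor/profiles/snap.my-snap.my-daemon
#   add_rule_to_profile "$p" '/sys/kernel/config/device-tree/overlays{,/,/**} rwkl,'
#   apparmor_parser -r "$p"
```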

alexmurray · February 27, 2019, 5:46am

Thanks for the help, but I am not seeing anything in the AppArmor profile after adding this stanza - does this only get added once the exception has been granted by the store? @jdstrand - any ideas?

jamesh · February 27, 2019, 6:00am

Is the plug connected? It isn’t enough for it to simply be present on the snap.

ijohnson · February 27, 2019, 11:24am

This sounds like the kind of thing that would necessitate building your own gadget/image. I personally have never fiddled with creating a gadget snap with a different device tree overlay, but you can see how it’s done for a raspberry pi 3 here: https://github.com/snapcore/pi3-gadget/blob/master/snapcraft.yaml#L58
I imagine all you would need to do is stage the overlay you need into $SNAPCRAFT_PART_INSTALL/boot-assets/overlays/your-favorite-overlay.dtbo.
There you can also see how to modify the /boot/config.txt, though admittedly I’m not sure if that is different from the /boot/uboot/config.txt (I think it’s the same thing but I’m not sure).

alexmurray · February 27, 2019, 12:52pm

No, since I can’t actually install the snap successfully with strict confinement: I want this to be a oneshot daemon service, so on snap install it tries to run the service, this fails due to AppArmor denials, and the snap is considered to have failed to install. So then I can’t even manually connect the interface. The other option is to not make it a oneshot service, but then we lose the automagic hardware enablement which is the whole point of the snap.

alexmurray · February 27, 2019, 12:54pm

Sure, but do we really envision N different custom gadget snaps for the RPi for all the various N hardware devices which people might want to add on (and each published by some random publisher)? I am pretty close to achieving this - just hindered, from what I can tell, by not having the store assertions in place to get it to autoconnect on install and allow the system-files / kernel-module-control permissions.

chipaca · February 27, 2019, 12:56pm

For testing, your install hook should be able to do a snapctl stop --disable the-daemon, if that helps.
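A minimal sketch of such a hook (it would live at snap/hooks/install in the snapcraft project). The app name `overlay-loader` is hypothetical and stands in for whatever the daemon is called in snapcraft.yaml; the `snapctl stop --disable` call is the one suggested above.

```shell
#!/bin/sh
# Sketch of an install hook: keep the oneshot daemon from running (and
# failing under AppArmor) before the interfaces have been connected.

disable_daemon() {
    # snapctl addresses services as <snap>.<app>
    snapctl stop --disable "$1"
}

# SNAP_NAME is set by snapd when the hook runs; outside a snap this is a no-op.
if [ -n "${SNAP_NAME:-}" ]; then
    disable_daemon "$SNAP_NAME.overlay-loader"   # hypothetical app name
fi
```

Once the interfaces are connected, the service has to be started and re-enabled by hand.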

alexmurray · February 27, 2019, 2:02pm

Thanks for the suggestion - with this in place it now works as strictly confined after manually installing the snap, connecting the interfaces and then reenabling:

snap install ./wfuhuMg7FCMYJGocE7AUGCdiW6vAkiH8_19.snap --dangerous
snap connect rpi-tpm2-slb9670-hwe:system-files :system-files
snap connect rpi-tpm2-slb9670-hwe:kernel-module-control :kernel-module-control
snap start --enable rpi-tpm2-slb9670-hwe

And then it automatically loads on boot as expected too!

alexmurray · February 27, 2019, 2:55pm

@jdstrand can you comment on whether system-files write access to the whole device-tree overlays path is reasonable in this case (as @jamesh suggested in comment 4 above)?

zyga-snapd · February 27, 2019, 3:56pm

I am worried about the consequences of using system-files for injecting the dtb this way. What is the impact on the SPI interface in your overlay? What is the general impact in case a different overlay is used?

While inconvenient I strongly prefer to force this to go through a new model, a new gadget and, if required, changes to snapd to inhibit creation of implicit slots via gadget language.

ijohnson · February 27, 2019, 4:37pm

Well it makes sense to me that you would have your own gadget for this specific type of device, since it requires such extensive changes like loading kernel modules, modifying the boot config.txt, and adding its own device tree overlay. Other simpler devices that just use pre-existing /dev/xyz should be usable from the slot interfaces exposed by the gadget such as gpio, i2c, etc.

To me, the ideal thing would be to have all the support for this device baked into the gadget (either a custom gadget or perhaps upstream support for this device into the official gadget since it seems well supported upstream), because providing all these permissions to a snap seems excessive. If necessary, one could set something up such that the device tree overlay is only loaded dynamically by the gadget snap or by snapd when a slot exposed by the gadget for this specific device is connected to something so that a snap could consume this device specifically. This way the snap that wants to talk to this TPM module would simply plug spi-tpm (strawman name) which is slotted by the gadget snap, and upon connection the device tree overlay is loaded and this driver takes over the SPI bus.

This point specifically is why I think that loading this driver and device tree overlay shouldn’t be allowed by a snap, because if a user has a snap installed that’s using the SPI bus and is connected to the spi slot from the gadget snap, why should another snap be allowed to “take over” the SPI bus by loading this driver + device tree overlay? I think that since there’s a conflict between normal SPI usage and this device, that conflict should be mediated by snapd in some way, or at least handled in the gadget so that the only SPI slot exposed is for this device and is only connected to the specific consumer of this snap, and there’s no opportunity for another snap to connect to the SPI bus normally.

ijohnson · February 27, 2019, 4:38pm

Also note that last I heard, the general recommendation was never to use kernel-module-control, and if you need a custom kernel module, you needed to put the kernel module into a custom kernel snap and use that on the core device, simply because kernel-module-control is so powerful.

alexmurray · February 28, 2019, 4:34am

@ijohnson - thanks for the advice. I am still having trouble understanding how a specific gadget snap is the best idea here. In this case we added a TPM module, so we need to use gadget snap pi-tpm (strawman name). But if I then add some other module which also needs custom device-tree / driver elements etc, surely we should have a second gadget snap which adds support for that device in the same manner. Since we can only have 1 gadget snap installed, we end up with a combinatorial explosion of NxN gadget snaps that would need to be produced to support all the different hardware elements which might be added. This is why I really want to pursue the idea of a single snap being able to do hardware enablement on its own, to support cases like this where the devices are removable etc.

As for the device claiming the SPI bus etc - there can only be one device connected to the SPI interface - and so then it is up to the user to ensure they don’t install a hardware enablement snap for a piece of hardware which is not connected - OR in that case to disable it.

Finally, for kernel-module-control - I really want to try and avoid having to have custom kernels / gadgets just to enable a particular hardware add-on module - I just can’t see how this is scalable - whereas the case of an additional snap which simply does the small amount of work to enable the hardware (as is the case here) seems like a much simpler solution.

jdstrand · March 5, 2019, 5:40pm

Well, that is up for discussion with the reviewers, but I would actually kick this up to an architect and have @pedronis and/or @niemeyer comment.

@pedronis/@niemeyer: the short summary is basically if it is ok for a snap to modify the device-tree overlay and if so, how this is exposed to snaps? If not, how to support @alexmurray’s use case?

For the first question, AIUI, while the device tree mechanism is path-based and therefore potentially apparmor-friendly, the paths used can be arbitrary (@alexmurray - correct me here) such that different tpm add on boards will use different overlay paths. This means we can’t really have a function-specific interface like device-tree-tpm2 or similar (the idea being that the snap would have access to only this area, or ideally snapd would create the overlay on behalf of the snap). Because we can’t easily do this, the result is we end up granting very powerful rules for modifying many things in the device tree, which is something we’ve avoided since, again AIUI, you can destroy the system and/or completely take it over with arbitrary device tree overlay access.

For the second question, I agree with @zyga-snapd that something else is probably required, that needs design. Perhaps in the shortest of terms we can grant @alexmurray’s snap the ability to write to the specific area his snap needs to write to for this add on board via the system-files interface. This is not maintainable long term though and there is no mechanism for saying that the other spi devices will be inaccessible to snaps that might have the spi interface connected. So there are several things to consider IME:

1. gracefully dealing with device overlays that break other interfaces (eg, this tpm device overlay claiming spi)
2. how to support device overlays for arbitrary add on boards for IoT tinkerers
3. how to expose this to snaps

When considering this, I think it is vitally important to acknowledge that we have the gadget snap mechanism that should work perfectly for this for production IoT devices that will be deployed in the field. What is missing is something to accommodate developers/tinkerers who are using the raspi or similar to prove their designs/etc and are using flexible hardware that supports add-on boards.

While writing this, for ‘2’ and ‘3’ I was thinking that snapd should do the device overlay operation on behalf of the snap. Perhaps this is a new device-overlay backend that interfaces could tap into where snapd upstream enumerates the add-on boards that we support and their corresponding overlays which might tie into the warnings system for things like claiming all of spi. Another idea is to simply acknowledge that this is all officially supported via the gadget snap for proper IoT devices, but expose a snap set command in the core/snapd snap that tinkerers/developers can use to feed a device overlay to the device that snapd will manage and create on boot (this could even be part of the revert logic such that snapd would back out a device overlay if the system failed to boot). There are likely other ideas…

ijohnson · March 5, 2019, 6:10pm

So the way that I have seen this work in commercial deployments is that you have a single logical “device” with all the “sub-devices” added to the gadget snap - i.e. your single gadget specific to this deployment has the TPM drivers, maybe some special GPIO pins, etc. all put into the gadget snap (or possibly the kernel snap for out-of-tree drivers and such) and exposed via slots in the gadget snap for userspace snaps to connect to. So this way instead of having all the combinatorial explosion gadget snaps, you would only have as many gadget snaps as you have actual “sub-devices” connected to the logical “device”.

This handles both the case of prototyping/development, as you can easily create your own image from the gadget with ubuntu-image, as well as commercial deployment as with a commercial deployment you will have a finite set of actual models you deploy and you can either handle all cases with a single gadget or maintain separate gadgets for the set of models you maintain.

I will grant that building a custom image/gadget snap is potentially more work than it needs to be, but better tooling around that is IMO the better, more maintainable solution than providing all these accesses to snaps.

jdstrand · March 5, 2019, 7:11pm

Maybe I’m missing something but the issue isn’t about connecting the interfaces, it is about conditionally applying the device tree overlay dependent on when the tpm add-on board is present. This is needed because not doing so means that SPI will be broken for everyone that uses this gadget. I’m told (@alexmurray, correct me), that board presence is not discoverable before the device tree overlay is applied. This makes it impossible to have a single gadget snap that includes this overlay that has functional spi for anything other than tpm.

(In saying that, I wonder if it is possible to unconditionally apply the overlay, then detect if the device is there, then remove the overlay. Does this make SPI functional again? That is admittedly a bit wonky.)
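Purely as a sketch of what that "apply, probe, remove" idea could look like in shell. Everything here is an assumption for illustration: the function name, the `OVERLAYS_DIR` and `PROBE_TRIES` variables, the use of a device node such as /dev/tpm0 as the presence check, and the retry count. Whether removing the overlay actually leaves SPI usable again is exactly the open question and is not something this sketch establishes.

```shell
#!/bin/sh
# Sketch: apply an overlay, wait briefly for the device node, and remove
# the overlay again if the device never shows up.
OVERLAYS_DIR="${OVERLAYS_DIR:-/sys/kernel/config/device-tree/overlays}"

probe_overlay() {
    name="$1"      # overlay directory name
    blob="$2"      # compiled overlay blob
    devnode="$3"   # node expected once the driver binds, e.g. /dev/tpm0 (assumption)

    mkdir "$OVERLAYS_DIR/$name" || return 1
    cat "$blob" > "$OVERLAYS_DIR/$name/dtbo" || return 1

    # Give the driver a few seconds to probe; the retry count is arbitrary.
    tries=0
    while [ "$tries" -lt "${PROBE_TRIES:-5}" ]; do
        [ -e "$devnode" ] && return 0
        sleep 1
        tries=$((tries + 1))
    done

    # Device never appeared: removing the configfs directory removes the overlay.
    rmdir "$OVERLAYS_DIR/$name"
    return 2
}
```

A return value of 0 means the overlay was kept, 2 means it was applied and then backed out.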

alexmurray · March 6, 2019, 12:13am

So there are a couple of issues here, as @jdstrand highlights - do we want to support allowing non-gadget snaps to act in this ‘dynamic’ gadget role? @pedronis @niemeyer can you comment on this at all? If not, then the only way forward is a new gadget snap specifically to enable this hardware device; however, as outlined above, I can’t see how this is scalable across the RPi ecosystem, where there are potentially many different add-on devices that might be used and combined in arbitrary ways. Also, if a gadget snap is the only way forward, would Canonical publish this (so that it could get maintained along with the standard pi gadget snap)?

Much easier IMO if we can find a way for snaps to perform this hardware enablement role directly - it is a lot more flexible for this kind of use-case. If we do then want snaps to be able to perform some of this functionality, what kind of interface do we expose? @jdstrand is right that allowing dynamic loading of arbitrary device-tree overlays is quite a powerful functionality that we can’t easily constrain.

The other option is to make this not so dynamic, where a snap could install a device-tree overlay (these live under /boot/uboot/overlays) and then snapd could enable this by directly editing /boot/uboot/config.txt to have the boot process load the new overlay at boot time, rather than leaving it up to a snap to do it at runtime. This then makes the device available earlier in the boot sequence so could be more flexible. Also this allows an end-device user to more easily introspect the installed overlays by simply looking in the usual location for them. From a security and confinement point-of-view, this is still no less powerful than the dynamic approach BUT it does mean snapd could have some say in the process if desired (introspect the overlay etc?).
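To make the config.txt side of that concrete, here is a rough sketch of the edit involved. It is not how snapd does or would do it; the function name and the overlay name in the usage comment are hypothetical. `dtoverlay=<name>` is the standard Raspberry Pi config.txt directive for loading an overlay from the overlays directory at boot.

```shell
#!/bin/sh
# Sketch: idempotently add a dtoverlay= line to a Raspberry Pi config.txt.
enable_boot_overlay() {
    config="$1"    # e.g. /boot/uboot/config.txt
    overlay="$2"   # overlay name without the .dtbo suffix (hypothetical)
    line="dtoverlay=$overlay"

    # -x: match the whole line, -F: fixed string, -q: no output
    if grep -qxF "$line" "$config"; then
        return 0   # already enabled, nothing to do
    fi
    printf '%s\n' "$line" >> "$config"
}

# Example (hypothetical overlay name):
#   enable_boot_overlay /boot/uboot/config.txt tpm-slb9670
```

Keeping the edit to a single, exactly matched line is what would let a managing process back it out again if the system failed to boot.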