I’ve been investigating how confinement works when you launch a docker workload from a snap that can access the docker snap via the docker interface, and it very quickly became obvious that the container is launched directly by the docker daemon and therefore runs entirely under the confinement provided by the docker snap, with no relation to any of the confinement in the snap using the docker interface. Therefore all containers run under the docker snap confinement.
Someone please correct me if I’m wrong on that point, perhaps I’m missing something, but it seems fairly cut and dried.
In that case, and especially as the docker-support interface seems pretty privileged, we need to implement AppArmor and seccomp profiles in the docker layer to get some comparable security back for the containers themselves. Can I just expect this to work as normal if I define profiles at the docker layer, or is there anything to watch out for here?
One of the first things that comes to mind is, can we somehow leverage the profiles generated by snapd for the standard snap interfaces? I presume using them verbatim would be too much to ask for, as they contain SNAP env variables etc., but with some light modification they could be used as inspiration at least?
Any thoughts or experienced input would be very much appreciated.
This is correct: the container confinement is independent of the snap interface confinement specified by the snap that is launching the container.
This is not entirely true - the containers will run with whatever confinement the container has specified. By default this is the docker-default AppArmor profile, which is very different from the AppArmor profile for the dockerd daemon from the docker snap itself (notably, as you mention, the dockerd daemon’s AppArmor profile will include policy from docker-support, which is super-privileged).
The most desirable thing to do security-wise is to write or otherwise determine the minimum security policies for your containers, and then provide these policies to dockerd so that dockerd applies them to the container when it runs. This is not a trivial task, and is in fact one of the major reasons we suggest using snaps instead of docker containers: snap interfaces make expressing this policy easy and safe, while with docker you are entirely on your own to figure out what is safe and what makes sense for a given application to access.
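As a rough sketch of what "providing these policies to dockerd" looks like from the client side, the snippet below builds a `docker run` command line using docker's `--security-opt apparmor=...` and `--security-opt seccomp=...` options. The profile name `my-container-profile`, the path `/path/to/seccomp.json` and the image are placeholders I made up for illustration, and the AppArmor profile is assumed to be already loaded on the host. Whether the docker snap accepts a given custom profile is something you would need to test.

```python
from typing import List, Optional, Sequence


def docker_run_argv(
    image: str,
    apparmor_profile: Optional[str] = None,
    seccomp_profile_path: Optional[str] = None,
    command: Sequence[str] = (),
) -> List[str]:
    """Build a `docker run` argv with optional per-container confinement."""
    argv = ["docker", "run", "--rm"]
    if apparmor_profile is not None:
        # Name of an AppArmor profile that is already loaded on the host.
        argv += ["--security-opt", f"apparmor={apparmor_profile}"]
    if seccomp_profile_path is not None:
        # Path to a seccomp profile in docker's JSON format.
        argv += ["--security-opt", f"seccomp={seccomp_profile_path}"]
    argv.append(image)
    argv += list(command)
    return argv


if __name__ == "__main__":
    # Hypothetical profile name, path and image, for illustration only.
    argv = docker_run_argv(
        "ubuntu:22.04",
        apparmor_profile="my-container-profile",
        seccomp_profile_path="/path/to/seccomp.json",
        command=["sleep", "60"],
    )
    print(" ".join(argv))
```

Building the argv as a list (rather than a shell string) means it can be handed straight to `subprocess.run` without quoting problems.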
I will also point out that it’s unclear to me whether dockerd has ever really been tested with applying these different profiles to containers, i.e. it could be that the docker snap is not currently allowed to use arbitrary security confinement policies for its containers. Some of it certainly is configurable, but I don’t think it’s quite as configurable as, say, an unconfined dockerd launching container workloads. I think that was by design, but that decision predates my involvement with the docker snap.
Yeah, I totally get that. In this case there are other deciding factors influencing the choice to run some docker workloads.
I can verify that at least the docker-default and unconfined AppArmor profiles work as advertised.
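For anyone wanting to repeat that check, one way is to read the AppArmor label of a process inside the container from `/proc/self/attr/current`, which typically contains something like `docker-default (enforce)` or `unconfined`. The helper below is only a small sketch that parses that label format; the function names are my own. `docker inspect` also reports an `AppArmorProfile` field for a container, which is another way to cross-check.

```python
from typing import Optional, Tuple


def parse_apparmor_label(raw: str) -> Tuple[str, Optional[str]]:
    """Split an AppArmor label such as 'docker-default (enforce)'.

    Returns (profile_name, mode); mode is None for labels without a
    trailing '(mode)' part, e.g. 'unconfined'.
    """
    label = raw.strip("\x00\n ")
    if label.endswith(")") and " (" in label:
        name, _, mode = label.rpartition(" (")
        return name, mode[:-1]
    return label, None


def current_apparmor_label(path: str = "/proc/self/attr/current") -> Tuple[str, Optional[str]]:
    """Read and parse the AppArmor label of the calling process."""
    with open(path, "r") as f:
        return parse_apparmor_label(f.read())


if __name__ == "__main__":
    try:
        print(current_apparmor_label())
    except OSError as exc:
        # No AppArmor on this kernel, or not running on Linux at all.
        print(f"could not read AppArmor label: {exc}")
```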
I will try something custom and see if it works. I guess then it’s just up to us to see if we can take any inspiration from the profiles generated by/for the interfaces.
Just thinking - are there any other parts of the confinement that may come into play here? Presumably all the bind mounts etc. still apply in the context of the docker snap? Anything else that may catch us out?
Ah right, thanks for pointing that out. In that case @jocado, you should be able to launch containers with any security policy, as long as you ensure that the privileged-containers plug on the docker snap is connected on your device.
Well, the pieces in addition to apparmor I would think about would be:
device access: the container is likely put into a device cgroup, so if you need to access things in /dev, you need to add them to the device cgroup for the container
seccomp syscalls: if you need more than the default you will need to configure this, but I don’t remember how fine-grained this is, and you would need to be careful with arguments to syscalls, for example some syscalls seem harmless but can be powerful when given specific arguments
There could be more, I’m not sure what all docker allows you to control.
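To make the two points above a bit more concrete, here is a hedged sketch that generates a seccomp profile in docker's JSON format (deny by default, allow a named list, plus one rule restricted by argument value) and the `--device` arguments for `docker run`. The syscall list and the device path `/dev/ttyUSB0` are purely illustrative assumptions; a real container needs far more syscalls than this, so treat it as a shape to start from, not a working minimal policy.

```python
import json
from typing import Dict, Iterable, List


def allow_rule(names: Iterable[str]) -> Dict:
    """Allow the named syscalls regardless of their arguments."""
    return {"names": sorted(set(names)), "action": "SCMP_ACT_ALLOW"}


def allow_rule_with_arg(name: str, index: int, value: int) -> Dict:
    """Allow one syscall only when argument `index` equals `value`."""
    return {
        "names": [name],
        "action": "SCMP_ACT_ALLOW",
        "args": [{"index": index, "value": value, "op": "SCMP_CMP_EQ"}],
    }


def seccomp_profile(rules: List[Dict]) -> Dict:
    """Deny-by-default profile: anything not matched by a rule fails with an errno."""
    return {"defaultAction": "SCMP_ACT_ERRNO", "syscalls": rules}


def device_args(devices: Iterable[str]) -> List[str]:
    """`docker run` arguments exposing host devices to the container."""
    args: List[str] = []
    for dev in devices:
        args += ["--device", dev]
    return args


if __name__ == "__main__":
    profile = seccomp_profile([
        allow_rule(["read", "write", "exit_group"]),  # illustrative only
        allow_rule_with_arg("personality", index=0, value=0),
    ])
    print(json.dumps(profile, indent=2))
    print(device_args(["/dev/ttyUSB0"]))  # hypothetical device path
```

The JSON output can be saved to a file and passed via `--security-opt seccomp=<file>`. The argument-restricted rule is where the "harmless syscall, dangerous argument" concern gets expressed.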
Could you clarify what bind mounts you mean? Do you mean the mount namespace that dockerd from the docker snap runs inside of? If that’s what you mean, docker should always transition the container to run inside its own mount namespace AFAIK (I don’t think you can turn this off, since it’s how the container gets its rootfs set up).
Thanks both @ijohnson and @jdstrand, it looks like the docker snap itself auto-connects the privileged plug of docker-support, which explains why I was able to set the AppArmor profile to unconfined as a test.
Sorry, I wasn’t clear at all, but you still managed to confirm what I was asking really. Docker processes have their own mount namespace, and that’s that.
It looks like for our first use case, the docker-default AppArmor profile is working at least (well, not preventing any required functionality). Whether we want to try and confine things further or not, I’m not sure at this stage.