Snap-confine issue when trying to run hello-world snap on a 3.4 kernel

jhodapp · May 10, 2017, 6:07pm

@abeato and I are trying to test snap-confine to make sure that it works on a 3.4 kernel. Both apparmor and seccomp are turned off in the kernel’s config, since confinement just isn’t going to work on such an old kernel. We’re using the updated version of snap-confine from this PR: https://github.com/snapcore/snapd/pull/3243

What we’re seeing is the following output when running hello-world from the hello-world snap:

test@localhost:~$ /snap/bin/hello-world
cannot open mount namespace of the init process (O_PATH): No such file or directory

This seems to be the line that prints this error, but I am wondering whether anyone has thoughts on what we might be missing, either in our kernel config or in how we’re mounting the filesystem for the target device?

morphis · May 11, 2017, 6:07am

I guess the problem here is that / isn’t mounted with --make-rshared, which is also what the linked code suggests. On Ubuntu Core this should normally be done by the initramfs or systemd, but there could be a few problems with the 3.4 kernel. I’ll leave that for @zyga-snapd to comment on. According to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/Documentation/filesystems/sharedsubtree.txt?h=linux-3.4.y, shared-subtree support should be in 3.4.

abeato · May 11, 2017, 10:09am

It actually gets further with the kernel in master. But there are other issues; see https://github.com/snapcore/snapd/pull/3300

zyga-snapd · May 11, 2017, 11:02am

Just to quote what I said in the PR:

The second issue is most likely caused by the use of a 3.4-based kernel (patched) that doesn’t support fchdir(2) on a file descriptor obtained with open(2) using the O_PATH flag. We are confirming it now.
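
A quick way to confirm this on the device, independent of snap-confine, is a small probe that opens a directory with O_PATH and calls fchdir(2) on the result. This is only a sketch; the name probe_fchdir_opath is hypothetical. Note that a successful call changes the working directory of the calling process.

```c
#define _GNU_SOURCE             /* needed for O_PATH */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if fchdir(2) accepts an O_PATH descriptor for `dir`,
 * otherwise the errno of the call that failed. */
int probe_fchdir_opath(const char *dir)
{
    int fd = open(dir, O_PATH | O_DIRECTORY | O_CLOEXEC);
    int err = 0;

    if (fd < 0)
        return errno;           /* open itself failed, e.g. dir is missing */
    if (fchdir(fd) < 0)
        err = errno;            /* the kernel refuses O_PATH descriptors here */
    close(fd);
    return err;
}
```

Calling probe_fchdir_opath("/") and printing strerror() of a non-zero result should show whether the patched 3.4 kernel behaves differently from a recent one.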

abeato · May 11, 2017, 1:03pm

With https://github.com/torvalds/linux/commit/332a2e1244bd08b9e3ecd378028513396a004a24 applied, that works as expected, thanks @zyga-snapd!

And now we’re getting:

$ hello-world
cannot bind-mount the mount namespace file /proc/1602/ns/mnt -> hello-world.mnt: Invalid argument
support process for mount namespace capture exited abnormally

Working on that now

zyga-snapd · May 11, 2017, 1:22pm

Great, can you add the strace for this please?

abeato · May 11, 2017, 1:24pm

@zyga-snapd: https://pastebin.canonical.com/187995/

zyga-snapd · May 11, 2017, 1:25pm

So this fails on:

[pid 1760] mount("/proc/1755/ns/mnt", "hello-world.mnt", NULL, MS_BIND, NULL) = -1 EINVAL (Invalid argument)

I kind of suspect that we’re missing more kernel patches.

jhodapp · May 11, 2017, 1:39pm

This is what we’ve applied to the kernel so far: https://lists.linuxfoundation.org/pipermail/containers/2012-November/031023.html

zyga-snapd · May 11, 2017, 1:52pm

I think there may be a separate patch that relates to using bind mounts to capture namespace objects. I really think that at this stage you should have a kernel person look at that though.

jhodapp · May 11, 2017, 1:54pm

@shrirang @timchen119: any thoughts here?

zyga-snapd · May 11, 2017, 1:58pm

You can also experiment with simple tools like mount and nsenter to see if this part of the kernel works. If it works with nsenter let me know and we can look for anything that snap-confine can learn and do differently.

jhodapp · May 11, 2017, 1:58pm

Sounds good, thanks @zyga-snapd

abeato · May 11, 2017, 2:28pm

@zyga-snapd, this code is interesting: http://stackoverflow.com/questions/34783391/why-i-couldnt-use-mount-bind-proc-pid-ns-mnt-to-another-file-in-ubuntu

Running it on the device:

$ sudo ~/test-mount
mount failed: Invalid argument
$ sudo ~/test-mount 1
mount ok

zyga-snapd · May 11, 2017, 2:31pm

Ah, indeed. I’m aware of this but forgot to mention it. This is why snap-confine has the extra logic to fork and then capture the mount namespace of a process that has unshared.
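
The fork-and-capture idea can be sketched roughly as follows. This illustrates the general technique only and is not the code in ns-support.c: the name capture_child_mnt_ns is hypothetical, the roles of parent and child are chosen for brevity, it must run as root, and the target must already exist as a regular file. Any extra mount preparation that snap-confine does around the target is left out.

```c
#define _GNU_SOURCE             /* needed for unshare() and CLONE_NEWNS */
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that unshares its mount namespace, then bind-mounts the
 * child's /proc/<pid>/ns/mnt onto `target` from the original namespace.
 * Returns 0 on success and -1 on any failure. */
int capture_child_mnt_ns(const char *target)
{
    int ready[2], done[2];
    char src[64];
    char c = 0;
    int rc = -1;
    pid_t pid;

    if (pipe(ready) < 0)
        return -1;
    if (pipe(done) < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }

    pid = fork();
    if (pid == 0) {
        /* Child: enter a new mount namespace, report, then wait. */
        close(ready[0]);
        close(done[1]);
        if (unshare(CLONE_NEWNS) < 0 || write(ready[1], "x", 1) != 1)
            _exit(1);
        if (read(done[0], &c, 1) < 0)   /* returns 0 once the parent closes its end */
            _exit(1);
        _exit(0);
    }

    /* Parent, or fork failure: close the pipe ends that belong to the child. */
    close(ready[1]);
    close(done[0]);
    if (pid > 0 && read(ready[0], &c, 1) == 1) {
        snprintf(src, sizeof src, "/proc/%d/ns/mnt", (int)pid);
        rc = mount(src, target, NULL, MS_BIND, NULL);
    }
    close(ready[0]);
    close(done[1]);                     /* lets the child exit */
    if (pid > 0)
        waitpid(pid, NULL, 0);
    return rc;
}
```

The point of the two pipes is ordering: the mount is attempted only after the child has unshared, and the child stays alive until the mount has been attempted.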

abeato · May 11, 2017, 2:32pm

But why does it work when I run the sample code but not from snap-confine?

zyga-snapd · May 11, 2017, 3:34pm

No idea, but if you look at snap-confine’s ns-support.c, we do largely the same thing.

abeato · May 12, 2017, 12:28pm

For some reason the kernel thinks that the failing mount would create a mount loop. After removing the check from

fs/namespace.c:do_loopback()

and adding a hack that skips loading the seccomp profile in snap-confine, snap run finally says hello:

test@localhost:~$ hello-world
Hello World!

Of course, removing a check is NOT a solution, but we are now closer to the real fix…

Also, I have enabled SECCOMP in the kernel, but snap-confine still cannot load the seccomp profile. This needs further investigation.

jdstrand · May 12, 2017, 12:46pm

As discussed on IRC, you need seccomp mode 2, which was introduced in 3.5.
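
For anyone who wants to check this on a given kernel, a commonly used probe is to ask for mode 2 with a NULL filter and look at errno. The sketch below assumes kernel headers new enough to provide <linux/seccomp.h>; the function name seccomp_filter_available is made up for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <linux/seccomp.h>

/* Returns 1 if seccomp mode 2 (filter) looks available, 0 if the kernel
 * rejects the mode with EINVAL, and -1 on any other outcome. */
int seccomp_filter_available(void)
{
    /* A NULL filter can never be installed, so the process is not changed.
     * A kernel that knows SECCOMP_MODE_FILTER fails while copying the
     * filter (EFAULT); one that does not know the mode fails with EINVAL. */
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, NULL, 0, 0) == 0)
        return -1;              /* not expected with a NULL filter */
    if (errno == EFAULT)
        return 1;
    if (errno == EINVAL)
        return 0;
    return -1;
}
```

On a 3.4 kernel this would be expected to return 0 even with CONFIG_SECCOMP enabled, since only mode 1 exists there.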

jhodapp · May 12, 2017, 1:13pm

Could we get a patch for snap-confine that lets it run on a system where seccomp is disabled? This would be the equivalent of the patch that @zyga-snapd already did for us for the case where apparmor is disabled.