Snap-confine issue when trying to run hello-world snap on a 3.4 kernel

@abeato and I are testing snap-confine to make sure that it works on a 3.4 kernel with both apparmor and seccomp turned off in the kernel’s config, since confinement just isn’t going to work on such an old kernel. We’re using the updated version of snap-confine from this PR: https://github.com/snapcore/snapd/pull/3243

What we’re seeing is the following output when running hello-world from the hello-world snap:

test@localhost:~$ /snap/bin/hello-world
cannot open mount namespace of the init process (O_PATH): No such file or directory

This seems to be the line that prints this error, but I am wondering whether anyone has thoughts on what we might be missing, either from our kernel config or in how we’re mounting the filesystem for the device target?

I guess the problem here is that / isn’t mounted with --make-rshared, which is also what the linked code suggests. On Ubuntu Core this is normally done by the initramfs or systemd, but there may be a few problems with the 3.4 kernel. Leaving that for @zyga-snapd to comment. According to https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/Documentation/filesystems/sharedsubtree.txt?h=linux-3.4.y, shared-subtree support should be in 3.4.
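
One way to check the propagation type of / on the device is to read /proc/self/mountinfo, where a shared mount carries a `shared:N` tag in its optional fields. The following is only a minimal sketch; the helper name `propagation_of` is made up for illustration and is not part of snapd.

```python
def propagation_of(mountinfo_text, mount_point="/"):
    """Return 'shared', 'slave', 'unbindable' or 'private' for mount_point.

    Returns None if mount_point is not listed. If the mount point appears
    several times (stacked mounts), the last entry wins.
    """
    result = None
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 7 or fields[4] != mount_point:
            continue
        try:
            # Optional fields sit between field 6 and the "-" separator.
            separator = fields.index("-", 6)
        except ValueError:
            continue  # malformed line, skip it
        optional = fields[6:separator]
        if any(tag.startswith("shared:") for tag in optional):
            result = "shared"
        elif any(tag.startswith("master:") for tag in optional):
            result = "slave"
        elif "unbindable" in optional:
            result = "unbindable"
        else:
            result = "private"
    return result


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        print("/ is", propagation_of(f.read(), "/"))
```

If this reports `private` for /, running `mount --make-rshared /` as root and retrying would be a quick way to test the theory.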


It actually gets further with the kernel in master, but there are other issues; see https://github.com/snapcore/snapd/pull/3300

Just to quote what I said in the PR:

The second issue is most likely caused by the use of a 3.4-based kernel (patched) that doesn’t support fchdir(2) on a file descriptor obtained with open(2) using the O_PATH flag. We are confirming it now.
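
A quick way to confirm this on the device is to try the same sequence directly: open a directory with O_PATH and call fchdir on the descriptor. This is a minimal sketch, not what snap-confine itself does; the function name `fchdir_works_on_o_path` is hypothetical, and it assumes Python 3 with `os.O_PATH` available.

```python
import os


def fchdir_works_on_o_path(path="/"):
    """Return True if fchdir(2) accepts a descriptor opened with O_PATH.

    The current working directory is restored before returning.
    """
    saved = os.open(".", os.O_RDONLY | os.O_DIRECTORY)
    try:
        fd = os.open(path, os.O_PATH | os.O_DIRECTORY)
        try:
            os.fchdir(fd)
            return True
        except OSError:
            # Kernels without the fix are expected to reject the descriptor.
            return False
        finally:
            os.close(fd)
    finally:
        os.fchdir(saved)
        os.close(saved)


if __name__ == "__main__":
    print("fchdir on O_PATH fd works:", fchdir_works_on_o_path("/"))
```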

https://github.com/torvalds/linux/commit/332a2e1244bd08b9e3ecd378028513396a004a24 works as expected, thanks @zyga-snapd !

And now getting :wink: :

$ hello-world
cannot bind-mount the mount namespace file /proc/1602/ns/mnt → hello-world.mnt: Invalid argument
support process for mount namespace capture exited abnormally

Working on that now

Great, can you add the strace for this please?

@zyga-snapd: https://pastebin.canonical.com/187995/

So this fails on [pid 1760] mount("/proc/1755/ns/mnt", "hello-world.mnt", NULL, MS_BIND, NULL) = -1 EINVAL (Invalid argument)

I kind of suspect that we’re missing more kernel patches.

This is what we’ve applied to the kernel so far: https://lists.linuxfoundation.org/pipermail/containers/2012-November/031023.html

I think there may be a separate patch that relates to using bind mounts to capture namespace objects. I really think that at this stage you should have a kernel person look at that though.


@shrirang @timchen119 : any thoughts here?

You can also experiment with simple tools like mount and nsenter to see if this part of the kernel works. If it works with nsenter let me know and we can look for anything that snap-confine can learn and do differently.


Sounds good, thanks @zyga-snapd

@zyga-snapd, this code is interesting: http://stackoverflow.com/questions/34783391/why-i-couldnt-use-mount-bind-proc-pid-ns-mnt-to-another-file-in-ubuntu

Running it in the device:

$ sudo ~/test-mount
mount failed: Invalid argument
$ sudo ~/test-mount 1
mount ok

Ah, indeed. I’m aware of this but forgot to mention it (this is why snap-confine has the extra logic to fork and then capture the mount namespace of a process that has unshared).

But why does it work when I run the sample code but not from snap-confine? :slight_smile:


No idea, but if you look at snap-confine’s ns-support.c, we do largely the same thing.

For some reason the kernel thinks that the failing mount would create a mount loop. After removing that check from

fs/namespace.c:do_loopback()

and adding a hack to skip loading the seccomp profile in snap-confine, snap run finally says hello:

test@localhost:~$ hello-world
Hello World!

Of course removing a check is NOT a solution, but we are now closer to the real fix…

Also, I have enabled SECCOMP in the kernel, but snap-confine is still not able to load the seccomp profile. This needs further investigation.

As discussed on IRC, you need seccomp mode 2 (filter mode), which was introduced in 3.5.
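
To check for filter mode on a given kernel without installing anything, a commonly used probe is to call prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, NULL). As far as I know, EFAULT means the kernel understood the request and only rejected the NULL filter, while EINVAL means filter mode is not there. The sketch below assumes that behaviour; the function names are made up for illustration, and the two constants are copied from the kernel headers.

```python
import ctypes
import errno

PR_SET_SECCOMP = 22        # from <linux/prctl.h>
SECCOMP_MODE_FILTER = 2    # from <linux/seccomp.h>


def classify_probe_errno(err):
    """Interpret the errno from the NULL-filter probe (assumed semantics)."""
    if err == errno.EFAULT:
        return "available"
    if err == errno.EINVAL:
        return "unsupported"
    return "inconclusive"


def probe_seccomp_filter():
    """Probe for seccomp mode 2 without installing a filter."""
    libc = ctypes.CDLL(None, use_errno=True)
    ret = libc.prctl(
        ctypes.c_int(PR_SET_SECCOMP),
        ctypes.c_ulong(SECCOMP_MODE_FILTER),
        ctypes.c_ulong(0),  # NULL filter pointer, so nothing can be loaded
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
    )
    if ret == 0:
        return "inconclusive"  # not expected with a NULL filter
    return classify_probe_errno(ctypes.get_errno())


if __name__ == "__main__":
    print("seccomp filter mode:", probe_seccomp_filter())
```

On the unpatched 3.4 kernel this would be expected to print `unsupported`, even with CONFIG_SECCOMP enabled.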

Could we get a patch for snap-confine to allow for a system to disable seccomp and still have snap-confine be ok with this? This would be the equivalent patch to the one that @zyga-snapd already did for us when apparmor is disabled.