Qt 5.10 + linkat denials = broken KDE snaps

I’ve now also tried with --devmode which should disable confinement, but even then something is wrong with the syscall. It gets allowed but comes back -2, which is ENOENT (I think?) ultimately still rendering temporary files broken. This does work fine when using the same Qt build on xenial outside snapd though.

Mai 17 14:40:28 ajax audit[1750]: AVC apparmor="ALLOWED" operation="open" profile="snap.okular.okular" name="/etc/xdg/QtProject/qtlogging.ini" pid=1750 comm="okular" requested_mask="r" denied_mask="r" fsuid=1000 ouid=0
Mai 17 14:40:28 ajax audit[1750]: SYSCALL arch=c000003e syscall=2 success=yes exit=4 a0=717478 a1=80000 a2=1b6 a3=7179e0 items=1 ppid=11578 pid=1750 auid=1000 uid=1000 gid=1000 euid=1000 suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts6 ses=2 comm="okular" exe="/snap/okular/x17/usr/bin/okular" key=(null)
Mai 17 14:40:28 ajax audit[1750]: AVC apparmor="ALLOWED" operation="open" profile="snap.okular.okular" name="/proc/sys/kernel/core_pattern" pid=1750 comm="okular" requested_mask="r" denied_mask="r" fsuid=1000 ouid=0
Mai 17 14:40:28 ajax audit[1750]: SYSCALL arch=c000003e syscall=2 success=yes exit=7 a0=7b3a18 a1=80000 a2=1b6 a3=7f1a48f6a0d2 items=1 ppid=11578 pid=1750 auid=1000 uid=1000 gid=1000 euid=1000 suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts6 ses=2 comm="okular" exe="/snap/okular/x17/usr/bin/okular" key=(null)
Mai 17 14:40:28 ajax audit[1750]: AVC apparmor="ALLOWED" operation="link" info="Failed name lookup - deleted entry" error=-2 profile="snap.okular.okular" name="/home/me/snap/okular/x17/.local/config/session/#35531993" pid=1750 comm="okular" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000
Mai 17 14:40:28 ajax audit[1750]: AVC apparmor="ALLOWED" operation="link" profile="snap.okular.okular" name="/home/me/snap/okular/x17/.local/config/session/okular_10616a6178000152656082800000081640087_1526560828_381499" pid=1750 comm="okular" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000 target="/home/me/snap/okular/x17/.local/config/session/#35531993"
Mai 17 14:40:28 ajax audit[1750]: SYSCALL arch=c000003e syscall=265 success=no exit=-2 a0=ffffff9c a1=14cb1e8 a2=ffffff9c a3=14be428 items=2 ppid=11578 pid=1750 auid=1000 uid=1000 gid=1000 euid=1000 suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts6 ses=2 comm="okular" exe="/snap/okular/x17/usr/bin/okular" key=(null)
Mai 17 14:40:28 ajax audit[1750]: AVC apparmor="ALLOWED" operation="link" info="Failed name lookup - deleted entry" error=-2 profile="snap.okular.okular" name="/home/me/snap/okular/x17/.local/config/#35531995" pid=1750 comm="okular" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000
Mai 17 14:40:28 ajax audit[1750]: AVC apparmor="ALLOWED" operation="link" profile="snap.okular.okular" name="/home/me/snap/okular/x17/.local/config/okularrc" pid=1750 comm="okular" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000 target="/home/me/snap/okular/x17/.local/config/#35531995"
Mai 17 14:40:28 ajax audit[1750]: SYSCALL arch=c000003e syscall=265 success=no exit=-2 a0=ffffff9c a1=14d35b8 a2=ffffff9c a3=1504d08 items=2 ppid=11578 pid=1750 auid=1000 uid=1000 gid=1000 euid=1000 suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts6 ses=2 comm="okular" exe="/snap/okular/x17/usr/bin/okular" key=(null)

Note that seccomp complain mode is now implemented for kernels that support it (notice that the seccomp violations all list ‘success=yes’). This does not explain the -2 return.

Is it possible that an okular process was started and continued to run when you did a snap install --dangerous /path/to/new/okular/snap? If so, do things work correctly after you kill all okular processes and then start them anew?

If not, can you paste the following:

  • the output of snap version
  • the contents of /var/lib/snapd/apparmor/profiles/snap.okular.okular
  • the contents of /var/lib/snapd/seccomp/bpf/snap.okular.okular.src

Note that ‘linkat’ and ‘open’ are in the default seccomp policy. It is strange that you have denials for those. Also note that ‘l’ is allowed for SNAP_USER_DATA: owner @{HOME}/snap/@{SNAP_NAME}/@{SNAP_REVISION}/** wl,. These policy violations are odd.

As requested:

$ snap version
snap    2.32.6
snapd   2.32.6
series  16
debian  
kernel  4.16.0-1-amd64

This is on debian testing.

I am very certain okular wasn’t running across the install call(s).

For good measure I’ve now run install --dangerous on a cold booted system with definitely no okular running and it has the same problem.

I did notice something though. When attempting to print a PDF, the underlying code uses tempfiles in a much more straightforward manner as files in /tmp/. What struck me as odd was that the paths I see in the snap’s tmp (snap run --shell okular) do not match the paths logged by the audits. I am not sure if that is indicative of anything, but I found it a bit odd.

Here’s a short paste from the journal:

Mai 18 11:28:43 ajax audit[29318]: AVC apparmor="DENIED" operation="link" info="Failed name lookup - deleted entry" error=-2 profile="snap.okular.okular" name="/tmp/#48417903" pid=29318 comm="okular" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000
Mai 18 11:28:43 ajax audit[29318]: AVC apparmor="DENIED" operation="link" profile="snap.okular.okular" name="/tmp/okular_EWsALl.ps" pid=29318 comm="okular" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000 target="/tmp/#48417903"
Mai 18 11:28:43 ajax audit[29318]: SYSCALL arch=c000003e syscall=265 success=no exit=-2 a0=ffffff9c a1=2c6b2f8 a2=ffffff9c a3=2c6fcd8 items=0 ppid=6560 pid=29318 auid=1000 uid=1000 gid=1000 euid=1000 suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts1 ses=3 comm="okular" exe="/snap/okular/x22/usr/bin/okular" key=(null)
Mai 18 11:28:43 ajax audit[29318]: AVC apparmor="DENIED" operation="link" info="Failed name lookup - deleted entry" error=-2 profile="snap.okular.okular" name="/tmp/#48417903" pid=29318 comm="okular" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000
Mai 18 11:28:43 ajax audit[29318]: AVC apparmor="DENIED" operation="link" profile="snap.okular.okular" name="/tmp/okular_xgIkSh.ps" pid=29318 comm="okular" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000 target="/tmp/#48417903"
Mai 18 11:28:43 ajax audit[29318]: SYSCALL arch=c000003e syscall=265 success=no exit=-2 a0=ffffff9c a1=2c6b2f8 a2=ffffff9c a3=2c6fcd8 items=0 ppid=6560 pid=29318 auid=1000 uid=1000 gid=1000 euid=1000 suid=1000 fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts1 ses=3 comm="okular" exe="/snap/okular/x22/usr/bin/okular" key=(null)

During the entire print operation this was in tmp:

me@ajax:/tmp$ while [ 1 ]; do ls ; done
072865b025ae5
072865b026753
072865b02cef3
072865b0bf2b9
072865b0c4da5
072865b0c549b

Additional info (snapd from snappy-dev/edge ppa; same problem with the 2.32.3.2 package in 16.04 though)

➜ snap version
snap    2.32.9+git724.fcca9f3~ubuntu16.04.1
snapd   2.32.9+git724.fcca9f3~ubuntu16.04.1
series  16
neon    16.04
kernel  4.13.0-41-generic

The two files in http://people.ubuntu.com/~apachelogger/snap/okular-seccomp+apparmor.tar

@sitter - the /tmp business is most likely because okular uses a per-snap mountspace (ie, per-snap /tmp) that is not in the same namespace as the thing it is trying to connect to. Without knowing the details of the printing system okular is trying to use, it appears that okular is talking to some printing-daemon thing, that thing says to use 'this MAP_SHARED fd I created with open("/tmp", O_TMPFILE...)", it tries to use it but the kernel notices that okular’s mount namespace is not the same as the printing-daemon thing, and things start to fall apart. Note, this is speculation but a place to start digging.

Is it possible to make okular and the ‘printing-daemon thing’ to use a different directory than /tmp? As a ‘quick test’, perhaps try /var/tmp then update /var/lib/snapd/apparmor/profiles/snap.okular.okular to have ‘owner /var/tmp/#* r,’ (then run ‘sudo apparmor_parser -r /var/lib/snapd/apparmor/profiles/snap.okular.okular’ - note you’ll have to re-add this rule after each reboot, refresh, install, etc (it’s only meant to be a data point to understand the issue)).

@thresh - that is interesting. Debian is supposed to use partial confinement so you shouldn’t be seeing this. Let’s go back to the other topic since this appears to be a different issue.

This should’ve been ‘owner /var/tmp/#* rwl,’

Well, the daemon is cups :wink: I’ve had a look and you are right, the tmpfiles seen by ls are simply for the communication with cups. So, entirely unrelated and can be ignored.

Anyway, I’ve had a bit of a dig in the Qt code and found this: The open and linkat are part of the “materializing” of the file as described here https://github.com/qt/qtbase/commit/189e9c93d7ed42202ad51507c8944d64e9a7888d

The actual Qt call chain looks something like this

The ultimate result ought be that there is only one temporary file inode, which is not actually stored under the fileName(), it only gets linkat'd to the fileName() location upon request.

For prosperity, I’ve also done a strace of what I am doing (starting okular with a PDF, and then trying to print that) and that contains the following:

8382  open("/tmp", O_RDWR|O_DIRECTORY|O_CLOEXEC|O_TMPFILE) = 15
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_DHqFKd.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_DaJTBL.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_QnvnbA.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_hHxoLk.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_qFjAEq.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_NzFMML.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_viduJN.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_iWdvZs.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_hYfBXW.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_TmhZqd.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_KGCKoJ.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_XPXyCC.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_HwUsXX.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_yxXeQg.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_kAnGtT.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)
8382  linkat(AT_FDCWD, "/proc/self/fd/15", AT_FDCWD, "/tmp/okular_MoDmhZ.ps", AT_SYMLINK_FOLLOW) = -1 ENOENT (No such file or directory)

So clearly open comes back with fd 15, yet linking fails. QTemporaryFile basically tries a bunch of names to recover from potential name clashes until it eventually gives up which is what then renders the temporary file handling broken as it failed to materialize the file.

Could it be that the hardlinking is indeed simply failing on the O_TMPFILE fd as not being accessible or something?

The good news is I now have a minimal example of the problem https://github.com/apachelogger/tmp-linkat

I can try the tmp directory change if you still think that’s useful, but I think it’s safe to say that the problem is more with the linking than anything else.

[full strace http://people.ubuntu.com/~apachelogger/snap/okular.trace.xz]

This is nice work, thanks! Let me dig in on your minimal example.

No, I confirmed this has nothing to do with it by modifying your program (this is Ubuntu 18.04):

$ test-tmp-linkat ~/snap/test-tmp-linkat/common/
prefix: /home/jamie/snap/test-tmp-linkat/common/
opened fd: 3 [/proc/self/fd/3]
writing works
linkat failed: No such file or directory

From journalctl:

May 18 12:22:05 localhost audit[7010]: AVC apparmor="DENIED" operation="link" info="Failed name lookup - deleted entry" error=-2 profile="snap.test-tmp-linkat.test-tmp-linkat" name="/home/jamie/snap/test-tmp-linkat/common/#12871268" pid=7010 comm="tmp-linkat" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000
May 18 12:22:05 localhost audit[7010]: AVC apparmor="DENIED" operation="link" profile="snap.test-tmp-linkat.test-tmp-linkat" name="/home/jamie/snap/test-tmp-linkat/common/okular_DHqFKd.ps" pid=7010 comm="tmp-linkat" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000 target="/home/jamie/snap/test-tmp-linkat/common/#12871268"
May 18 12:22:05 localhost kernel: audit: type=1400 audit(1526664125.788:107321): apparmor="DENIED" operation="link" info="Failed name lookup - deleted entry" error=-2 profile="snap.test-tmp-linkat.test-tmp-linkat" name="/home/jamie/snap/test-tmp-linkat/common/#12871268" pid=7010 comm="tmp-linkat" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000
May 18 12:22:05 localhost kernel: audit: type=1400 audit(1526664125.788:107322): apparmor="DENIED" operation="link" profile="snap.test-tmp-linkat.test-tmp-linkat" name="/home/jamie/snap/test-tmp-linkat/common/okular_DHqFKd.ps" pid=7010 comm="tmp-linkat" requested_mask="l" denied_mask="l" fsuid=1000 ouid=1000 target="/home/jamie/snap/test-tmp-linkat/common/#12871268"

I filed https://bugs.launchpad.net/apparmor/+bug/1772097 for this issue.

@sitter - the bug is around the fact that when attach_disconnected, we get an unresolvable denial when using linkat() in the manner Qt 5.10 and your simple reproducer uses. This affects all snaps (strict, devmode and classic) that have apparmor enabled at the kernel level (partial or strict confinement) since all snaps have some sort of apparmor profile that uses attach_disconnected. As such, there is no current workaround in snapd or apparmor that we can employ.

While apparmor upstream will look at this issue, once a fix is found it is not going to be able to quickly appear in all distros that snapd runs on. The only real option is to have Qt revert to its old behavior in some fashion in a point release of Qt 5.10 and later. I’m not sure if this is a compile time option or something that could be detected by Qt at runtime (eg, via an environment variable that snapd can set).

The impact could perhaps be substantially lessened if Qt has a patch that can be applied to the Ubuntu archive (for stage-packages) or to a snapcraft part that pulls from another PPA/similar; in this manner, most people that just want to consume Qt in their snaps can do so and have it just work.

Or perhaps just your content snap…

Looking at:

    if (!qt_haveLinuxProcfs())
        return CreateUnnamedFileStatus::NotSupported;

It would be easy enough to inject a function here like:

    if (getenv("SNAP") != NULL)
        return CreateUnnamedFileStatus::NotSupported;

This may not be what the best workaround should be, but I think a quick and dirty hack with something like the above would do the trick.

@jdstrand Thanks!

I’ve applied the workaround to our Qt and that indeed seems to have done the trick for the temporary files. In fact, right now I can’t see anything going massively wrong, so I am guessing that was the only problem with 5.10 \o/

For future reference https://packaging.neon.kde.org/qt/qtbase.git/commit/?h=Neon/release&id=13e45db62e9f7bcf9703858eb7d55953cdfb4c36

1 Like

Thanks for testing @sitter!

In the apparmor bug it was mentioned that the ‘mediate_deleted’ profile flag might workaround the issue. I’ve tested this and it in fact does. When ‘mediate_deleted’ is used, apparmor will use the last known name for the fd (eg, what the kernel shows in /proc/<pid>/fd/<fd> for the ‘(deleted)’ fd) and mediate that. As much as I don’t want to use mediate_deleted, I feel we need to considering that the open/unlink/linkat method is documented in ‘man 2 open’ and other snaps (eg, vlc, see Qt5 KDE integration in VLC snap is broken due to linkat() denial) are using it (at least until a proper upstream fix is found and used everywhere). While this makes this open/unlink/linkat technique work, it does mean that a process from the snap will be able to open deleted files with open fds via the /proc interface. We could limit the scope of this somewhat by preserving the current default but somehow providing a mechanism for the snap developer to opt into using mediate_deleted.

https://github.com/snapcore/snapd/pull/5194

@sitter - when the above is in a stable snapd release (likely 2.33) then you can drop your patch to Qt 5.10.

Yay. Thanks :slight_smile: