Testing UC20 with FDE and TPM in QEMU

Hi,

I’ve been trying to run UC20 images in secured grade, using QEMU. I got secure boot to work, then looked at adding a TPM device. To create the swtpm socket device, I ran swtpm in a docker container, which seemed to basically work.
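For anyone else setting this up: the container isn’t strictly needed, swtpm can also be run directly on the host. A minimal sketch (the state directory here is just an example; it must match the chardev socket path passed to QEMU later):

```shell
# State directory for the emulated TPM; the control socket created
# in here is what QEMU's "-chardev socket,...,path=" will point at.
TPMSTATE_DIR="$HOME/tmp/tpm"
mkdir -p "$TPMSTATE_DIR"

# --tpm2 selects TPM 2.0 emulation (which UC20 FDE requires);
# --daemon backgrounds the process once the socket is ready.
swtpm socket \
    --tpm2 \
    --tpmstate dir="$TPMSTATE_DIR" \
    --ctrl type=unixio,path="$TPMSTATE_DIR/swtpm-sock" \
    --daemon
```

This is a command fragment rather than a full script, so adjust paths to taste.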

I then start the qemu process with all the various parameters required, and it seems to go fairly well: it runs through initial setup, encrypts the disks, etc. I’m using the following command to start the process:

qemu-system-x86_64 -smp 1 -m 2048 \
 -machine q35,smm=on,accel=kvm \
 -serial file:serial.log \
 -net nic,model=virtio -net user,hostfwd=tcp::8022-:22 \
 -global driver=cfi.pflash01,property=secure,value=on, \
 -drive if=pflash,format=raw,unit=0,readonly=on,file=/usr/share/OVMF/OVMF_CODE_4M.secboot.fd \
 -drive if=pflash,format=raw,unit=1,readonly=on,file=/usr/share/OVMF/OVMF_VARS_4M.ms.fd \
 -drive file=${IMG},cache=none,format=raw,id=disk1,if=none \
 -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0 \
 -chardev socket,id=chrtpm,path="${HOME}/tmp/tpm/swtpm-sock" \
 -tpmdev emulator,id=tpm0,chardev=chrtpm \
 -device tpm-tis,tpmdev=tpm0 \
 -device virtio-blk-pci,drive=disk1,bootindex=1 \
 -device virtio-vga,virgl=on,disable-legacy=off,disable-modern=on

But when it then reboots, it fails to decrypt the data volume. I get the following error:

[   92.042410] the-tool[194]: error: cannot activate encrypted device "/dev/disk/by-partuuid/b992f75e-ac04-3d4a-b9cc-06bf1063a30b": cannot activate with TPM sealed key (cannot unseal key: invalid key data file: cannot complete authorization policy assertions: cannot complete OR assertions: current session digest not found in policy data) and activation with recovery key failed (cannot obtain recovery key: /usr/sbin/systemd-ask-password failed: exit status 1)
[FAILED] Failed to start the-tool.service.

In case the problem was the swtpm implementation I was using, after some googling I tried the swtpm-mvo snap, and also upgraded my stock Ubuntu Focal version of qemu from 4.x to 6.x; always the same result.

I notice that entropy seems to be very slow to generate, which is why I added the rng-random qemu params above, but it didn’t seem to help that much. I kind of doubt that is the cause of the issue, though.

Does anyone have any idea what may be causing this? Or, how can I dig into the issue a bit more?

Cheers, Just

@ijohnson Sorry to tag you directly, but during my scouring of the interwebs I noticed you have been using swtpm and QEMU for Ubuntu Core testing :slight_smile: [ https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1903864/comments/29 ]

I wonder if you can offer any insight into the above issue, where volumes are encrypted on install but not decrypted on reboot? Is there anything I’m missing, or anything you’d suggest I try?

Being able to test our in-house images virtually with FDE would be awesome.

Cheers, Just

What version of snaps are you using to build the image? We do this exact test in our spread tests in the snapd repo, so it definitely does work to boot secured-grade images via QEMU with swtpm + secure boot enabled, but indeed getting all the options correct in QEMU is non-obvious, to say the least. I don’t see anything in the options you are specifying that would be wrong, but for the record this is the set of options I use locally with QEMU and swtpm:

qemu-system-x86_64 \
    -enable-kvm \
    -smp 1 \
    -m 2048 \
    -machine q35 \
    -cpu host \
    -global ICH9-LPC.disable_s3=1 \
    -netdev user,id=mynet0,hostfwd=tcp::8027-:22 \
    -drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on \
    -drive file="$SCRIPT_DIR/OVMF_VARS.ms.fd",if=pflash,format=raw,unit=1 \
    -chardev socket,id=chrtpm,path="/var/snap/swtpm-mvo/current/swtpm-sock" \
    -tpmdev emulator,id=tpm0,chardev=chrtpm \
    -device tpm-tis,tpmdev=tpm0 \
    -device virtio-net-pci,netdev=mynet0 \
    -drive "file=uc20.img",if=none,format=raw,id=disk1 \
    -device virtio-blk-pci,drive=disk1,bootindex=1 \
    -serial mon:stdio

where OVMF_VARS.ms.fd is a local copy of the one from /usr/share/OVMF, so that it can be edited locally.

I presume you are referring to the snaps used in the image? It’s the stable versions of snapd, core20, and pc-kernel. The gadget is custom, but closely based on the reference pc gadget, and it all works on a physical device.

Thanks for the reference config @ijohnson, I will try that verbatim just to be sure I’m not missing something :+1:

Cheers, Just

Did you make sure that the swtpm is reset? The permall file may need to be deleted with something like

sudo rm "/var/snap/swtpm-mvo/current/tpm2-00.permall"

Hmm, very curious: if I use your reference above @ijohnson, it works! Thank you for posting it :+1:

Now I just have to find the difference. I will post back when I find it, as others may hit this issue too, and it could be interesting :slight_smile:

Cheers, Just

Looking back at your set of options, what does this do:

 -global driver=cfi.pflash01,property=secure,value=on, \

?

I think I got that from an older post on the qemu mailing list, after initial trouble trying to get secure boot working. I’ve now removed it, as it doesn’t seem to be required.

I found the difference, although I’m not sure why it matters.

I was using the 4M version of the OVMF files. I was using those because I had trouble getting the non-4M versions working; the QEMU BIOS initialisation seemed to hang. I switched to 4M and it worked [well, apart from the issue I posted about initially; it was working for non-secured, i.e. signed, grade].

It seems that to get the non-4M version working I had to add -global ICH9-LPC.disable_s3=1; then the secured image boots and sets up the encrypted disks, and on reboot it is able to unseal the key and access the disks properly.

The question is, why did the 4M files break that part? I was under the impression they related somehow to block size, but I’m not sure how that would make any difference at all to unsealing the key from the TPM? If anyone knows anything about that, I’d love to know.

Now onto the next problem: although initial setup works now, after the first reboot into the proper OS I found that issuing an OS reboot often hangs, and qemu spins at 100% CPU. It’s not 100% consistent, and sometimes works. It’s strange, because the first reboot after setup always seems to work, and shutdown works fine every time. I hadn’t noticed it before when not using FDE.

Have you noticed that before, @ijohnson?

Cheers, Just

BTW - I’ve been using ovmf-2021.08~rc0-2~backport20.04-202111041420~ubuntu20.04.1 , and qemu-6.0+dfsg-2expubuntu1.1~backport20.04-202111060014~ubuntu20.04.1

Cheers, Just

Hmm, no, I have not noticed that sort of issue, but we do see random hangs with our nested VMs running UC20 on Google Compute Engine as part of our spread tests, which we have never been able to identify the cause of; our leading guess is hypervisor bugs in the L1 VM in GCE (in this case UC20 is an L2 VM). These VMs do use encryption with swtpm, etc. as well.

I played with lots of different qemu options and driver params. I even downgraded qemu to the stock Focal version, 4.2; no joy, still the same hang on reboot.

Eventually I downgraded ovmf to the stock Focal version 0~20191122.bd85bf54-2ubuntu3.3, and that seemed to make the difference.

I finally seem to have a good working recipe now :+1:

Thanks for the pointers along the way @ijohnson :slight_smile:

Cheers, Just
