Easy way to debug UC20 install mode (by modifying kernel command line)

ijohnson · September 15, 2021, 4:50pm

The question of how to debug things going wrong in UC20 install mode has come up before, and I think I’ve found an easy way to get a debug shell on a device without modifying anything too drastically. All you really need to do is modify the gadget snap by adding (maybe) two files:

an install-device hook that sleeps forever and never exits
cmdline.full to modify the kernel command line. Note that this file is only used on amd64/grub devices; if you are on an ARM platform you will have to modify the kernel command line yourself some other way

And then rebuild the gadget snap, rebuild your image with the new gadget snap and flash it to the device.

We are going to use two tricks here. First, the install-device hook blocks forever, which covers the case where install mode would otherwise continue successfully but we still want to poke around the system in this state (for example, when you are debugging your actual install-device hook and want a real shell to run things in, rather than modifying the hook, re-flashing, waiting, etc.). The second trick is to use systemd’s kernel command line interface to disable the service which takes over a given TTY and instead put a debug shell on that TTY, so we are automatically logged in as root there. Obviously this is only for development and should never be left enabled in production.

So to set this up, use something like the following for your install-device hook:

#!/bin/sh

sleep infinity

and then put this into the cmdline.full file for your gadget snap (or otherwise modify the kernel command line so that it contains at least these things):

# this is the first serial console, pick a suitable one for your device
console=ttyS0 

# debug things to get more output from snapd
snapd.debug=1

# these will output more device status/logging to our console,
# which is kinda annoying in that it goes to the same place we 
# have our shell, but I think it's more useful to be able to see 
# than have a perfectly pristine console, if you have multiple
# consoles you could of course use a different console for
# systemd output and your debug shell
rd.systemd.journald.forward_to_console=1
systemd.journald.forward_to_console=1

# disable console-conf from running on ttyS0 so we can set up
# our debug shell there; this is the key part, since without it
# console-conf will kill our debug shell or prevent it from starting
systemd.mask=serial-getty@ttyS0.service

# create our debug shell on ttyS0
systemd.debug-shell=ttyS0

# needed in order to allow the debug-shell to run on UC20
dangerous

# standard Ubuntu Core kernel command line parameters
panic=-1
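
For platforms where cmdline.full is not used and the command line has to be set by hand, it can help to turn the commented listing above into a single line. The following is only a rough sketch: `flatten_cmdline` is a made-up helper name, and it assumes comments are whole lines starting with `#`, as in the listing above.

```shell
#!/bin/sh
# flatten_cmdline FILE: print the parameters in FILE on one line,
# skipping comment lines and blank lines.
flatten_cmdline() {
    awk '
        # skip lines whose first non-blank character is "#"
        /^[ \t]*#/ { next }
        # print every whitespace-separated parameter, joined by single spaces
        { for (i = 1; i <= NF; i++) { printf "%s%s", sep, $i; sep = " " } }
        END { printf "\n" }
    ' "$1"
}
```

Running `flatten_cmdline cmdline.full` then prints something you can paste into whatever mechanism your bootloader uses for kernel arguments.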

The reason we use cmdline.full instead of cmdline.extra is that we want to change the default console settings (at least on amd64 in the default pc gadget).
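
As a rough sketch of the gadget changes, assuming you work on an unpacked gadget snap (for example one extracted with `unsquashfs`), the two files could be dropped in like this. The `GADGET_DIR` variable and the `./pc-gadget` path are placeholders I made up, and note that this overwrites any install-device hook already present in the gadget.

```shell
#!/bin/sh
set -eu

# Placeholder path to an unpacked gadget snap; adjust for your setup.
GADGET_DIR="${GADGET_DIR:-./pc-gadget}"

# Hooks in a built snap live under meta/hooks/.
mkdir -p "$GADGET_DIR/meta/hooks"
cat > "$GADGET_DIR/meta/hooks/install-device" <<'EOF'
#!/bin/sh

sleep infinity
EOF
chmod +x "$GADGET_DIR/meta/hooks/install-device"

# cmdline.full sits at the top level of the gadget snap.
# Same parameters as the listing above, without the comments.
cat > "$GADGET_DIR/cmdline.full" <<'EOF'
console=ttyS0
snapd.debug=1
rd.systemd.journald.forward_to_console=1
systemd.journald.forward_to_console=1
systemd.mask=serial-getty@ttyS0.service
systemd.debug-shell=ttyS0
dangerous
panic=-1
EOF

# Next steps, not run here (file names are placeholders):
#   snap pack "$GADGET_DIR"
#   ubuntu-image snap my-model.assert --snap ./pc_*.snap
```

The packing and image-building commands are left as comments because the exact invocation depends on your model assertion and tooling version.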

After booting your device, it will be stuck in install mode (either because install mode fails, for example due to a gadget.yaml error, or because it is stuck running the install-device hook forever), and you will have a root shell you can use for debugging. All you had to modify was the kernel command line and the gadget snap.
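
If the shell does not show up, one quick sanity check is whether the parameters actually made it onto the kernel command line. A minimal sketch, where `cmdline_has` and `check_debug_params` are hypothetical helper names and the parameter list assumes the ttyS0 example above:

```shell
#!/bin/sh
# cmdline_has CMDLINE PARAM: succeed if PARAM appears as a whole,
# space-separated word in CMDLINE.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_debug_params CMDLINE: report each parameter the debug shell
# setup relies on as present or missing.
check_debug_params() {
    for param in dangerous systemd.debug-shell=ttyS0 \
                 systemd.mask=serial-getty@ttyS0.service; do
        if cmdline_has "$1" "$param"; then
            echo "ok:      $param"
        else
            echo "missing: $param"
        fi
    done
}
```

On the device (or from any other console you can reach) this would be called as `check_debug_params "$(cat /proc/cmdline)"`.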

Hope this helps.

Thanks,
Ian

mborzecki · September 16, 2021, 6:37am

IIRC you also need to pass dangerous on the command line for the systemd.debug-shell unit to be activated.

ijohnson · September 16, 2021, 1:54pm

Ah yes, you’re right. I thought that was just for the initrd, but it is true for userspace as well. Updated the doc.