Android support in snapd

+1 to @ogra’s proposal. Seems to me the simpler and more practical approach.

Although I wonder how easy it would be to make the boot image a second-stage bootloader, so we would never have to change it, as the kernel would be part of the system image. The Android boot image “format” defines where a second-stage bootloader would reside, although I doubt that has ever seen much use.

What exactly would this mean? Does this mean a chain of 2 bootloaders, can you explain in some detail what this would look like?

Yes, I was thinking about chaining bootloaders and having the kernel in the system image. But it is more of a thought for the future than something we would do now.

chainloading is a nice idea, but has some drawbacks:

  • slower boot, but this might only be in the realm of seconds
  • potentially not carrying over devicetree data
  • potentially breaking device-tree overlay handling (if that’s even a thing at all on android based devices)

i’m not much concerned about the speed issue but the other two could bite badly. do we know if there are any android style boards/devices that require dtb overlays to be loaded dynamically on boot?

also note that an android boot actually usually chainloads by default:
ROM->lk->boot.img

so technically adding one more step to the chain should indeed not be an issue. we actually use that on the dragonboard images where we do:
ROM->lk->uboot.img

…to then just load vmlinuz/initrd.img

I propose to change the name we use in snapd to “aboot” or “lk”, as “fastboot” is not really the name of the bootloader.

1 Like

Indeed, boot time is not a big issue with chain loading.
It’s more a question of what we chain-load. Finding a working u-boot might be a challenge for some Android platforms.
Android devices do use dtbs, and that’s where it gets complicated: some devices have the dtb packed in boot.img, while some have it in a separate partition. So again it is more trouble to handle a dtb residing in a different partition. If I remember correctly, we saw Android devices that had a dtb partition used by both boot and recovery, or even devices where the kernel had its own partition reused by boot and recovery. System LSI usually does those weird setups. I’d be in favour of not supporting those special cases for now.

Hopefully before long we will not need any of this, as we will get better support from the Android bootloader with Android 7 adoption. Then we will have two boot images to alternate between, plus recovery.

1 Like

We can’t predict which bootloader is used on those devices. aboot, lk or u-boot, all are possible variants which have implementations of the fastboot protocol. The only thing all have in common is the protocol. I am fine giving this a different name but we should keep the actual bootloader names out of the picture.

We are not using the fastboot protocol at all here, in any case. That would only happen if we were flashing the device externally.

I like aboot now more than lk, as a short name for “android style boot”, as what we really require for this design is not a concrete bootloader, but having boot/recovery/system partitions and some android kernel patches.

True and that is why I am fine with giving this a different name than “fastboot”.

That is problematic. aboot (the very first Android bootloader, used on the G1) refers to a specific bootloader implementation, and lk (which most vendors fork and base their bootloader on) does too. Both names are therefore not applicable to a common implementation of an Android-style boot, although http://newandroidbook.com/Articles/aboot.html refers to an lk-based implementation of aboot. There are also other implementations, such as Samsung’s. This makes it really hard for us to figure out a good name. The most common terms are “boot.img” and “android”. The boot.img is what we process and then dd to a specific partition. What about simply calling it “android” or “android-boot”?

Let’s expand aboot to android-boot. Confusion with something else would probably not be such a big deal, but it will be good to be more descriptive.

3 Likes

I have started to implement the snapd changes for this (starting from initial code from Simon). We need two things there:

  1. A new bootloader. I have chosen to have a very simple configuration file that contains the variables (snap_mode, snap_try_{kernel,core}, snap_{kernel,core}). This configuration file will be written in a folder where the recovery partition is mounted, so it can be accessed after booting to recovery (similar to the u-boot way). There is no need to modify boot.img from Core, and we can include the abootimg binary only in recovery.
  2. A way to reboot to recovery when the kernel or core is refreshed. This is done in Android by passing “recovery” as an argument to the reboot syscall, which can also be done from the command line. In Touch, after an OTA, the system was rebooted with the command “/sbin/reboot -f recovery”, which immediately rebooted the system. What snapd does is run “shutdown +10 -r”. I think an argument can be added there too. But @pedronis said this was going to be changed. In what way?
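To make the first point concrete, here is a sketch of what such a configuration file and its accessors could look like. The file name, location, and helper names are hypothetical; only the snap_* variables come from the proposal above, and the helpers just mimic what u-boot's fw_printenv/fw_setenv do for its environment:

```shell
# Hypothetical bootloader configuration file; on the device it would live
# under the mounted recovery partition rather than the current directory.
BOOTENV=./snapbootsel.txt

# seed the file with the proposed snap_* variables (values are examples)
cat > "$BOOTENV" <<'EOF'
snap_mode=try
snap_kernel=pc-kernel_100.snap
snap_try_kernel=pc-kernel_101.snap
snap_core=core_200.snap
snap_try_core=
EOF

# read a variable, roughly what u-boot's fw_printenv does
get_var() { sed -n "s/^$1=//p" "$BOOTENV"; }

# set a variable, roughly what fw_setenv does
set_var() { sed -i "s/^$1=.*/$1=$2/" "$BOOTENV"; }

set_var snap_mode trying
echo "mode=$(get_var snap_mode) try_kernel=$(get_var snap_try_kernel)"
```

Keeping the format to plain key=value lines means both snapd and a busybox-style recovery environment can read it with nothing more than sed/grep.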
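For the second point, the reboot-to-recovery step could be wrapped like this. The wrapper name is made up, and the reboot command is injectable so the sketch can run without actually rebooting anything; a real system would call /sbin/reboot directly:

```shell
# Hypothetical wrapper for the reboot-to-recovery step. As described above,
# Android implements this by passing "recovery" as the argument to the
# reboot(2) syscall, which the bootloader inspects on the next boot.
# REBOOT_CMD defaults to a dry-run echo so this sketch is side-effect free.
REBOOT_CMD="${REBOOT_CMD:-echo /sbin/reboot}"

reboot_to_recovery() {
    $REBOOT_CMD -f recovery
}

reboot_to_recovery
```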
1 Like

mostly moved to a different place in daemon.go; anyway, given that this involves a new bootloader, what we could do from that new place is invoke a Reboot method added to the bootloader interface

2 Likes

we will also need:

  • a patch for the kernel to go to recovery in all panic situations
  • porting of the update/rollback script logic to recovery (preferably by just having additional scripts in initramfs-tools-ubuntu-core and generating an additional recovery initrd during initrd creation that a recovery.img can consume)
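As a rough illustration of the logic such recovery-side scripts would need to carry over, here is the try/trying decision in shell. The function name and output format are invented for the example; only the snap_mode semantics follow the existing u-boot/grub behaviour:

```shell
# Sketch of the rollback decision ported from the u-boot/grub scripts:
# "try"    -> first boot of a new kernel, mark it as being tried
# "trying" -> we rebooted while still trying, so the new kernel never
#             confirmed success: roll back to the known-good kernel
choose_kernel() {
    mode="$1" kernel="$2" try_kernel="$3"
    case "$mode" in
        try)
            echo "snap_mode=trying boot=$try_kernel"
            ;;
        trying)
            echo "snap_mode= boot=$kernel"
            ;;
        *)
            echo "snap_mode=$mode boot=$kernel"
            ;;
    esac
}

choose_kernel try pc-kernel_100.snap pc-kernel_101.snap
choose_kernel trying pc-kernel_100.snap pc-kernel_101.snap
```

On a successful boot of the tried kernel, snapd (not the initramfs) would then clear snap_mode and promote snap_try_kernel to snap_kernel.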
2 Likes

This idea came to me this weekend (thanks @ogra for the discussion on Friday):

What if, instead of treating the boot partition as where the “normal” kernel resides and the recovery partition as where we reboot when there are pending upgrades, we always reboot to recovery from the boot partition’s initramfs scripts? So we would always follow this sequence, whether powering on or rebooting:

boot → recovery → userdata

The boot partition would contain the scripts for upgrading the kernel/core snaps, and would run them if needed. Regardless of whether there is an upgrade or not, it would then do a “reboot recovery”.

The recovery partition would do the normal boot process, starting systemd init in the userdata partition.

In effect, the boot partition would be a second-stage bootloader, with exactly the same functionality we have for u-boot and grub. Cases like refreshing the core snap, then powering off instead of rebooting the device, and then powering on, would work as smoothly as with u-boot/grub.

When upgrading the kernel snap, the partition to refresh would be recovery, and we would not touch the boot partition.

The obvious drawback of all this is a longer boot time, as we go through bootloader + kernel load + initramfs twice. This takes around 30 additional seconds. It is noticeable, but we can accept it for this sort of IoT device, which does not reboot that often.

The advantage is a sane upgrade process, fulfilling snap promises :wink:

Wdyt?
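The proposed flow can be sketched as below, with each stage's action stubbed out as a printed string. Partition names and commands are illustrative; real initramfs scripts would of course perform the actions rather than print them:

```shell
# Sketch of the always-via-recovery sequence: boot -> recovery -> userdata.
boot_stage() {
    case "$1" in
        boot)
            # apply any pending kernel/core refresh here, then always
            # chain into the recovery partition
            echo "reboot recovery"
            ;;
        recovery)
            # normal boot path: hand over to systemd on the writable partition
            echo "switch_root /writable /sbin/init"
            ;;
    esac
}

boot_stage boot
boot_stage recovery
```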

3 Likes

this is a really beautiful idea and goes hand in hand with how i imagined always booting to recovery … switching the partitions instead is indeed brilliant and saves a lot of code changes.

while the additional reboot indeed costs time, i think we can get far below the 30sec here; the kernel we use as second-stage bootloader only needs enough enabled to find the other partition, so it can be very tiny, and the same goes for the initrd script that holds the rollback logic … my gut feeling is that we could come out below 10sec with this.

Adding support to klibc’s reboot command is done: https://bugs.launchpad.net/ubuntu/+source/klibc/+bug/1692494

Alright, I’d like to understand this a bit better from Alfonso. I’ll hold a separate live conversation so I can learn more details. I’m good with the overall concept, but I do think it’s important to keep the boot/reboot time as small as possible. There are still many use cases where certain IoT boards won’t be battery-backed, so on a brownout or a short power loss it’d be very nice for the time to be as short as possible, so that the service interruption isn’t very noticeable.

Ok, I better understand this now and am on board with it.

This sounds pretty interesting. It’s slightly surprising to get 30 additional seconds just because we’re going through another kernel, though. Why is it taking so long to simply chainload into another kernel?

this really depends on the type of bootloader in use and how much time it takes before loading the boot.img partition. we can definitely optimize the whole kernel/initrd side down to a few seconds, but a reboot includes bits we don’t control (the bootloader itself), and that adds to the process.

with this setup you simply add one extra reboot call every time you boot.

the robustness of the functionality gained by using such a simplified setup pays off though and is worth every extra second it adds IMHO.