Robust A/B Bootloader&Recovery Installs and Updates (to interoperable, hardware separated eMMC boot partitions)
This is a newly researched issue for implementing an A/B Bootloader&Recovery scheme on the Librem 5.
I'm filing it after working through the old devkit issue librem5-devkit-tools#10 "Place u-boot in the boot MMC area" (which could be closed in favor of this one) and its related work, which was never merged, and distilling all of it into this overhauled summary.
Enabling it is not completely trivial, but it only takes a short list of fairly simple changes to make everything interact properly.
By now, there doesn't seem to be anything missing in mainline u-boot to go forward with using the hardware partitions for an A/B update scheme (presentation linked below).
[Adding @sebastian.krzyszkowiak here, as author of the most current attempt to start installing the bootloader into a boot1/boot2 hardware partition (librem5-flash-image!24)]
[Adding @guido.gunther and @dorota.czaplejewicz, who seem to be the only remaining active accounts from librem5-devkit-tools#10]
Goals
Features:
- robustness of u-boot updates (i.e. flashed to the currently not-booted hardware boot partition, only activated upon write verification),
- u-boot fallback options (the previously booting hardware boot partition remains unchanged, only inactivated),
- allows for robust installation of an integrated jumpdrive recovery option (into a hardware boot partition), which in turn
- allows for the most basic security measure, i.e. "production images" that have the uuu access disabled by default (without losing all fallback/recovery options),
- allows using a GPT partition table (librem5-devkit-tools#7), and thus, finally:
- allows for generic OS installations, i.e. lets u-boot boot generic distros (e.g. using a /boot partition, and a plain rootfs with /vmlinuz + /initrd.img links, possibly (U)EFI etc.), making the device accessible to more distro users and customers.
Definitions and Findings
"A/B Updates" safely update a currently unused partition before "atomically" switching over to start using the updated one, while retaining the possibility to switch back to the previous version.
The A/B switching scheme is well supported by the "hardware partitions" feature of the established eMMC standard (these exist separately from the usual OS-accessible, software-partitionable block device that an eMMC provides).
"A/B System Image Updates" are based on partitions that contain and switch between complete OS images.
For an open device, however, that can not only run different operating systems but is also usually updated by package managers, which allow fine-grained switching between multiple kernel versions as well as package versions in general, the basic eMMC features may be better used to implement...
"A/B Bootloader&Recovery Updates". These provide robust support for the hardware specific booting parts, and a recovery tool that allows direct access to the regular eMMC block device (OS partitions) for (re)installs, backups, fixing misconfiguration etc. (i.e. "jumpdrive").
Operating systems that use package manager based updates won't need any further A/B partitions. Others may still use the A/B scheme based on the OS managed partition table, which can be created and adjusted while staying interoperable with other distributions, instead of resetting the eMMC with resized hardware partitions.
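To make the A/B definition above concrete, here is a minimal toy model of the switching logic. It is only a sketch: the class and method names are made up for illustration, the "slots" are plain byte strings in memory, and nothing in it touches an eMMC.

```python
from dataclasses import dataclass, field


@dataclass
class ABSlots:
    """Toy model of two slots; `active` names the slot the device boots from."""
    images: dict = field(default_factory=lambda: {"A": b"", "B": b""})
    active: str = "A"

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def write(self, slot: str, data: bytes) -> None:
        # Stand-in for the real flashing step (hypothetical; no hardware access).
        self.images[slot] = data

    def update(self, new_image: bytes) -> bool:
        target = self.inactive()
        self.write(target, new_image)          # only the unused slot is touched
        if self.images[target] != new_image:   # read back and verify
            return False                       # active slot stays as it was
        self.active = target                   # the single switch-over step
        return True

    def rollback(self) -> None:
        # The previous slot was never modified, so switching back is enough.
        self.active = self.inactive()
```

The point of the model is that `update` never writes to the slot that is currently booting, so a failed or interrupted write leaves the working version in place, and `rollback` only has to flip the active marker.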
Implementation
The layout for the hardware boot partition would be a partition table plus an efficient, no-journal, noatime filesystem for the recovery system. The filesystem partition sits between the area at the beginning of the hardware partition, where the u-boot binary gets installed, and the u-boot environment, which is placed at the end. This likely means using an MBR partition table, as it only takes up little space at the start and keeps no copy at the end of the partition.
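The layout described above can be checked with a few lines of arithmetic. The following sketch computes the byte ranges for the MBR, the u-boot area, the recovery filesystem and the environment. The function name is hypothetical, and all sizes and offsets in the example call are assumptions chosen for illustration; the real values depend on the eMMC, the SoC boot ROM and the u-boot configuration.

```python
SECTOR = 512  # bytes per logical sector


def plan_boot_partition(total, uboot_offset, uboot_reserved, env_size, align=4096):
    """Return byte ranges (start, end-exclusive) for each region of one
    hardware boot partition. All sizes are in bytes."""
    mbr = (0, SECTOR)                             # the MBR only uses sector 0
    uboot = (uboot_offset, uboot_offset + uboot_reserved)
    env = (total - env_size, total)               # environment at the very end
    fs_start = -(-uboot[1] // align) * align      # round up to the alignment
    fs_end = env[0] // align * align              # round down to the alignment
    if uboot[0] < mbr[1]:
        raise ValueError("u-boot area would overlap the MBR sector")
    if fs_start >= fs_end:
        raise ValueError("no room left for the recovery filesystem")
    return {"mbr": mbr, "uboot": uboot, "fs": (fs_start, fs_end), "env": env}


# Example values only; none of them are taken from Librem 5 documentation.
layout = plan_boot_partition(
    total=4 * 1024 * 1024,           # assumed 4 MiB hardware boot partition
    uboot_offset=33 * 1024,          # assumed offset of the u-boot binary
    uboot_reserved=2 * 1024 * 1024,  # assumed space reserved for u-boot
    env_size=16 * 1024,              # assumed size of the u-boot environment
)
for name, (start, end) in layout.items():
    print(f"{name:6s} sectors {start // SECTOR}..{end // SECTOR - 1}")
```

The first check also shows why the layout depends on the boot ROM: if the u-boot binary had to start at offset 0, it would collide with the MBR sector and a different scheme would be needed.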
- For robust updates it is only required for all "flashing" procedures to always copy the entire active partition over to the inactive one before applying their changes, and to call for a reboot (ideally by flagging the system). The reason is that, without a reboot, changes made by one procedure (u-boot, u-boot environment, jumpdrive) would be overwritten by any other for as long as "unbooted changes" exist in the system (i.e. it is flagged for reboot). The flag could be implicit, by updating a creation date or sequence number when copying the active boot partition over to the inactive one (i.e. reboot flag = seeing a newer inactive partition).
  [EDIT: this replaces the earlier proposal: one package must provide something like a central flag to switch the active partition only once, e.g. during shutdown. This allows all "flashing" procedures to remove the flag if present before starting to write (invalidating prior verifications) and to set this flag after verifying their applied writes. Using a flag instead of directly changing the active partition prevents subsequent flashing procedures from writing to the wrong partition because the active partition has changed in the meantime. The package providing the flag should also store which partition was active during boot, so it can notice and report if the active partition has already been insecurely changed directly (risk of incomplete or conflicting changes).]
- The uuu flash tool must only write to the inactive hardware boot partition, and activate it afterwards, to prevent all bricking risks when flashing "production" images (which would have uuu access disabled by default) in case flashing gets interrupted. For a "complete factory-reset reflash" the u-boot environment area must be cleared, if it is not contained in the image.
- The u-boot .deb package must use the inactive hardware partition's Linux device to "flash" and verify directly to it, after copying over the entire currently active boot partition [EDIT: previously "including a copy of the u-boot environment from the active partition"]. The eMMC hardware partition write access is described at https://docs.kernel.org/driver-api/mmc/mmc-dev-parts.html; it is best to have the hardware partitions writable only in single-user mode, and otherwise always set them "write protected until re-boot" during boot. If u-boot environment updates are applied as well, ideally show all "conf-file" additions/removals or conflicts to the user, or at least print the diff to syslog to ease fixing potential errors.
- The saveenv command in u-boot, and its corresponding fw_saveenv build in the u-boot-tools package, must copy the entire boot partition to the inactive one and save the modified environment there, so that falling back to a working pre-change u-boot environment is always possible.
- A jumpdrive package could "flash" by installing to the mounted filesystem on the inactive hardware partition (making it writable during the install), but must consider that versions in the hardware partition may also have been installed by other OSes or by other means.
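The common core of the procedures listed above (copy the active partition over, apply one change, verify, and only then allow activation) can be sketched as follows. This is a minimal sketch under stated assumptions, not a finished tool: the function name and its parameters are hypothetical, it works on any two paths so it can be tried on plain image files, and it deliberately leaves out the steps that need real hardware, namely lifting the write protection via the `force_ro` sysfs attribute described in the kernel document linked above, and switching the active boot partition (which could presumably be done with `mmc bootpart enable` from mmc-utils or from within u-boot).

```python
import hashlib
import os


def flash_inactive(active_path, inactive_path, offset, payload):
    """Copy the active image to the inactive one, patch `payload` in at
    `offset`, then read back and verify. Returns True only if the inactive
    image matches what was intended, i.e. if it may be activated."""
    with open(active_path, "rb") as src:
        image = bytearray(src.read())           # boot partitions are small
    if offset < 0 or offset + len(payload) > len(image):
        raise ValueError("payload does not fit into the partition")
    image[offset:offset + len(payload)] = payload   # apply the single change
    expected = hashlib.sha256(image).hexdigest()

    with open(inactive_path, "r+b") as dst:     # the active path is never written
        dst.write(image)
        dst.flush()
        os.fsync(dst.fileno())                  # push the data out before verifying

    # Note: on a real block device this read may be served from the page
    # cache; a real tool would likely have to bypass it for the read-back.
    with open(inactive_path, "rb") as chk:
        actual = hashlib.sha256(chk.read(len(image))).hexdigest()
    return actual == expected
```

Because every caller starts from a fresh copy of the active partition, two such calls without a reboot in between would discard the first caller's change, which is exactly why the first point above asks for a reboot flag.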
References
There exists a presentation of the u-boot features for A/B updates, although in the context of complete OS image updates, and not going into using eMMC hardware partitions beyond giving one reference: https://bootlin.com/pub/conferences/2022/elce/opdenacker-implementing-A-B-system-updates-with-u-boot/opdenacker-implementing-A-B-system-updates-with-u-boot.pdf
u-boot mainline way to access eMMC hardware boot partitions: https://narkive.com/rwJ8voQj.13
u-boot security: https://www.timesys.com/security/securing-u-boot-a-guide-to-mitigating-common-attack-vectors/
general eMMC info: https://www.embeddedartists.com/wp-content/uploads/2020/04/Working_with_eMMC.pdf