I didn't say fail … the hardware watchdog checks for responsiveness, not for failures. A single-core CPU like the ones you find in many routers, with little RAM and no swap, can easily end up in stalled IO simply because you cannot properly multithread if your CPU does not allow it … a HW watchdog checks whether the system still responds and, if not, force-reboots it. If you install a postgres-backed snap service on a BeagleBone and the amount of data gets too big for the hardware to stay responsive, the watchdog will take care of it for you. This has nothing to do with the kernel or the rootfs; it is a limitation of the hardware you use, and no rollback will fix the unresponsiveness.
A remote system that you cannot reach to reset manually needs to be able to reset itself, so that you can still reach it over the network to debug, analyze and fix it. Even if there was no kernel oops, OOM or whatnot, you do not want this system to stay unresponsive forever due to saturation.
There is no DB in a bootloader, and I was not referring to this at all …
Also, I think @ppisati has put some time into enabling watchdog features in u-boot (though I might admittedly misremember this), and as long as the config option is enabled in u-boot it will do exactly this … reboot the device after a kernel hang.
But in any case, the original request we are discussing here is for the HW watchdog (/dev/watchdog) and enabling it in userspace, which we currently do not support at all. I don't really get what kernel or core rollbacks would achieve here when all you are after is recovering a saturated system.
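To make it a bit clearer what "enabling it in userspace" means, here is a rough sketch of a watchdog feeder, assuming the standard Linux /dev/watchdog behaviour: opening the device arms the timer, any write resets it, and writing the character `V` right before closing disarms it again (only on drivers that support "magic close" and only if the kernel was not built with the nowayout option). The function name `feed_watchdog`, its parameters and the `is_healthy` hook are made up for illustration; this is not something snapd or Ubuntu Core ships.

```python
import os
import time

WATCHDOG_DEVICE = "/dev/watchdog"  # standard Linux watchdog device node


def feed_watchdog(device=WATCHDOG_DEVICE, interval=10.0,
                  is_healthy=lambda: True, max_feeds=None):
    """Keep the hardware watchdog from firing while the system responds.

    Hypothetical helper for illustration only. Returns the number of
    keepalive writes that were made.
    """
    # Opening the device arms the watchdog timer.
    fd = os.open(device, os.O_WRONLY)
    feeds = 0
    try:
        while max_feeds is None or feeds < max_feeds:
            if not is_healthy():
                # Stop feeding and close WITHOUT the magic character, so the
                # timer keeps running and the hardware resets the board.
                return feeds
            # Any write counts as a keepalive and resets the timer.
            os.write(fd, b"\0")
            feeds += 1
            # Assumption: interval is shorter than the hardware timeout,
            # which is driver specific.
            time.sleep(interval)
        # Clean shutdown: "magic close" asks the driver to disarm the timer.
        os.write(fd, b"V")
        return feeds
    finally:
        os.close(fd)
```

On a saturated system the health check is not even needed: if the feeder process no longer gets scheduled, the writes stop on their own and the hardware resets the board once the timeout expires. That is exactly the recovery path that no kernel or core rollback gives you.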