Looking at the reports of some systems failing to boot after the latest UEFI DBX update and wondering whether it's another case of https://mjg59.dreamwidth.org/22855.html (mjg59 | Samsung laptop bug is not Linux specific)
We ended up writing some hilariously crappy workarounds for Linux to prevent this kind of thing, where firmware fails to boot if the UEFI variable store is too full. We check whether the write would leave under 5KB of free space (as reported by the firmware), and refuse the write if it would. Easy! Except some firmware would never actually increase the available space count if a variable was deleted, so after a while we'd just never be able to write any more variables
I can't remember how I figured this out, but the affected machines would trigger garbage collection if we tried to create a variable bigger than the available space, so Linux just tries to create a giant variable and then deletes it again to force the firmware to actually update the free space counter
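For anyone who wants to see the shape of that workaround, here is a rough toy model in Python. It is not the kernel code: `FakeFirmware`, `checked_write` and `MIN_FREE` are made-up names, the store is just a dict, and the "oversized write triggers garbage collection" behaviour is simulated as described above rather than taken from any real firmware. The dummy write is sized larger than the whole store (an assumption of this sketch) so that it can never actually succeed in the model.

```python
MIN_FREE = 5 * 1024  # the 5KB reserve described above


class FakeFirmware:
    """Toy model of a buggy variable store (hypothetical, for illustration).

    Deleting a variable does not raise the reported free space until
    garbage collection runs, and garbage collection only runs when a
    write larger than the reported free space is attempted.
    """

    def __init__(self, size):
        self.size = size
        self.vars = {}
        self.reported_free = size  # what the firmware claims is free

    def query_free(self):
        return self.reported_free

    def _garbage_collect(self):
        # Recompute the free counter from what is actually stored.
        used = sum(len(v) for v in self.vars.values())
        self.reported_free = self.size - used

    def set_variable(self, name, data):
        if len(data) > self.reported_free:
            # Oversized write: this is the trigger for garbage collection.
            self._garbage_collect()
            if len(data) > self.reported_free:
                return False
        self.vars[name] = data
        self.reported_free -= len(data)
        return True

    def delete_variable(self, name):
        # The bug: space is released, but the counter is not updated.
        self.vars.pop(name, None)


def checked_write(fw, name, data):
    """Refuse writes that would leave under MIN_FREE bytes free."""
    if fw.query_free() - len(data) < MIN_FREE:
        # The counter may be stale, so poke the firmware with a giant
        # dummy variable to force garbage collection, then remove it.
        fw.set_variable("dummy", bytes(fw.size + 1))
        fw.delete_variable("dummy")
        if fw.query_free() - len(data) < MIN_FREE:
            return False  # genuinely too full, refuse the write
    return fw.set_variable(name, data)
```

In this model a 16KB store that has had an 8KB variable written and deleted still reports 8KB free, so a second 8KB write only goes through after the dummy write refreshes the counter. A later write that really would dip under the 5KB reserve is still refused.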
@mjg59 Fucking UEFI. The ring -1 (or is it -42 now? does anyone even know or care?) shit nobody asked for.
@dalias @mjg59 UEFI and ACPI solve the problem of booting the same image on a wide variety of machines. From a distro point of view, that's a huge win. Whether it is worth the huge number of tradeoffs is another question.
@alwayscurious @mjg59 Device tree solves that problem. UEFI and ACPI both address a much larger-scope problem (which a lot of us don't want) of having a persistent layer under your trusted OS that you also have to trust, that continues execution after control was supposed to be passed to the OS, and that the OS is forced to interact with to access important functionality.
@dalias @mjg59 Device tree would be a solution if all of the board-specific code reached mainline, but it doesn't. See @mjg59's commentary on the subject.
@alwayscurious @mjg59 Sorry, I wasn't clear. I don't mean it's a way you can do things right now. I mean the problem of "booting same image on diverse hardware" is a much smaller-scope problem domain than what UEFI and ACPI do ("abstraction layer/runtime under your trusted OS kernel"). And some of us are very very unhappy that we're expected to accept the latter in order to get the former.
@dalias @alwayscurious @mjg59 yeah we've had enough ACPI problems in recent years that we're starting to see that as a serious problem
@dalias @alwayscurious @mjg59 ACPI was a big barrier to Linux compatibility back in the 90s, because manufacturers only cared about Microsoft so Linux had to reverse-engineer every single thing and figure out how to do it in an open context
@dalias @alwayscurious @mjg59 then everyone basically got their shit together and there wasn't a ton of new stuff that was actually critical, for about twenty years
@dalias @alwayscurious @mjg59 ... but just last month we ran into a problem where Linux is compatible with a motherboard and an m.2 card, but not when they're used together, because this new super-ACPI thing has them talking directly to each other in opaque, non-standard ways and Intel and AMD are at war