It's late enough to be hacker hours, if you're as old as I am. Gonna write down a bunch of rambly thoughts about #xz and #autoconf and capital-F Free Software sustainability and all that jazz. Plan is to edit it into a Proper Blog Post™ tomorrow. Rest of the thread will be unlisted but boosts and responses are encouraged.

Starting with the very specific: I do not think it was an accident that the xz backdoor's exploit chain started with a modified version of a third party .m4 file to be compiled into xz's configure script.

It's possible to write incomprehensible, underhanded code in any programming language. There's competitions for it, even. But when you have a programming language, or perhaps a mashup of two languages, that everyone *expects* not to be able to understand — no matter how careful the author is — well, then you have what we might call an attractive nuisance. And when blobs of code in that language are passed around in copy-and-paste fashion without much review or testing or version control, that makes it an even easier target.

So, in my capacity as one of the last few people still keeping autoconf limping along, I'm thinking pretty hard about what could be done to replace its implementation language, and concurrently what could be done to improve development practice for both autoconf and its extensions (the macro archive, gnulib, etc.)

Sidebar: I know a lot of people are saying "time to scrap autotools for good, everyone should just use cmake/meson/GN/Bazel/..." I have a couple of different responses to that depending on my mood, but the important one right now is: Do you honestly believe that your replacement of choice is *enough* better, readability-wise, that folks will actually review patches to build machinery carefully enough to catch this kind of insider attack?

On the subject of implementation language, I have one half-baked idea and one castle in the air.

The half-baked idea is: Suppose ./configure continues to be a shell script, but it ceases to be a *generated* shell script. No more M4. Similarly, the Makefile continues to be a Makefile but it ceases to be generated from Makefile.am. Instead, there is a large library of shell functions and a somewhat smaller library of Make rules that you include and then use.

For ./configure I'm fairly confident it would be possible to do this and remain compatible with POSIX.1-2001 "shell and utilities". (Little known fact: for a long time now, autoconf scripts *do* use shell functions! Internally, wrapped in multiple layers of M4 goo, but still — we haven't insisted on backcompat all the way to System V sh in a long, long time.) For Makefiles I believe it would be necessary to insist on GNU Make.
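
To make the half-baked idea a bit more concrete, here is a minimal sketch of what a hand-written ./configure built on a function library might look like, aiming to stay within plain POSIX sh. Every `cf_*` name is hypothetical and made up for illustration; nothing like this exists in autoconf today, and a real library would need far more care (cross compilation, caching, logging, odd characters in PATH).

```sh
#!/bin/sh
# Hypothetical sketch only: the cf_* functions are invented for this example.
# In practice they would live in a shared file that configure sources,
# e.g. ". ./cf-lib.sh" (also a made-up name).

cf_results=''   # accumulated NAME=value lines

# cf_msg TEXT...: progress message on stderr.
cf_msg() { printf '%s\n' "$*" >&2; }

# cf_set NAME VALUE: record one configuration result.
cf_set() {
    cf_results="${cf_results}$1=$2
"
}

# cf_find_prog NAME...: print the path of the first NAME found in $PATH.
# Returns nonzero if none of the names is found.
cf_find_prog() {
    for cf_prog in "$@"; do
        cf_old_ifs=$IFS; IFS=:
        for cf_dir in $PATH; do
            IFS=$cf_old_ifs
            cf_path=${cf_dir:-.}/$cf_prog
            if [ -f "$cf_path" ] && [ -x "$cf_path" ]; then
                printf '%s\n' "$cf_path"
                return 0
            fi
        done
        IFS=$cf_old_ifs
    done
    return 1
}

# cf_have_header COMPILER HEADER: succeed if a file including HEADER compiles.
cf_have_header() {
    printf '#include <%s>\nint main(void) { return 0; }\n' "$2" > conftest.c
    "$1" -c conftest.c -o conftest.o >/dev/null 2>&1
    cf_status=$?
    rm -f conftest.c conftest.o
    return $cf_status
}

# --- the project's own part of ./configure would start here ---
if CC=$(cf_find_prog cc gcc clang); then
    cf_set CC "$CC"
    if cf_have_header "$CC" stdio.h; then
        cf_set HAVE_STDIO_H 1
    else
        cf_set HAVE_STDIO_H 0
    fi
else
    cf_msg 'warning: no C compiler found in PATH'
fi
printf '%s' "$cf_results"
```

The project-specific part is the last dozen lines, and it reads top to bottom with no generation step in between, so a reviewer sees exactly the text that will run.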

This would definitely be an improvement on the status quo, but would it be *enough* of one? And would it be less work than migration to something else? (It would be a compatibility break and it would *not* be possible to automate the conversion. Lots of work for everyone no matter what.)

Suppose that's not good enough. Bourne shell is still a shitty programming language, and in particular it is really dang hard to read, especially if you're worried about malicious insiders. Which we are.

Now we have another problem. The #1 selling point for autotools vs all other build orchestrators is "no build dependencies if you're working from tarballs," and the only reason that works is you can count on /bin/sh to exist on anything that purports to be Unix. If we want to stop using /bin/sh, we're going to have to make people install something else first, and that something else needs to be a small and stable Twinkie. Python need not apply (sorry, Meson).

What's small and stable enough? Lua is already too large, and at the same time, too limited.

There's one language that's famous for being tiny, flexible, and pleasantly readable once you wrap your head around it: Forth.

If I had investments to live off, I would be sorely tempted to take the next year or so and write my own Forth that was also a shell language and a build orchestrator, and then have a look at rewriting Autoconf in *that.* This is the castle in the air.

@zwol In #bootstrapping circles, we have GNU Mes and Gash (the combination of which is good enough to run ./configure scripts).

The Racket folks have switched to Zuo as their build system, also based on a minimal Scheme implementation.

Maybe not a universal option, but I can imagine a build system based on Mes/Zuo, at least in the circles I care about.

@civodul Hmm, the attack would need to be a bit more sophisticated for GNU Mes and Gash as implemented in Guix. But still…

Instead of targeting plain Bash, one needs to target the Guix package ’guile-bootstrap’. This package depends on tar, bash, mkdir and xz, which adds some attack surface.

Alternatively, it would be possible to exploit the non-deterministic Gash compilation to hide stuff.

https://simon.tournier.info/posts/2023-10-01-bootstrapping.html

The attack would be much more complicated, I guess.

@zwol

Is Guix full-source bootstrap a lie?

@zimoun @civodul I thought about it some more and absolute size is not the most important issue here; the most important issues are (1) how difficult is it to install the thing, and (2) how much more readable than sh(+m4)+make do you get for the effort. That said, size does matter in that someone might want to audit the language they're being asked to install, on top of everything else. And the big popular interpreted languages tend to have large dependency graphs, which makes their true size even bigger, makes them harder to install, and makes problems for bootstrapping.

Python and Perl are very large (current releases are ~1.2M lines of code each according to SLOCCount), nontrivial to install from source, and problematic at the lowest levels of the bootstrap chain.

A mostly complete implementation of the POSIX shell and utilities, namely busybox, fits into 200,000 lines. bash+coreutils is missing important pieces (grep, sed, awk, find, diff are the ones I know about) and is about twice as big.

mes+gash+gash-utils is ~70,000 lines. Lua is ~20,000. Neither Scheme nor Lua feels like *enough* of a readability improvement over sh to be worth the switching costs.

I would say that 20,000 lines of C is about the upper limit for what I'd feel comfortable demanding people install before they can build the thing they actually wanted to build.

Furthermore, any such component cannot require a complex configure+build process itself lest we have a circular dependency.

@zwol @zimoun To be fair, Mes includes a C library, a C compiler with 4 backends, etc. The part that would matter here is the interpreter, which is ~6K lines of C under src/.

Zuo has an interpreter with ~8K lines of C and ~5K lines of Zuo (Scheme).

This should be compared with the line counts of Perl + Auto{conf,make} + Make or CMake + Make/Ninja.

@zimoun @zwol Speaking of build systems: in 2008, Tom Tromey wrote Quagmire, a proof-of-concept replacement of Autoconf + Automake, mostly compatible with the latter, implemented in GNU Make (~1K lines).

https://tromey.com/blog/?cat=16
https://github.com/tromey/quagmire

It’s appealing because GNU Make is ubiquitous and ‘Quagmire’ files looked very much like ‘Makefile.am’.

The downside is that it’s hard to debug and work with (lots of ‘eval’ tricks…). Less appealing than Zuo or similar to me.

quagmire — The Cliffs of Inanity

@civodul @zimoun Oh, I remember when Tom announced Quagmire. Hard to debug is exactly what we don't want, though; I bet it would be easy to hide a back door in 1000 lines of cryptic Makefile tricks.

@zwol @zimoun @civodul I’ve seen Fossil use Tcl for its configure script and ship a micro-Tcl called jim, which is a single .c file, for when you don’t have Tcl.

@zimoun Regarding gash in another Scheme: that would be solved most beautifully if gash could be run by the Scheme from @janneke ’s MES. @civodul @zwol

@ArneBab Hmm, not really, as far as I understand.

There is a chicken-or-the-egg problem for the "driver" (currently guile-bootstrap). From my understanding, the only option for removing tar, bash, mkdir and xz is to have an implementation directly in binary (hex).

Well, that’s what I detail in the sections:

« Analysing guile-bootstrap derivation »
« Opinionated next steps »

from: https://simon.tournier.info/posts/2023-10-01-bootstrapping.html

@zwol @civodul @janneke


@zimoun I think I understand what you mean: this still needs something to run it.

Maybe avoiding compression for tarballs could make xz unnecessary. Would it then suffice to have the scheme from mes capable of running as driver?

mes bootstraps from binary, IIRC, so this could alleviate more problems? https://www.gnu.org/software/mes/manual/mes.html#Full-Source-Bootstrap

@zwol @civodul @janneke

GNU Mes Reference Manual


@zimoun (or is the driver needed for mes itself? ⇒ we’d need a driver built on m2-planet? Or only on parts of mes?) @zwol @civodul @janneke

@ArneBab « Would it then suffice to have the scheme from mes capable of running as driver? »

See Fig.1 https://simon.tournier.info/posts/2023-10-01-bootstrapping.html

The question is how to run ’bootstrap-seeds’, which is stage0 and M2-Planet. We need guile-bootstrap (driver) which relies on helpers (bash, tar, etc.)

Chicken-or-the-egg problem. We need a binary driver to get MES. IMHO, the only option is a binary driver written by hand, somehow.

@janneke @civodul @zwol


@zimoun @ArneBab @zwol @civodul That's a pretty nice writeup, with some valid concerns.