Zack Weinberg Apr 2, 2024

It's late enough to be hacker hours, if you're as old as I am. Gonna write down a bunch of rambly thoughts about #xz and #autoconf and capital-F Free Software sustainability and all that jazz. Plan is to edit it into a Proper Blog Post™ tomorrow. Rest of the thread will be unlisted but boosts and responses are encouraged.

Zack Weinberg Apr 2, 2024

Starting with the very specific: I do not think it was an accident that the xz backdoor's exploit chain started with a modified version of a third party .m4 file to be compiled into xz's configure script.

It's possible to write incomprehensible, underhanded code in any programming language. There's competitions for it, even. But when you have a programming language, or perhaps a mashup of two languages, that everyone *expects* not to be able to understand — no matter how careful the author is — well, then you have what we might call an attractive nuisance. And when blobs of code in that language are passed around in copy-and-paste fashion without much review or testing or version control, that makes it an even easier target.

So, in my capacity as one of the last few people still keeping autoconf limping along, I'm thinking pretty hard about what could be done to replace its implementation language, and concurrently what could be done to improve development practice for both autoconf and its extensions (the macro archive, gnulib, etc.)

Zack Weinberg Apr 2, 2024

Sidebar: I know a lot of people are saying "time to scrap autotools for good, everyone should just use cmake/meson/GN/Bazel/..." I have a couple of different responses to that depending on my mood, but the important one right now is: Do you honestly believe that your replacement of choice is *enough* better, readability-wise, that folks will actually review patches to build machinery carefully enough to catch this kind of insider attack?

Zack Weinberg Apr 2, 2024

On the subject of implementation language, I have one half-baked idea and one castle in the air.

The half-baked idea is: Suppose ./configure continues to be a shell script, but it ceases to be a *generated* shell script. No more M4. Similarly, the Makefile continues to be a Makefile but it ceases to be generated from Makefile.am. Instead, there is a large library of shell functions and a somewhat smaller library of Make rules that you include and then use.

For ./configure I'm fairly confident it would be possible to do this and remain compatible with POSIX.1-2001 "shell and utilities". (Little known fact: for a long time now, autoconf scripts *do* use shell functions! Internally, wrapped in multiple layers of M4 goo, but still — we haven't insisted on backcompat all the way to System V sh in a long, long time.) For Makefiles I believe it would be necessary to insist on GNU Make.
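(Editor's sketch, not from the thread: one way the "library of shell functions" idea might look. The names `cfg_find_prog` and `cfg_define` are hypothetical and do not exist in autoconf; the split into a sourced library part and a hand-written project part is an assumption about how such a design could be laid out.)

```shell
#!/bin/sh
# Hypothetical sketch only. cfg_find_prog and cfg_define are invented names,
# not part of autoconf or any existing library.

# ---- library part (would be shipped as a file and sourced with `.`) ----

cfg_defines=""

# cfg_find_prog VAR PROG...
# Set VAR to the first PROG that is found on PATH; return 1 if none is.
# `command -v` is used for brevity; a real library might walk PATH by hand.
cfg_find_prog() {
    cfg_var=$1
    shift
    for cfg_prog in "$@"; do
        if command -v "$cfg_prog" >/dev/null 2>&1; then
            eval "$cfg_var=\$cfg_prog"
            return 0
        fi
    done
    eval "$cfg_var="
    return 1
}

# cfg_define NAME VALUE
# Append one "#define NAME VALUE" line to the accumulated config.h text.
cfg_define() {
    cfg_defines="${cfg_defines}#define $1 $2
"
}

# ---- project part (what a hand-written ./configure might contain) ----

if cfg_find_prog CC cc gcc clang; then
    cfg_define HAVE_C_COMPILER 1
else
    echo "configure: warning: no C compiler found" >&2
fi

# Print the result instead of writing config.h, so the sketch has no side effects.
printf '%s' "$cfg_defines"
```

The point of the sketch is that everything a reviewer needs to read is plain shell: the project part is a few lines calling named functions, and the library part can be audited once and versioned like any other source file.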

This would definitely be an improvement on the status quo, but would it be *enough* of one? And would it be less work than migration to something else? (It would be a compatibility break and it would *not* be possible to automate the conversion. Lots of work for everyone no matter what.)

Zack Weinberg Apr 2, 2024

Suppose that's not good enough. Bourne shell is still a shitty programming language, and in particular it is really dang hard to read, especially if you're worried about malicious insiders. Which we are.

Now we have another problem. The #1 selling point for autotools vs all other build orchestrators is "no build dependencies if you're working from tarballs," and the only reason that works is you can count on /bin/sh to exist on anything that purports to be Unix. If we want to stop using /bin/sh, we're going to have to make people install something else first, and that something else needs to be a small and stable Twinkie. Python need not apply (sorry, Meson).

What's small and stable enough? Lua is already too large, and at the same time, too limited.

There's one language that's famous for being tiny, flexible, and pleasantly readable once you wrap your head around it: Forth.

If I had investments to live off, I would be sorely tempted to take the next year or so and write my own Forth that was also a shell language and a build orchestrator, and then have a look at rewriting Autoconf in *that.* This is the castle in the air.

Josh Triplett

While I completely agree that shell is not a great language to work with, I'll take it over Forth any day. And even more importantly, I think it's important to not introduce a new implementation of a language for this purpose. I'd rather have not just an established language but an established implementation of that language.

If you're already expecting people to have gmake installed, how about using Guile/Scheme, which gmake already has built-in support for? That would be a substantial improvement over shell, and Scheme has a respectable library and macro system with which to provide a variety of helpful functions people can use.

(That said, I think Python is still a reasonable choice as well, and one that many many systems will already have installed.)

Zack Weinberg Apr 2, 2024

@josh Huh, you dislike Forth that much? The *only* non-esoteric programming languages I would personally rate as worse than Bourne shell are C shell, DOS batch, VMS DCL, and Tcl, just so you know where I'm coming from here.

I do not actually *like* the idea of requiring GNU Make, and I think Guile is a non-starter for the same reason I don't think Python or Perl are an option: it would make architecture bootstrap worse. I get why you want an established implementation of an established language, but that's very much in tension with "the interpreter for the replacement implementation language for ./configure should not need a ./configure itself".

Josh Triplett Apr 2, 2024

I would *much* rather program build systems or anything else in shell than in stack-oriented RPN, yes. And that aside, I think Forth or any other non-everyday language fails in a similar fashion to m4: most people will treat it as "you are not expected to understand (or review) this" magic to be copy-pasted, precisely *because* the only time most people would encounter it is in this hypothetical successor to autoconf. We should strive for code that's easy and inviting to review using skills many people already have from non-build-system code.

And I don't think we should make everyone's experience with build systems worse just to make architecture bootstrap marginally easier. Let's find a solution for architecture bootstrap, and then let people write Python. If that means we need to ensure we can build Python without Python, or handle Python via build-an-old-version-then-successively-newer-versions, so be it.

Zack Weinberg Apr 3, 2024

@josh That's a really good point about languages people already know. I suppose we could try to define a subset of a fixed older version of Python (3.6 or so) that was sufficient to run Meson. A Scheme subset as suggested in another branch (GNU Mes) seems like it would be less work, though.

Josh Triplett Apr 3, 2024

I don't think it's sustainable to force an older version of anything, except for bootstrapping (e.g. use old Python to build new Python).

Scheme would be less work for bootstrapping but more work for users (since most users won't be using Scheme for anything else). I'd rather have less work for users, which means more potential reviewers.

It's challenging to make Python code opaque, and doing so is automatically suspicious.

(As an aside, when I say "Python" I'm not automatically assuming "to run Meson". Meson has some nice properties, but it is not the be-all and end-all of build systems, and it seems far too opinionated to be a universal build system for everyone.)

Josh Triplett Apr 3, 2024

(Using Mes to bootstrap the world is still a good idea, though. But the tools used for bootstrap don't need to be the same as the tools used for day-to-day builds.)