Look, this isn’t at all a defense of slop code, but it has me thinking: how much does code quality matter, and why?

It’s maintenance, right? We care about readability because we know we’ll have to make changes, fix bugs, etc.

But so … imagine a codebase that’s magically bug-free and feature-complete. (I’m aware this is a strawman - that’s the point, it’s a thought experiment.) Does it matter if this codebase is well-written? I’m not sure it does! (1/5)

Code quality has always been ONE factor; it hasn’t always been the most important one. E.g., we often accept complex internals as the price for a clean external API, and we all write sloppy code for one-offs, prototypes, etc. So part of me accepts the “code quality doesn’t matter” argument. I can see a vision of agentic engineering with systems that prove correctness; if an agent produces code that is provably correct, maybe the quality really doesn’t matter! (2/5)
I’m far from convinced that this is actually possible. It’s certainly not now — and I’m not talking about models. Testing and verification tools are nowhere near where they’d need to be, regardless of model quality. Today, code quality DOES still matter; even the best-case version of agentic engineering can’t produce code that’ll never require maintenance. But I can see a possible future where code quality might not matter, or will matter a lot less, and that’s FASCINATING. (3/5)
Specifically what I find fascinating is: the tooling that would be required to make agentic engineering begin to live up to the hype — much better testing tools, formal business logic specification languages, more powerful and easier to use formal verification tools, better static analysis tooling, etc — would be massively useful to software engineering quite regardless of the existence/utility/quality of LLMs. (4/5)
Will we actually build them? I sort of doubt it: the history of software development, and of course the current trajectory, suggests we’ll continue to yolo our way through it. I wouldn’t exactly say I’m optimistic, but hope springs eternal. (5/5)

@jacob Responding to the thread as a whole, I think code readability will matter just as much as cockpit black boxes matter.

The universe is against us even with bug-free hardware and software. Even the perfect self-driving car will kill some people. The perfect AI doctor will lose some patients. The perfect automated factory will assemble some lemons.

We will need to be able to learn why and how it happened. Step by step. So, logs and readable code. Otherwise, trusting automation is impossible.

@ambv @jacob Definitely...

I was going to say something about accountability, but this is pretty much my thought too.

Especially as it's incredibly unlikely that anyone would one-shot a perfect implementation, even more so for something that is extremely critical.

If we can't understand "why" something went wrong, how do we account for it? Was it negligence? Can someone be held accountable? Is the prompting wrong or is the code wrong?

I can see why that's a dream scenario for a corporation, though.

@pythonbynight @ambv Ok yes I like this: “explainability” (I need a better word but fine) is one reason why code quality matters. Like, if a dam fails, we’d really like to have the blueprints (and for them to be readable!) so we can figure out where we screwed up. “The dam works because the lake holds water” isn’t sufficient.
@jacob @pythonbynight @ambv "Explainability" is the word generally used for this requirement when it comes to the use of machine learning systems in finance (loan approvals, that kind of thing), so if there's a better word, it hasn't been found yet.
@ancoghlan @jacob @pythonbynight @ambv there's also "legibility", in the Seeing Like a State sense, i.e. you have people (or LLMs) who report to you, and you need to be able to make sense of what they're doing in order to hold them accountable.
@jacob I think this is a subtle category error, in the sense that "readability" (and "code quality" more generally) is a transitive adjective. Readable to whom? High quality according to whose taste? We strive for an "objective" sense of code quality because the audience we are usually addressing is the pool of potential candidates who may become future maintainers of the code, and that's a nebulous group. But that group's nebulousness must eventually be removed as it becomes "the current team".
@jacob all of that is to say: the reason readability matters is that if you want to maintain some code, some specific group of humans must maintain an understanding of its structure, such that they can effectuate *and be held accountable for* required changes. The putative cost reduction of an "agentic" tool inherently assumes that you can shrink that group by replacing some of them with LLMs. Maybe, eventually, some will be able to. But ultimately *somebody* still needs to read all the code.
@jacob the world in which agentic tools could have the level of success that you're imagining with trash-level code quality would seem to me to be the same world where anyone annoyed with a subscription-model app would simply download a 30-year-old abandonware replacement from archive dot org, because they'd be comfortable with long-term stasis. which seems unrealistic to me.