Agreed with everything @kevinriggle wrote here. Another angle on this to try with people who simply do not understand what software engineering •is•: “What’s the impact on the other 7/8?”

AI can generate code fast. Often it’s correct. Often it’s not, but close. Often it’s totally wrong. Often it’s •close• to correct and even appears to work, but has subtle errors that must be corrected and may be hard to detect.

1/ https://ioc.exchange/@kevinriggle/113641234199724146

Kevin Riggle (@[email protected])

What I’m taking from this is that software engineers spend most of our time on engineering software, and writing code is (as expected) a relatively small portion of that work. Imagine this for other engineering disciplines. “Wow structural engineers seem to spend most of their time on meetings and CAD and relatively little time physically building bridges with their hands! This is something AI can and should fix. I am very smart” https://fortune.com/2024/12/05/amazon-developers-spend-hour-per-day-coding/


All the above is also true (though perhaps in different proportions) of humans writing code! But here’s the big difference:

When humans write the code, those humans are •thinking• about the problem the whole time: understanding where those flaws might be hiding, playing out the implications of business assumptions, studying the problem up close.

When AI writes the code, none of that happens. It’s a tradeoff: faster code generation at the cost of reduced understanding.

2/

The effect of AI is to reduce the cost of •generating code• by a factor of X at the cost of increasing the cost of •thinking about the problem• by a factor of Y.

And yes, Y>1. A thing non-developers do not understand about code is that coding a solution is a deep way of understanding a problem — and conversely, using code that’s dropped in your lap greatly increases the amount of problem that must be understood.

3/

Reduce the cost of generating code by a factor of X; increase the cost of understanding by a factor of Y. How much bigger must X be than Y for that to pay off?

Check that OP again: if software engineers spend on average 1 hr/day writing code, and assuming (optimistically!) that they work only 8-hour days, then a napkin sketch of your AI-assisted cost of a day’s work is:

1 / X + 7 × Y (hours, versus today’s 1 + 7 = 8)

That means even if X = ∞ (and it doesn’t, but even if!!), then Y cannot exceed 8/7 ≈ 1.14.
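The napkin math above, as a runnable sketch. (The 1-hour/7-hour split and the 8-hour day are the assumptions from this thread, not measurements.)

```java
// Napkin model: a dev day is 1 hour of coding + 7 hours of thinking = 8 hours.
// AI divides the coding cost by X but multiplies the thinking cost by Y,
// so the new day costs 1/X + 7*Y hours. It pays off only if that's under 8.
public class NapkinMath {
    /** Largest Y that still breaks even, given a coding speedup of X. */
    static double breakEvenY(double x) {
        return (8.0 - 1.0 / x) / 7.0;
    }

    public static void main(String[] args) {
        System.out.println(breakEvenY(2.0));                      // a modest 2x speedup
        System.out.println(breakEvenY(Double.POSITIVE_INFINITY)); // even infinite speedup: 8/7
    }
}
```

Even at X = ∞ the break-even Y is 8/7 ≈ 1.143, which is the bound in the post above.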

Hey CXO, you want that bet?

4/

This is a silly thumbnail-sized model, and it’s full of all kinds of holes.

Maybe devs straight-up waste 3 hours a day, so then the break-even is Y < 1.25 instead! Maybe the effects are complex and nonlinear! Maybe this whole quantification effort is doomed!

Don’t take my math too seriously. I’m not actually setting up a useful predictive model here; I’m making a point.

5/

Though my model is quantitatively silly, it does get at the heart of something all too real:

If you see the OP and think it means software development is on the cusp of being automatable because devs only spend ≤1/8 of their time actually typing code, you’d damn well better understand how they spend the other ≥7/8 of their time — and how your executive decisions, necessarily made from a position of ignorance* if you are an executive, impact that 7/8.

/end (with footnote)

* Yes, executive decisions are •necessarily• made from a position of ignorance. The point of having these high-level roles isn’t (or at least should not be) amassing power, but rather having people looking at things at different zoom levels. Summary is at the heart of good management, along with the humility to know that you are seeing things in summary. If you know all the details, you’re insufficiently zoomed out. If you’re zoomed out, you have to remember how many details you don’t know.

Yes, this from @sethrichards is what I’m talking about:
https://mas.to/@sethrichards/113642032055823958

I had a great moment with a student the other day dealing with some broken code. The proximate problem was a Java NullPointerException. The proximate solution was “that ivar isn’t initialized yet.”

BUT…

Seth Richards (@[email protected])

@[email protected] I agree with all of this, and I'd add: When I'm writing code, I'm *learning* about the problem as well through the process. When I fix a bug in my code, I (hopefully) learn not to make the same mistake again. When I help someone on the team fix a bug in their code, we both learn something. If we write documentation or a unit test to make sure the bug doesn't happen again, the organization "learns" something too. It's unclear to me whether AI is even capable of learning in this way.


…It wasn’t initialized because they weren’t thinking about the lifecycle of that object because they weren’t thinking about when things happen in the UI because they weren’t thinking about the sequence of the user’s interaction with the system because they weren’t thinking about how the software would actually get used or about what they actually •wanted• it to do when it worked.

The technical problem was really a design / lack of clarity problem. This happens •constantly• when writing code.
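A minimal sketch of that class of bug, with all names invented for illustration: the field is only initialized by one step of the lifecycle, and nothing forces that step to happen before the field is used.

```java
// Hypothetical reconstruction of the student's bug (all names invented).
// The proximate problem: `currentUser` is null when render() runs.
// The real problem: nobody decided *when* the screen gets its data.
class User {
    final String name;
    User(String name) { this.name = name; }
}

class ProfileScreen {
    private User currentUser; // null until loadUser() runs

    void loadUser() { currentUser = new User("Ada"); }

    String render() {
        // Without this guard, calling render() before loadUser() throws
        // NullPointerException, the proximate symptom in the anecdote.
        if (currentUser == null) return "loading...";
        return "Hello, " + currentUser.name;
    }
}

public class LifecycleSketch {
    public static void main(String[] args) {
        ProfileScreen screen = new ProfileScreen();
        System.out.println(screen.render()); // before the data exists
        screen.loadUser();
        System.out.println(screen.render()); // after
    }
}
```

The one-line fix (the null guard) only makes sense once someone has thought through the sequence of the user’s interaction, which is exactly the design work described above.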

A good point from @rgarner, looking at this through the lens of Fred Brooks’s “No Silver Bullet”:
https://mastodon.social/@rgarner/113642040777621582
@inthehands @rgarner Imagine that through an act of god (or the courts …) you are suddenly in possession of a large, working code base. Say, Windows. Or Facebook. But you have access to none of the engineers. Imho you might as well ignore the code; there is basically nothing you can do before you go insane. Bug fixing? Minor things, yes. Larger-scale refactor: out of the question. Having AI generate a non-trivial code base is like that. Plus it may not actually work.

@inthehands I appreciate how your line of argument chimes with Fred Brooks' "No Silver Bullet", along the lines of essential complexity of the problem and the solution matching up, except the comparison here being the embedded understanding (hopefully) in code written specifically for the problem, versus the embedded unchecked assumptions in generated code.

Novel software libraries have a similar problem: easy to adopt, not necessarily easy to evaluate.

@inthehands Oh, why hello, Amdahl!
@OmegaPolice
That’s it: Amdahl’s Law, except the optimization actually creates large costs in the other parts of the system!
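The connection, as a sketch: classic Amdahl’s Law assumes the unoptimized portion is unchanged, while the argument here is that it gets more expensive (the Y factor from post 3/). The 1/8 coding fraction and the 8/7 break-even are the napkin numbers from earlier in the thread, not measurements.

```java
// Amdahl's Law: overall speedup when a fraction p of the work is sped up by s.
// The twist in this thread: "optimizing" the coding fraction also multiplies
// the cost of the remaining (1 - p) of the work by y.
public class AmdahlTwist {
    static double amdahl(double p, double s) {
        return 1.0 / ((1.0 - p) + p / s);
    }

    static double amdahlWithOverhead(double p, double s, double y) {
        return 1.0 / ((1.0 - p) * y + p / s);
    }

    public static void main(String[] args) {
        double p = 1.0 / 8.0; // coding is 1/8 of the day, per the linked article
        // Classic Amdahl: infinite coding speedup caps total speedup at 8/7.
        System.out.println(amdahl(p, Double.POSITIVE_INFINITY));
        // With the thinking overhead at its break-even value y = 8/7,
        // even infinite coding speedup yields no speedup at all.
        System.out.println(amdahlWithOverhead(p, Double.POSITIVE_INFINITY, 8.0 / 7.0));
    }
}
```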
@inthehands The bet that a lot of these CXOs are making implicitly is that this will be like the transition from assembly to higher-level languages like C (I think most of them are too young and/or too disconnected to make it explicitly). And I'm not 100% sold on it but my 60% hunch is that it's not.
@kevinriggle
Yeah, I’ve heard that thought too. It’s tantalizing nonsense. I could write about this at length, and maybe one day I will, but the very short version is that automation is not even remotely the same thing as abstraction.

@inthehands Yes! Yes. This is it exactly.

One can imagine a version of these systems where all the "source code" is English-language text describing a software system, and the Makefile first runs that through an LLM to generate C or Python or whatever before handing it off to a regular compiler, which would in some sense be more abstraction, but this is like keeping the .o files around and making the programmers debug the assembly with a hex editor.

@inthehands I would love to read your thoughts when you get a chance
@kevinriggle
That’s exactly the line of thought, yes. And the thing that makes abstractions useful, if they are useful, is that they make good decisions about what doesn’t matter, what can be standard, and what requires situation-specific thought. Those decisions simultaneously become productivity boost, safety, and a metaphor that is a tool for thought and communication.
@kevinriggle
What happens when the semantics of your abstractive model are defined by probabilistic plagiarism, and may change every single time you use it? That might be good for something, I guess??? But it doesn’t remotely resemble what a high-level language does for assembly.
@inthehands One could imagine using a fixed set of model weights and not retraining, using a fixed random seed, and keeping the model temperature relatively low. I'm imagining on some level basically the programming-language-generating version of Nvidia's DLSS tech here. But that's not what people are doing and I'm not convinced if we did that it would be useful

@kevinriggle
Even if that gave semantically stable answers, which I’m not convinced it would, it still skips that all-important step where there’s communication and reflection and consensus building.

I suppose there’s some help in approaches where an LLM generates plausible answers and then some semantically reliable verification checks that the results aren’t nonsense. But we’re really stretching it.

@kevinriggle @inthehands
Libraries, tool sets, IDEs, modular code, and other forms of re-use have already picked up all the low hanging fruit like this. There's nothing for an "AI" to do here that isn't already more accurately done by an experienced coder.
@kevinriggle @inthehands
The primary job of a development team is the creation and maintenance of a shared mental model of what the software does and how it does it. Periodically, they change the code to implement changes to the mental model that have been agreed upon, or to correct places where the code does not match the model. An LLM cannot reason and does not have a theory of mind, and as such cannot participate in the modeling process or meaningfully access that model (written documentation is at best a memory aid for the model), and thus cannot actually do anything that matters in the process. The executive class would prefer that other people in the org not be permitted to think, let alone paid for it, and therefore willfully confuses the output with the job.

@dymaxion @kevinriggle

That’s all well said, and gets to what @jenniferplusplus was talking about here: https://jenniferplusplus.com/losing-the-imitation-game/

Losing the imitation game

AI cannot develop software for you, but that's not going to stop people from trying to make it happen anyway. And that is going to turn all of the easy software development problems into hard problems.

@inthehands Another bull-case argument about LLMs is that they are a form of autonomation (autonomy + automation), in the sense that the Toyota Production System uses it, the classic example being the automated loom which has a tension sensor and will stop if one of the warp yarns break. But we already have many such systems in software, made out of normal non-LLM parts, and also that's ... not really what's going on here, at least the way they're currently being used.
@kevinriggle
Yeah, one of the troubles with these systems is that basically every metaphor we can think of for what they are is misleading.

@kevinriggle @inthehands
I find this idea quite amusing

I like the expression (German) "Kommunikation ist eine Aneinanderreihung von Missverständnis" (communication is a concatenation of misunderstanding)

Languages are so ambiguous, unclear, and context dependent that the idea of "just" writing English is funny.

And if you ever tried to write a specification or a manual you know that there is no "just"

@kevinriggle @inthehands

I have said recently that if LLMs could reliably and economically generate code that correctly implements the desired functionality, without bugs or security flaws (beyond what’s typical for humans), then, quite obviously, we should keep, version, and maintain the English descriptions ("prompts") of what we want, and discard the entirely redundant generated code.

But we don't.

Because they can't.

And there's really no chance that they will any time soon.

@inthehands @kevinriggle
Also, automation (introduction or use of automatic equipment in a process), without defining the process that is being automated, only leaves you with automatic equipment with nothing relevant to do.
- Defined process × automatic equipment = automation (0 × ∞ = 0)
AND
- Automation ÷ automatic equipment = no defined process (∞ × ∞ = ∞)