Agreed with everything @kevinriggle wrote here. Another angle on this to try with people who simply do not understand what software engineering •is•: “What’s the impact on the other 7/8?”

AI can generate code fast. Often it’s correct. Often it’s not, but close. Often it’s totally wrong. Often it’s •close• to correct and even appears to work, but has subtle errors that must be corrected and may be hard to detect.

1/ https://ioc.exchange/@kevinriggle/113641234199724146

Kevin Riggle (@[email protected])

What I’m taking from this is that software engineers spend most of our time on engineering software, and writing code is (as expected) a relatively small portion of that work. Imagine this for other engineering disciplines. “Wow structural engineers seem to spend most of their time on meetings and CAD and relatively little time physically building bridges with their hands! This is something AI can and should fix. I am very smart” https://fortune.com/2024/12/05/amazon-developers-spend-hour-per-day-coding/


All the above is also true (though perhaps in different proportions) of humans writing code! But here’s the big difference:

When humans write the code, those humans are •thinking• about the problem the whole time: understanding where those flaws might be hiding, playing out the implications of business assumptions, studying the problem up close.

When AI writes the code, none of that happens. It’s a tradeoff: faster code generation at the cost of reduced understanding.

2/

The effect of AI is to reduce the cost of •generating code• by a factor of X at the cost of increasing the cost of •thinking about the problem• by a factor of Y.

And yes, Y>1. A thing non-developers do not understand about code is that coding a solution is a deep way of understanding a problem — and conversely, using code that’s dropped in your lap greatly increases the amount of problem that must be understood.

3/

Reduce the cost of generating code by a factor of X; increase the cost of understanding by a factor of Y. How much bigger must X be than Y for that to pay off?

Check that OP again: if software engineers spend on average 1 hr/day writing code, and assuming (optimistically!) that they work only 8-hour days, then a napkin sketch of your AI-assisted cost of coding is:

1 / X + 7 * Y hours per day (vs. the current 8)

That means even if X = ∞ (and it doesn’t, but even if!!), then Y cannot exceed 8/7 ≈ 1.14.
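For the skeptical CXO, that arithmetic can be run directly. A toy sketch of the thread’s own numbers (the function name and structure are mine, not anything official):

```python
# Toy model from the thread: a dev's day = 1 hr generating code + 7 hrs
# thinking about the problem, 8 hrs total. AI divides the generation cost
# by X and multiplies the thinking cost by Y, so an AI-assisted day costs
# 1/X + 7*Y hours' worth of the old day.

def break_even_y(code_hours=1.0, think_hours=7.0, x=float("inf")):
    """Largest Y for which code_hours/x + think_hours*Y still beats the baseline day."""
    baseline = code_hours + think_hours
    return (baseline - code_hours / x) / think_hours

# Even with infinitely fast code generation, Y must stay below 8/7:
print(round(break_even_y(), 3))          # 1.143
# The "devs waste 3 hours a day" variant (1 hr coding + 4 hrs thinking):
print(round(break_even_y(1.0, 4.0), 3))  # 1.25
```

The point survives any reasonable choice of numbers: because coding is such a small slice of the day, even an infinite speedup there buys almost no headroom for making the rest of the work harder.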

Hey CXO, you want that bet?

4/

This is a silly thumbnail-sized model, and it’s full of all kinds of holes.

Maybe devs straight-up waste 3 hours a day, so the payoff condition becomes Y < 1.25 instead! Maybe the effects are complex and nonlinear! Maybe this whole quantification effort is doomed!

Don’t take my math too seriously. I’m not actually setting up a useful predictive model here; I’m making a point.

5/

Though my model is quantitatively silly, it does get at the heart of something all too real:

If you see the OP and think it means software development is on the cusp of being automatable because devs only spend ≤1/8 of their time actually typing code, you’d damn well better understand how they spend the other ≥7/8 of their time — and how your executive decisions, necessarily made from a position of ignorance* if you are an executive, impact that 7/8.

/end (with footnote)

* Yes, executive decisions are •necessarily• made from a position of ignorance. The point of having these high-level roles isn’t (or at least should not be) amassing power, but rather having people looking at things at different zoom levels. Summary is at the heart of good management, along with the humility to know that you are seeing things in summary. If you know all the details, you’re insufficiently zoomed out. If you’re zoomed out, you have to remember how many details you don’t know.

Yes, this from @sethrichards is what I’m talking about:
https://mas.to/@sethrichards/113642032055823958

I had a great moment with a student the other day dealing with some broken code. The proximate problem was a Java NullPointerException. The proximate solution was “that ivar isn’t initialized yet.”

BUT…

Seth Richards (@[email protected])

@[email protected] I agree with all of this, and I'd add: When I'm writing code, I'm *learning* about the problem as well through the process. When I fix a bug in my code, I (hopefully) learn not to make the same mistake again. When I help someone on the team fix a bug in their code, we both learn something. If we write documentation or a unit test to make sure the bug doesn't happen again, the organization "learns" something too. It's unclear to me whether AI is even capable of learning in this way.


…It wasn’t initialized because they weren’t thinking about the lifecycle of that object because they weren’t thinking about when things happen in the UI because they weren’t thinking about the sequence of the user’s interaction with the system because they weren’t thinking about how the software would actually get used or about what they actually •wanted• it to do when it worked.

The technical problem was really a design / lack of clarity problem. This happens •constantly• when writing code.

A good point from @rgarner, looking at this through the lens of Brooks’s Law:
https://mastodon.social/@rgarner/113642040777621582
@inthehands @rgarner imagine that through an act of god (or the courts …) you are suddenly in possession of a large, working code base. Say, Windows. Or Facebook. But you have access to none of the engineers. Imho you might as well ignore the code; there is basically nothing you can do before you go insane. Bug fixing? Minor things, yes. Larger-scale refactor: out of the question. Having AI generate a non-trivial code base is like that. Plus it may not actually work.

@inthehands I appreciate how your line of argument chimes with Fred Brooks' "No Silver Bullet", along the lines of essential complexity of the problem and the solution matching up, except the comparison here being the embedded understanding (hopefully) in code written specifically for the problem, versus the embedded unchecked assumptions in generated code.

Novel software libraries have a similar problem: easy to adopt, not necessarily easy to evaluate.

@inthehands Oh, why hello, Amdahl!
@OmegaPolice
That’s it: Amdahl’s Law, except the optimization actually creates large costs in the other parts of the system!
@inthehands The bet that a lot of these CXOs are making implicitly is that this will be like the transition from assembly to higher-level languages like C (I think most of them are too young and/or too disconnected to make it explicitly). And I'm not 100% sold on it but my 60% hunch is that it's not.
@kevinriggle
Yeah, I’ve heard that thought too. It’s tantalizing nonsense. I could write about this at length, and maybe one day I will, but the very short version is that automation is not even remotely the same thing as abstraction.

@inthehands Yes! Yes. This is it exactly.

One can imagine a version of these systems where all the "source code" is English-language text describing a software system, and the Makefile first runs that through an LLM to generate C or Python or whatever before handing it off to a regular compiler, which would in some sense be more abstraction, but this is like keeping the .o files around and making the programmers debug the assembly with a hex editor.

@inthehands I would love to read your thoughts when you get a chance
@kevinriggle
That’s exactly the line of thought, yes. And the thing that makes abstractions useful, if they are useful, is that they make good decisions about what doesn’t matter, what can be standard, and what requires situation-specific thought. Those decisions simultaneously become productivity boost, safety, and a metaphor that is a tool for thought and communication.
@kevinriggle
What happens when the semantics of your abstractive model are defined by probabilistic plagiarism, and may change every single time you use it? That might be good for something, I guess??? But it doesn’t remotely resemble what a high-level language does for assembly.
@inthehands One could imagine using a fixed set of model weights and not retraining, using a fixed random seed, and keeping the model temperature relatively low. I'm imagining on some level basically the programming-language-generating version of Nvidia's DLSS tech here. But that's not what people are doing, and I'm not convinced that, even if we did that, it would be useful

@kevinriggle
Even if that gave semantically stable answers, which I’m not convinced it would, it still skips that all-important step where there’s communication and reflection and consensus building.

I suppose there’s some help in approaches where an LLM generates plausible answers and then some semantically reliable verification checks that the results aren’t nonsense. But we’re really stretching it.

@kevinriggle @inthehands
Libraries, tool sets, IDEs, modular code, and other forms of re-use have already picked up all the low hanging fruit like this. There's nothing for an "AI" to do here that isn't already more accurately done by an experienced coder.
@kevinriggle
The primary job of a development team is the creation and maintenance of a shared mental model of what the software does and how it does it. Periodically, they change the code to implement changes to the mental model that have been agreed upon, or to correct places where the code does not match the model. An LLM cannot reason and does not have a theory of mind and as such cannot participate in the model process or meaningfully access that model — written documentation is at best a memory aid for the model — and thus cannot actually do anything that matters in the process. The executive class would prefer that other people in the org not be permitted to think, let alone paid for it, and therefore willfully confuses the output with the job.
@inthehands

@dymaxion @kevinriggle

That’s all well said, and gets to what @jenniferplusplus was talking about here: https://jenniferplusplus.com/losing-the-imitation-game/

Losing the imitation game

AI cannot develop software for you, but that's not going to stop people from trying to make it happen anyway. And that is going to turn all of the easy software development problems into hard problems.

Jennifer++
@inthehands Another bull-case argument about LLMs is that they are a form of autonomation (autonomy + automation), in the sense that the Toyota Production System uses it, the classic example being the automated loom which has a tension sensor and will stop if one of the warp yarns break. But we already have many such systems in software, made out of normal non-LLM parts, and also that's ... not really what's going on here, at least the way they're currently being used.
@kevinriggle
Yeah, one of the troubles with these systems is that basically every metaphor we can think of for what they are is misleading.

@kevinriggle @inthehands
I find this idea quite amusing

I like the expression (German) "Kommunikation ist eine Aneinanderreihung von Missverständnis" (communication is a concatenation of misunderstanding)

Languages are so ambiguous, unclear, and context dependent that the idea of "just" writing English is funny.

And if you ever tried to write a specification or a manual you know that there is no "just"

@kevinriggle @inthehands

I have said recently that if LLMs could reliably and economically generate code that correctly implements the desired functionality, without bugs or security flaws (beyond typical of that for humans), then, quite obviously, we should keep, version, and maintain the English description ("prompts") describing what we want, and discard the entirely redundant generated code.

But we don't.

Because they can't.

And there's really no chance that they will any time soon.

@inthehands @kevinriggle
Also, automation (introduction or use of automatic equipment in a process), without defining the process that is being automated, only leaves you with automatic equipment with nothing relevant to do.
- Defined process × automatic equipment = automation (0 × ∞ = 0)
AND
- Automation ÷ automatic equipment = no defined process (∞ ÷ ∞ = ∞)

@inthehands @RuthMalan every dev wants a greenfield project. LLMs shade even greenfield projects brown.

But then it's not the devs that are asking for this* so much as a managerial class looking for the sort of silver bullet that brings down both pay and the amount of time dealing with a type of worker they find difficult.

*not the ones who are any good, anyway

@rgarner @RuthMalan
Yup. All that.

And Brooks’s maxim that there is no silver bullet still stands undefeated.

@inthehands @RuthMalan and that thing about adding people to projects. An LLM isn't a person quite so much as the average of some.
@rgarner @inthehands @RuthMalan Oh, I like this one. Even if it were an actual person, it’s a person who read your code but none of the design docs and hasn’t participated in any of your team’s discussions. They’d have a good chance of coming up with something reasonable, but could also totally bodge it without realizing it.

@jrose @rgarner @inthehands @RuthMalan I’m totally on your side in this fight, but very soon the AI will have been in your team’s discussions. Or at least the other AI, the one that’s always listening to Slack and Zoom, will have written a summary of those discussions that’s in the coding AI’s context. Design docs too.

Fully-remote teams will have an advantage. At least until RTO includes wearing a lapel mic at all times…

@jamiemccarthy @jrose @inthehands @RuthMalan so far, my experience is: they may have seen the discussion, but they don't "remember" it, and they certainly have no idea which are the salient points. Sometimes even when you ram said points down their throats.

In short, I'm fine asking them to show me a depth-first search, but I would trust them with architecture and logical design decisions about as far as I could comfortably spit a rat.

@rgarner @jrose @inthehands @RuthMalan 100% agree with the overall thrust of what you’re saying

@jamiemccarthy @rgarner @jrose @RuthMalan
Yeah, I'm with Russell here: the whole “soon the AI will think” line simply isn’t justified by either theory or evidence. It’s akin to thinking that if you make cars go fast enough, eventually they’ll travel backwards in time.

Re summarization specifically…

@jamiemccarthy @rgarner @jrose @RuthMalan
…there was a recent paper (lost link, sorry) that systematically reviewed LLM-generated summaries. They found in the lab what people have observed anecdotally: LLMs suck at it because they don’t know what the point is. They’re great at reducing word count in a grammatically well-formed way! But they often miss the key finding, highlight the wrong thing, etc.

@rgarner @inthehands @RuthMalan

> LLMs shade even greenfield projects brown.

Yes! Yes. Yes. This. I'm basically in cybersecurity because I kept trying to reuse existing software, and eventually my skill at reading and reviewing code, and understanding the larger systems that it's part of, became more valuable than my skill at writing code.

Early in my career I would regularly get frustrated by colleagues who would rewrite things from scratch when there were okay-ish tools available that could be refit for purpose, but in retrospect I see that they often had at least something of a point.

I've lost touch with all of them now, but I can't imagine these folks being terribly enthusiastic about LLMs, except to the extent that they allow them to reimplement the wheel faster. (And there's my lingering judgment slipping out again, whoops)

@rgarner @inthehands @RuthMalan

Them: Why are you bodging that tool to do something it's not designed for?

Me: Why are you writing an entirely from-scratch software system to receive and send emails?

Them: Yes, but the tool you're using doesn't receive and send *these particular emails*

Me: EVERY SOFTWARE SYSTEM EXISTS TO RECEIVE AND SEND EMAIL AND BY GOD I WILL MAKE THIS PILE OF PERL RECEIVE AND SEND THESE PARTICULAR EMAILS IF IT'S THE LAST THING I DO

@rgarner @inthehands @RuthMalan So how can we developers become less difficult for managers to work with, so they'll be less likely to want to replace us?
@inthehands
Coding is teaching a really, really dumb student how to solve a problem.
Teaching something is the best way to understand it properly.
@inthehands this is an excellent way of framing it, thank you. i will be using this framing in the future when i discuss this with others
@inthehands My 30+ years as a dev disagree with the statement, "coding a solution is a deep way of understanding a problem", on semantics. Analyzing a solution is a deep way of understanding a problem. I've found that too many devs are in a rush to code and give short shrift to the analysis.

@lwriemen This is an overly pedantic quibble, though I agree with the underlying sentiment that people rush into coding too fast without thinking.

Doing the work of filling the things I left between the lines for the reader to infer above: coding something •while thinking• — assessing the results, letting the ideas talk back and surprise, treating design and implementation problems as prompts to think about goals and context — is a deep way of understanding a problem.

@inthehands I don't know if it's overly pedantic, because coding is the least thing in software development, which is why it is the ripest part for smart automation. (I felt the qualifier "smart" was necessary, because there has been way too much dumb automation done already.)