In most cases, LLMs will not replace humans or reduce labor costs as companies hope. They will •increase• labor costs, in the form of tedious clean-up and rebuilding customer trust.

After a brief sugar high in which LLMs rapidly and easily create messes that look like successes, a whole lot of orgs are going to find themselves climbing out of deep holes of their own digging.

Example from @Joshsharp:
https://aus.social/@Joshsharp/112646263257692603

j# (@Joshsharp@aus.social)

Yesterday we had another example of LLMs creating support issues for us.

User: "hi, how do I do this thing? Your docs say I can go here and change some options, but there's no settings there"

Me: "that's right, we don't have such a feature, but also we don't say you can do it in the docs, where did you read that?"

User: "oh I didn't actually read the docs, I asked 'AI' and it hallucinated this answer. Sorry!"

At this rate I'm looking forward to 2025 when I'll be spending 100% of my time doing support to correct falsehoods about our app made up by LLMs


Those who’ve worked in software will immediately recognize the phenomenon of “messes that look like successes.”

One of my old Paulisms is that the real purpose of a whole lot of software processes is to make large-scale failure look like a string of small successes.

The crisp “even an executive can understand it” version of the OP is:

⚠️ AI increases labor costs ⚠️

(“Why?” “Because it’s labor-intensive to clean up its messes.”)

I said “the purpose of a whole lot of software processes is to make large-scale failure look like a string of small successes.”

Huh? What does that look like??

It looks like this:

✅ Meetings held
✅ Plan signed off
✅ Tests passed
✅ Iterations iterated
✅ Velocity increased
✅ Thing implemented
✅ Checkpoints checked
✅ Thing released
✅ Blinkenlights blink
✅ Line goes up
✅ Thing updated
❌ Software never •really• solves the problem it was supposed to solve in the first place, creates more problems

or ❌ Problem the project was supposed to solve in the first place was the wrong problem

or ❌ Nobody actually wanted it

or ❌ We totally failed to understand the real effect of implementing this

or ❌ The goal was designed to benefit some individual / faction within the company, not the mission

or ❌ The goal was designed to benefit the bottom line / investors / some horrid systemic evil, and is net harmful to humanity / the world

(Yes, I consider that last one a failure too.)

My most hilarious example of “large-scale failure looks like a string of successes:”

Years ago, I worked on a project for retailer Megacorp Y to sell their house-branded cables on Megacorp Z’s online sales platform. It was an integration project: wire up inventory, wire up payments. The tech side was sloppy (weird, ancient APIs, Z’s official API involved •FTP• transfers (yes, really)), but ultimately quite tractable.

The problem? Internal conflict between ambitious humans.

One team at Y did inventory, and a different team did payments. Both teams had ambitious (and pretty jerky) managers who wanted to control the •whole• project so they’d get credit for it when it launched. Both managers therefore wanted the other team’s side to fail.

We’d have meetings between the inventory and the payment teams where the engs would say, “We could do this!” “Oooh, and then we could do this!” And the managers would suddenly cut in: nope nope nope, we can’t have this just •work•.

I cut the cord on that contract — the agreement had only ever been for me to architect it and lay the technical foundation — and left figuring it would never, ever see the light of day.

About a year later, somebody came up to me at a conference.

THEM: You’re Paul? You worked on this project at Y??

ME: Yes…

THEM: You won’t believe it: it actually •got released•! Against all odds, it went out the door and into production!

ME: 😮😮

THEM: And it worked! Your code was great! In fact, it was so successful that it made several million in its first few weeks!

ME: 😵

THEM: …and it was so much money that it showed up as a line item on a report to the CEO, and the CEO said, “What's this?,” and when they told him, he said, “We're doing WHAT?!? Why are we selling our house-brand products on their platform??? Shut that down NOW!!!” and they pulled the plug on the whole thing!

ME: 🤣🤣

I guess neither manager got that promotion.

@inthehands *Had me in the first half, not gonna lie*
@rposbo Had •me• in the first half — and I was there!
@inthehands Monoprice cables aren't on Amazon anymore?? 😂
@dalias
I can neither confirm nor deny any specific companies involved in this story, but you have exactly the right idea
@inthehands hahahah! I was once brought in as a technical consultant by a primarily sales org that, wanting a product in a new sector to sell, had gone out and bought the two biggest competitors in the space and told them to merge to make a really good product. And they didn't understand why these two teams that had been built around beating each other and developing contradicting worldviews to differentiate their products couldn't just merge their products...
@kitten_tech
As the Minnesotans say, uff da
@inthehands All the locals: "Mm-hmm; 'Y,' 'Z.' We hear you."
@jima
Your guesses about Y and Z are probably wrong. Unless they aren’t. They might not be. Or not. I can neither confirm nor deny.
@inthehands Notably, AI is itself software and is subject to the same forces that produced this list. I'll leave it as an exercise to the other readers to figure out which of these red Xs applies in that case.

@inthehands

Don't forget the old Nokia management style:

or ❌ The goal was designed to harm some individual / faction within the company, not benefit the mission

@kissane

@maswan @kissane
Ha, coincidence! See the story I just posted downthread
@inthehands Usually all of the above... 🤦

@dalias @inthehands

And you have just very adequately explained why AIs can't write useful code

@inthehands okay I'm bookmarking this whole thread, it's so succinct and relatable. Honestly a nice reminder of the scope at play.

I'm curious, how pervasive would you say this is? Is this like every project in every org? Or half of half? Struggling to see past my own experiences, which are honestly a lot of this on repeat.

@geoffreyconley
I remember many years ago having an animated discussion with coworkers about what percentage of all corporate software projects are hoaxes (where “hoax” means “never really succeeds, even if it’s presented as a success”).

Our conclusion was that we have no idea, and no way of finding out. Too many projects, wildly different places, no way of evaluating the question.

Truly, I have no idea. I just know this: it’s not rare.

@inthehands that seems fair enough! Thanks for your response!!

Sorry to ask, but in terms of personal philosophy, do you aim to find projects and employers that AREN'T this way? Do you end up accepting that it happens, and just do your part?

@geoffreyconley
The main things I look for in choosing a project:

- Cool people
- Worthy goal
- Fun problems to solve

With 1/3 I can survive. 2/3 is a great project.

I suppose I do consider the odds of the project being truly successful — but mostly I take a kind of Camusesque perspective: it’s necessary for orgs to try things, for projects to crash and burn, for people to explore what turn out to be dead ends. In the end, I just want to have been glad to have done the work.

@geoffreyconley
For example, with the downthread story about Megacorp Y, my main regret is •not• about the CEO pulling the plug on everything I did a year after I left. No, that’s just hilarious!

My regret is that I found out later one of the managers had been really emotionally abusive to some of the other devs, especially the one woman on the team, when I wasn’t present. I wish I’d realized that while I was there! I wish I’d fought back! But I was oblivious to it. Big regret.

@inthehands
My favorite is when the project targets a solved problem. It looks like a success because it achieves the goal as defined, but it was still a big waste of resources. This seems incredibly common to me.