Bad news for the "AI bubble doomers": I've found LLMs to be incredibly useful. They reduce the workload (and/or make people much, MUCH more effective at their jobs with the "centaur" model).

Is it overhyped? FUCK yes. Salespeople Gotta Always Be Closing. But this is NOTHING like the moronic Segway (I am still bitter about that crap), or cryptocurrency, which is grifters, gamblers, and criminals end-to-end, or the first dot-com bubble, when not NEARLY enough people had broadband or even internet access, and the logistics systems to support shipping products were nowhere REMOTELY near where they are today.

If you are expecting this "AI bubble" to pop anytime soon, uh.. you might be waiting a bit longer than you think? Overhyped, yes; overbuilt, sure; but not remotely a true bubble in any of the senses of the three examples I listed above 👆. There's something very real, very practical, very useful here, and it is getting better every day.

If you find this uncomfortable, I'm sorry, but I know what I know, and I can cite several dozen very specific examples in the last 2-3 weeks where it saved me, or my team, quite a bit of time.

@codinghorror “I can cite several dozen very specific examples in the last 2-3 weeks where it saved me, or my team, quite a bit of time.”

Please do, if you can. Because most times I’ve tried to use LLMs for work, the error rate ends up costing me MORE time than I would have spent without them, and most AI boosters are short on specifics. We just had a presentation at my job on how we all need to be using AI, with no case studies of how it’s actually been useful so far.

@sethrichards here's one: a friend confided he is unhoused, and it is difficult for him. I asked ChatGPT to summarize local resources to deal with this (how do you get ANY ID without a valid address, etc. -- the chicken/egg problem) and it did an outstanding, amazing job. I printed it out, marked it up, and gave it to him.

Here's two: GiveDirectly did two GMI studies, one in Chicago and one in Cook County, and we were very unclear what the relationship between them was, or why they did it that way. ChatGPT also knocked this out of the park and saved Tia a lot of time finding that information, so she was freed up to focus on other work.

I could go on and on and on. Email me if you want ~12 more specific examples. With citations.

But also realize this: I am elite at asking very good, well specified, very clear, well researched questions, because we built Stack Overflow.

You want to get good at LLMs? Learn how to ask better questions of evil genies. I was raised on that. 🧞

@codinghorror @sethrichards "learn how to ask better questions of evil genies." Journalism and interviewing gets you halfway there.
@evanwolf @sethrichards or community management, where the best reference books are hostage negotiation books. Also, that's not a joke. They are the best reference books for community management, hands down.
@codinghorror @evanwolf @sethrichards wait, what? I'm intrigued! Do you have any specific recommendations for hostage negotiation books that were useful in your community management experience?
Hostage Negotiation - The top FBI negotiator teaches you to persuade:

Chris Voss led international hostage negotiation for the FBI. In this interview he teaches you the secrets to getting what you want.

Barking Up The Wrong Tree
@codinghorror Does the benefit you found here go beyond e.g. Covey's "Seek First to Understand" principle, or reflective listening and the like? I'm intrigued, but also very disappointed by most non-fiction books that take one weird trick you really have to know and spread it across 200 pages.
@ctietze hostage negotiation is extremely hardcore and very effective. I'll drill in later in a blog post.
@codinghorror looking forward to that -- hope I'll catch it in my RSS reader
@codinghorror @sethrichards good point, this week I learned about prompt engineering to optimize results.
@codinghorror @sethrichards I see, everyone else is using it wrong... Tools this bad/difficult should not be presented the way they are. People are not going to invest days in prompt engineering before asking a generic LLM for medical advice.
@falken @codinghorror @sethrichards the tool over-hype masks the huge underlying utility. If you think of it as a tool that you need to learn to use and not your “AI co-worker” or other hype-fueled nonsense, it is quite productive. Example: I am not an iOS engineer but I’m a principal engineer. With me directing Claude Code, reviewing its work, setting code standards, and asking for refactors, I produced thousands of lines of well-factored, tested iOS code that passed review and shipped.
@relistan @codinghorror @sethrichards I'm fairly sure I could learn Xcode that well without boiling an ocean tho
@falken @codinghorror @sethrichards learning Xcode? I’m not even using it; that’s the tip of the problem. Go from zero to shipping 5k lines of production code + tests in a week, on an existing project that you have very little context on. But the environmental impact right now is ridiculous. Many of the companies doing stupid stuff will go out of business. Hardware will get better fast. I am convinced about the utility. But we’ve got to rapidly remove the downsides or it won’t work long term.
@falken @relistan @sethrichards boiling the ocean? I've told you a million times to stop exaggerating!
@codinghorror @sethrichards Jeff - Can you package your knowledge on how to "ask better questions"?
@codinghorror @sethrichards "A friend is unhoused, and although I can afford to bet a million dollars, I didn't house him, I gave him some AI slop". What a saint you are!
@denisbloodnok @sethrichards well, we allocated $69m to this, and I'd have to get divorced to do more, because half is what we agreed on (though that is, in fact, more than half). If you'd like me to get divorced, let me know. Refer to: https://blog.codinghorror.com/stay-gold-america/ and https://blog.codinghorror.com/the-road-not-taken-is-guaranteed-minimum-income/
@codinghorror @sethrichards Except we know perfectly well you can afford to bet a million dollars. You've got the money, you'd just rather serve slop.

@denisbloodnok also this kind of investment at this kind of scale is largely conscience-laundering. not saying it's not useful, it's just inefficient. it's probably the best this type of guy can manage, given how abstract these concerns are to them.

@codinghorror
@sethrichards

@thegarbagebird @denisbloodnok @sethrichards I don't think so; feel free to look at the GMI study data.
@denisbloodnok @codinghorror @sethrichards like when you post a question on fedi and someone googles it for you but IRL

@codinghorror @sethrichards

Evil genies with a severe form of ADD of some sort.

You hit it on the head - the prompt is the key.

With an experienced human - vagueness is often acceptable, and they will usually ask for clarification. The AI doesn't ask - it guesses, often incorrectly. So you need to over-specify in the prompt, including things it might be insulting to mention when talking to an experienced human. Then iterate, and aggressively steer that conversation.

This is why I don't see the AI as replacing a human except for trivial situations. It's a force multiplier, but not a replacement, and the skills necessary to use them effectively are non-obvious.

@tbortels @codinghorror @sethrichards they're also incredibly susceptible to being misled by the prompt itself...

something like "tell me how to use X to accomplish Y" when you _think_ X is relevant is much more likely to lead to something made up than "tell me how to accomplish Y"

@vt52 @codinghorror @sethrichards

To be 100% fair - yeah. GIGO, garbage in garbage out.

But - that's not a problem exclusive to AIs, or even computers. That's the XY problem, and it's a human thing.

https://en.wikipedia.org/wiki/XY_problem


@tbortels @codinghorror @sethrichards oh for sure, but it's the kind of situation where a human person is likely to recognize the underlying fault and correct, where LLMs will try to yes-and with a plausible-sounding but completely fabricated answer

to your point, there's got to be a knowledgeable (or at least vaguely savvy) human involved

@vt52 @sethrichards @codinghorror

You must be hanging out with some high quality humans. I can't tell you how many times I've had this conversation, in professional settings:

Me (noticing an issue): hey, how's it going?
Them: not so well - this isn't working.
Me: I'm not surprised. It doesn't work that way. How long has this been broken?
Them: two weeks.
Me: wow. Why didn't you ask for help?
Them: I did, but they told me this was the procedure.
Me: ಠ_ಠ

"Recognize a problem and change your course" is a learned skill, even for humans.

To be fair - my job is usually "figure out what is broken and fix it" by all sorts of names.

@tbortels @vt52 @sethrichards yes, agree.. also I have NEVER advocated for anything other than the centaur model -- a human with experience in that domain, working with LLMs as a research assistant. https://blog.codinghorror.com/changing-your-organization-for-peons/

@codinghorror @sethrichards

So your argument is simultaneously:

> LLMs are useful RIGHT FUCKING NOW for SO MANY scenarios

But also, they're only useful because:

> I am elite at asking very good, well specified, very clear, well researched questions, because we built Stack Overflow.

Is it then fair to say that LLMs are likely to be very misleading for people who do not have your "elite" experience?

If not, why not?

@nikclayton @sethrichards people also suck at writing subject lines for emails. So yeah, we need to teach people how to write.
@codinghorror That's helpful, thanks. I guess the follow-up question is, how do you know you can trust the output? That's the real crux of the issue for me. My experience has been that these systems will confidently generate a mix of right and wrong info, while stripping all the context clues I've traditionally used to gauge validity. If I have to resort to traditional web searches etc. to check their work then I might as well skip the LLM step.
@codinghorror My suspicion has been that many of the people reporting huge time savings are mostly piping LLM output directly into their work products without checking it, which sounds like a wonderful way to generate subtle technical debt. But maybe there's something I'm missing.

@codinghorror @sethrichards@mas. Those examples do not make it clear to skeptical drive-by readers like me how you established the extent to which the output you received was actually correct

Is part of the magic value-add to embrace the idea that for many activities, being "actually correct" isn't the most important criterion? Compared to, e.g., just having a direction to get started in.

If someone could reference or break down examples that did unpack actual correctness, that would be persuasive.

@codinghorror The problem is not "LLMs are useless and when the bubble bursts they go away," they aren't going away any more than websites went away when the .com bubble burst.

The problems are
1. They are a 6/10 tool being advertised as an 11/10 tool with the folks selling this stuff consistently overstating what they're capable of doing.
2. The few hundred billion spent building them needs the 11/10 promises to come true in order to be justified.
3. They're really good at making up answers that appear *plausible* but are also completely wrong, and verifying the answers is becoming increasingly difficult as the top search results are increasingly flooded with output from the same LLMs.
4. 'AI' is being used to try to sell a bunch of completely unrelated stuff like 'copilot+ pcs' even though everything meaningful in the LLM space only runs in datacenters due to GPU memory limitations.

@malwareminigun @codinghorror I would add one more: at least 50% of use cases are already solved in a more deterministic/efficient automated way.
@mapache Making a claim like that requires data I do not have, and that claim does not match my personal experience. In particular, because the first few things I tried to do with LLMs were tasks like that (e.g. "for every line that says =x64-windows in this file, duplicate that line as =x64-windows-static") and they were *awful* at it. But cases like "I am not a macOS sysadmin expert, help me figure out why our build machines have half their disks filled with garbage after a few weeks" leading to "the problem is Spotlight being awful" have worked out great. @codinghorror
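(Aside: the line-duplication task described above is exactly the kind of thing a few lines of deterministic code handle reliably on every run, with no LLM involved. A minimal Python sketch, where the function name is hypothetical and the `=x64-windows` triplet strings are taken from the example in the thread:)

```python
def duplicate_triplet_lines(lines):
    """For every line containing "=x64-windows", emit that line followed by
    a copy with "=x64-windows" rewritten to "=x64-windows-static".
    All other lines pass through unchanged."""
    out = []
    for line in lines:
        out.append(line)
        # Skip lines already marked -static so we don't duplicate them again.
        if "=x64-windows" in line and "=x64-windows-static" not in line:
            out.append(line.replace("=x64-windows", "=x64-windows-static"))
    return out
```

(A one-line `sed` script would do the same job; the point is only that this class of task needs no model at all.)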
@malwareminigun @mapache document it in detail and blog it so everyone can benefit from what you've learned
@codinghorror @malwareminigun @mapache So your content can be stolen by the same slop machine billionaires?

@malwareminigun @codinghorror

LLMs won't go away but a lot of the companies selling them will. This will be quite disruptive.

Prices have to rationalize; I am not quite sure how that will work if $200/mth is not profitable. I don't see any path to the models becoming 100 times cheaper to run. This implies that some category of folks will have to accept much worse models for a reasonable price point.

Alternatively, all the start-ups disappear into existing big tech companies and they dramatically reduce spending and folks get model updates at a much much slower frequency.

Either way, this will be a very different landscape than we have today.

@shafik Sure, but that's exactly the .com bubble playbook.

@malwareminigun

Yeah, people keep saying that and I don't agree.

The sheer concentration of money and small number of core ideas involved make this very weird. Out of the destruction of the dot com bubble came a lot of fields and verticals. I don't think this will happen here.

If we end up w/ the same big seven that we started out w/ before the bubble and all of them basically converge on LLMs w/ basically the same capabilities that will be a very sad result for all the money that went into it. Very pale shadow indeed of what the dot com bubble wrought.

@shafik The 'winners' of the .com bubble were the same telcos and hardware vendors that were incumbent before the bubble. I don't see why the LLM bubble would be any different.

@malwareminigun

I don't believe we will see equivalent creations such as Amazon, eBay, Nvidia, or Google come out of this. I believe all of today's start-ups will end up absorbed.

The telcos all ended up cannibalizing each other; overcapacity ended up being bad for business, although it worked out really nicely for consumers.

Maybe we could see a cannibalization of hosted services. I am not sure we will really see all the capacity they are planning actually built.

@shafik @malwareminigun "This implies that some category of folks will have to accept much worse models for a reasonable price point." You are conveniently ignoring the fact that it IS possible to come up with models that are far more efficient in compute time and almost as good -- or perhaps even better!

@codinghorror @sethrichards when I was in uni we learned about specifying pre- and postconditions on our functions as contracts, and a way to derive the algorithm from the post back to the pre. Not only was the derivation already hard, sometimes the pre and post were as hard to define as solving the problem in the first place.

1/

@codinghorror @sethrichards

And now you're saying that you not only have to double check every answer, you also have to be very good at asking the question to begin with? And these tools get the answers wrong if you don't? And they're released to the unsuspecting, untrained, I-have-no-free-spoons general public?

@mdione @sethrichards not exactly. They don't get it wrong; most people's questions are fairly rudimentary. Just ask anyone who works the help desk at {any company I can think of}.
@codinghorror @sethrichards and still, you don't know if your question is rudimentary or not, so you can't trust the answer until you double check.

@codinghorror

they don't get it wrong?

so all the stories about "hallucinations" are fake news?

@finitebaffle @ret @sethrichards see also, Black Mirror episode "The Entire History of You" .. this is coming, and it's not gonna be long, either (well, if smart glasses take off and stop looking ridiculous, so..) https://www.youtube.com/watch?v=3bFCqK81s7Y

@codinghorror @sethrichards

Wondering how this impacts the massive amount of water and electricity required to operate these? Water rights and electricity that the public needs to survive.

@codinghorror you outsourced an interaction. you should probably just stop being friends with that person if you are automating your relationship with them. optimise properly.

@sethrichards

@thegarbagebird @sethrichards I apologize for not explaining in more detail. My partner regularly volunteers at the Alameda Food Bank. I asked her first since she knows a lot more about this, because almost any food bank needs to verify you are a "resident" of the area, which is challenging when you are unhoused. We went over it together manually for a while, then refined with a very detailed prompt, and then checked the results against what we knew. I also wrote this person a $19k check earlier in the year, which is the maximum allowable tax-free donation to an individual per year. I hope that helps. Any followup questions, feel free to email me, or set up a call via any method you like, and I can add additional detail as well. I love this person very, very much.
@codinghorror @thegarbagebird @sethrichards one of the more useful applications of LLMs is cutting red tape. If you know that a report is basically filed unread because it is just a requisite to deter people from claiming what they are entitled to, using a machine to produce a BS filing is not only legit but should be encouraged.
Obviously, the correct way to go about that would be to drop the requirement, but that’s not going to happen.

@klongeiger
it doesn't cut it, though. it outsources it.

i personally wouldn't use it for that, because if my document got picked in a random audit and the llm had generated non-useful output, then i could run the risk of being done for fraud.

objectively, i don't believe in luck. subjectively, i am that unlucky.

@codinghorror @sethrichards

@thegarbagebird @klongeiger @sethrichards it can definitely depend on which field you work in, for sure. I think in some fields it is a far better fit than others, but I'd also say it's a good fit for most fields that don't require medical grade precision.