"I used AI. It worked. I hated it." by @mttaggart https://taggart-tech.com/reckoning/

This is a really good blogpost. And I'm sure it'll make some people unhappy to read, whether they're pro or anti genAI. What's good about @mttaggart's blogpost is that he talks honestly about how using Claude Code did actually solve the problem he set out to solve. It needed various guardrails, but they were possible to set up, and the project worked. But the post is also completely clear and honest about how miserable it was:

- It removed the joy from the process
- If you aim to do the right thing and carefully evaluate the output, your job eventually becomes "tapping the Y key"
- Ramifications for people learning things
- Plenty of other ethical analysis
- And the nagging wonder of whether to use it next time, despite how miserable it was.

I think this is important, because it *is* true that these tools are getting to the point where they can accomplish a lot of tasks, but the caveat space is very large (cotd)

I used Claude Code to build a tool I needed. It worked great, but I was miserable. I need to reckon with what it means.

What I think is also good about the piece is that it shows how using this tech eventually funnels people down a particular direction. This is captured also by this exchange on lobste.rs: https://lobste.rs/s/7d8dxv/i_used_ai_it_worked_i_hated_it#c_7jirfk

The story that people start with vs where they go is very different:

- They're really just for experts, and are assistants, they don't write the code for you
- Okay, they write a lot of the code for me, but I personally don't commit anything without reviewing it
- YOLO mode

Which eventually leads you to becoming the drinky bird pressing the Y key from that Simpsons episode. (Funnily enough, I wrote that in my comment on lobste.rs in reply to someone else before I had even gotten to the point in the post where @mttaggart literally uses that gif.)

And at that point, you're checked out. All that's left is vibes.

And unfortunately, these systems don't survive that point very well. And neither do you, in your skills and abilities.

There are a lot of other concerns, but since a lot of people on the fediverse are opposed to these tools, I think they might not be very familiar with where they currently are ability-wise. @mttaggart provides a good description of how they *are* capable of solving many problems you put in front of them... and how that doesn't remove the other problems they generate or that are involved in their process.

The slop part isn't just the individual outputs, but the accumulation, and the effect on society itself.

Is that moving the goalposts? It may be. I think "slop" used to be easier to dismiss when it came to code because it was obviously bad. Now when it's bad, it's non-obviously bad, which is a problem of its own. And cognitive debt, deskilling, and so on don't get factored into the quality-of-output aspect.

But unfortunately, the immediate-reward aspects of these things are going to make that hard for society to recognize.

@cwebber

It kind of feels like it's going to take something big happening in the press to get people to stop.

I was thinking an AI-caused Therac-25, but maybe a Copilot worm that wipes all Windows 11 computers might prompt some legislation outlawing AI code.

@alienghic @cwebber

The thing most likely to get people to stop is the end of the massive subsidies for its use that the VCs are currently pouring in.

Already firms are starting to panic a little about token use for things like Claude Code, and are putting limiters on their workers that really defeat the purpose of all of the "YOU MUST USE THIS OR BE FIRED" diktats. But Anthropic can't keep operating at those prices for long without going bankrupt.

So at some point the private equity love affair with everything AI will dry up (possibly because of an Iran-war-induced financial crisis), and at that point it's going to be "my org can spend $50k annually on my personal Claude tokens to make me 20% more productive . . . or it could just hire a junior dev?"

There's a chance they manage to optimize this, or get it to work using a lighter-weight model. But I think it's unlikely.

@MichaelTBacon @alienghic @cwebber I don't see it stopping in the near term. PE hasn't done a lot of real AI deals; I think that's blocked on a lack of proven playbooks. VCs are making bets, but the actual end-user value is pretty unclear. Having studied this area and its trajectory quite a bit over the last year, I think the unit economics of API serving are already approximately sustainable, and the models and hardware designs continue to get cheaper for a given level of performance. ...
@MichaelTBacon @alienghic @cwebber ... Right now the big firms are loading up on cash and I think working hard to cut the cost of their subscription products (e.g. ChatGPT or Claude) to the point where they'll be able to run unsubsidized in the near future at something not far from the current output quality and pricing. Their priority appears to be to sell more seats at low cost (Claude Enterprise starts at $20 a seat) and hope that they can get entrenched before starting to ramp prices up.
@MichaelTBacon @alienghic @cwebber I haven't seen any hard data, but spending enough time in tech industry circles it seems to be working.

@mirth @alienghic @cwebber

So far, from what I've seen, any time one of the subscription AI places has raised its prices to something resembling actual operating costs (never mind paying back gigantic sunk capital costs), users have screamed and then bolted.

Honestly, doing the really heavy duty Claude Code stuff that's getting pushed now will easily run to $50k per developer at current costs. And no, I don't see that as something that enterprises will ultimately be willing to swallow. Nor do I see a path for them to get the GPU cycle burn down easily.

@MichaelTBacon @alienghic @cwebber That math sounds way off. Assuming a monthly usage of 5M tokens for day to day developer usage, at the current Claude API costs, and billing them all at the highest rate ($25 per M), that's $125 per month at current pricing. It's a long way from there to $50k, and surveying the trajectory over the last couple years as well as models from some of the Chinese labs it's pretty clear that model size necessary to do these tasks is trending down.
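The token arithmetic in these posts is easy to sanity-check. A minimal sketch, assuming the $25-per-million-token rate and the usage figures quoted in the thread (not current price sheets):

```python
# Back-of-envelope check of the thread's token math.
# Assumed rate: $25 per million tokens, the highest Claude API rate cited above.
PRICE_PER_M = 25.0  # USD per million tokens (assumption from the thread)

def monthly_cost(tokens_per_month: int) -> float:
    """API cost in USD for one month of usage at the assumed rate."""
    return tokens_per_month / 1_000_000 * PRICE_PER_M

def tokens_for_budget(usd_per_month: float) -> float:
    """How many tokens per month a given monthly budget buys."""
    return usd_per_month / PRICE_PER_M * 1_000_000

print(monthly_cost(5_000_000))          # 5M tokens/month -> 125.0 USD
print(tokens_for_budget(50_000) / 1e9)  # $50k/month -> 2.0 (billion tokens)
```

So the two positions differ by roughly a factor of 400 in assumed usage: $50k a month at this rate implies about 2 billion tokens per developer per month, versus the 5M-token figure assumed here.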
@MichaelTBacon @alienghic @cwebber The other thing happening is there are many efforts to build special-purpose chips for these workloads, and some will eventually pan out. Big neural nets on GPUs are extremely wasteful in energy terms, and even though many people seem to think that approach is horribly wrong it's become "too big to fail" in a way that will encourage investment into new chips until something sticks.
@MichaelTBacon @alienghic @cwebber Combine a downward trend in average model complexity (by usage) and downward trend in energy consumption (on new hardware) on top of a typical usage that currently costs perhaps $100 - $1000 at the high end... I can easily see a world of $500/month/seat subscriptions without any structural changes. I'm not saying it's good or that I like it, but based on the best information I can find I don't think the "price explosion" scenario is plausible.

@mirth @alienghic @cwebber

What downward trend in average model complexity? What downward trend in energy consumption? They're both going up! Nobody can get the cost of inference to go down outside of discount models like DeepSeek, which are okay for spouting text but nowhere near the code quality of something like Claude Code (and even with CC, as the OP link says, quality only holds up in certain languages and certain situations, with lots and lots of guardrails).

Ed Zitron isn't everyone's cup of tea, but he's been watching the finances of this for a while and there's absolutely no sign of the burn rate slowing down or the cost of inference dropping.

https://www.wheresyoured.at/the-subprime-ai-crisis-is-here/

The Subprime AI Crisis Is Here


@mirth @alienghic @cwebber

Anthropic gets some credit for getting Claude Code to actual usability and decent code, if you spend enough time scolding and cajoling the model and manually forcing it through various code quality checks. But they're not doing it on cheap models; they're doing it on the biggest, most expensive models, which require the biggest and most expensive GPUs. You can't get those results out of DeepSeek or Ollama or any of the smaller, cheaper models. The code quality goes right back into the toilet, no matter what guardrails you put on it.

Given the horrific mess that is the Claude Code source code (see this megathread for a walk through the chaos fractal that is Claude Code https://neuromatch.social/@jonny/116324676116121930) it's possible that they could tighten the hell out of it and clean up some of the immense noise in it to get some efficiency. But then what does that say about Claude Code's code quality?

@mirth @alienghic @cwebber

As for the custom chips, I'm not sure how much more customized you can make a chip for ML models than what NVIDIA is cranking out, but at the very least here's what's going on with Microsoft's attempts to get Azure to work on smaller hardware. This is a really sobering read from a former MS system engineer.

Certainly, the potential of ARM chips to really change cloud computing, if someone can get the ultra-efficient ones to scale up, shouldn't be overlooked. And someone else who isn't Microsoft will probably figure it out (although AWS in particular is also staggering under its immense technical debt right now).

But there is just one titanic mess after another under the hoods of the major tech firms burning hundreds of billions of VC dollars right now.

https://isolveproblems.substack.com/p/how-microsoft-vaporized-a-trillion

How Microsoft Vaporized a Trillion Dollars

Inside the complacency and decisions that eroded trust in Azure—from a former Azure Core engineer.

@MichaelTBacon @alienghic @cwebber I read the Ed Zitron piece with some interest but it's weakly sourced and light on actual analysis (though it has a lot of links). I think he is right that AI companies are spending eye watering amounts of cash but misunderstands why or what will likely happen when the game of musical chairs stops. Massive layoffs, other financial carnage, yes, but the insiders will still be rich and maintain control of the post-restructuring profits. Corrupt.
@MichaelTBacon @alienghic @cwebber Regarding models, size for a given output quality has been falling fast for the last couple years. Well documented. IMO a major threat to OpenAI et al is if on-device models pass the "good enough" line for casual users and destroy the unit economics of their subscription businesses. Apple and Google have privileged access to user data via their OSes, but not yet good enough models. They're motivated to try.
@MichaelTBacon @alienghic @cwebber The subscription products are moving to multi-model hybrids under the heading of "model routing" and "sub agents" and related schemes. I think primarily motivated by cost although I don't know if they admit it in public. This already exists in the current products in small ways but they'll likely push it a lot farther.
@MichaelTBacon @alienghic @cwebber Re: chips... That post about Azure is quite interesting but not related to AI accelerators. The core workload for these neural nets is almost totally unrelated to what a general-purpose CPU does, and for large models the majority of energy consumption is DRAM and interconnect (i.e. not the arithmetic). So the answer to how much more customized they can get is "a lot" if you frame the problem as how to avoid interconnect and DRAM usage. Wafer scale etc.
@MichaelTBacon @alienghic @cwebber Between improved models, app designs, and hardware, it seems costs would come down. Do you have a source for operating costs being 10x greater than the current API pricing, or typical developer usage being 2B tokens a month? I have access to some developer usage data and I'd guess devs using every day average under 10M tokens/dev/month. I'm sure there are some wild outliers, but that is a typical problem many subscription businesses manage.
@MichaelTBacon @alienghic @cwebber I remain hopeful that as the science improves there will be a less exploitative and resource-intensive way to construct and use the technology. We will find out.

To be clear, the 10x number isn't something that's based on a specific set of numbers, but a gesture at the direction things are going. Anthropic reported that its $200/month Max users were costing it $50k/month in compute. (https://www.revenuememo.com/p/how-does-anthropic-make-money) Those are the Claude Code users that are reporting amazing coding results, but it's at the expense of using the newest, biggest models burning gigantic compute.
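The scale of that reported subsidy is worth spelling out. A quick sketch using only the two figures quoted above ($200/month subscription price vs. ~$50k/month in compute):

```python
# Implied subsidy ratio for heavy Claude Code users, per the
# Revenue Memo figures quoted above (not independently verified).
subscription = 200.0      # USD/month: Max plan price
compute_cost = 50_000.0   # USD/month: reported compute cost for heavy users

ratio = compute_cost / subscription
print(ratio)  # 250.0 -> heavy users reportedly cost ~250x what they pay
```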

But those models *still* need lots of guard rails. How will they fix those? Very likely the best solution they'll find is a *bigger* model and even *more* agents.

And we haven't seen any Moore's Law effects come in with GPUs. GPU time still costs basically what it did 3 years ago. (https://semianalysis.com/gpu-pricing-index/)


Of course Anthropic can't eat those kinds of compute losses forever, or even for very long. So they're trying to bring user pricing at least closer to operating costs and it's royally pissing off the vibe coding stans (https://www.nbcnews.com/tech/tech-news/claude-code-ai-mythos-leak-rcna266083).

To get revenues to keep up with its usage growth, the big AI guys need something like 10-40x the data center capacity they have now. But note that current data center capacity is *already* generating massive blowback for driving up energy prices, creating noisy heat islands, creating water shortages, and of course heating up the climate, and that was before an Iran-induced oil crisis.

There will be something left after all the dust settles, but currently people are advocating adopting tech based on gigantic subsidies with no indication that it will become affordable without subsidies. That's really worrisome.
