"I used AI. It worked. I hated it." by @mttaggart https://taggart-tech.com/reckoning/

This is a really good blogpost. And I'm sure it'll make some people unhappy to read, whether they're pro or anti genAI. What's good about @mttaggart's blogpost is that he talks honestly about how using Claude Code did actually solve the problem he set out to solve. It needed various guardrails, but they were possible to set up, and the project worked. But the post is also completely clear and honest about how miserable it was:

- It removed the joy from the process
- If you aim to do the right thing and carefully evaluate the output, your job eventually becomes "tapping the Y key"
- Ramifications on people learning things
- Plenty of other ethical analysis
- And the nagging wonder whether to use it next time, despite it being miserable.

I think this is important, because it *is* true that these tools are getting to the point where they can accomplish a lot of tasks, but the caveat space is very large (cotd)

What I think is also good about the piece is that it shows how using this tech eventually funnels people down a particular direction. This is also captured by this exchange on lobste.rs: https://lobste.rs/s/7d8dxv/i_used_ai_it_worked_i_hated_it#c_7jirfk

The story people start with vs. where they end up is very different:

- They're really just for experts; they're assistants, they don't write the code for you
- Okay, they write a lot of the code for me, but I personally don't commit anything without reviewing it
- YOLO mode

Which eventually leads to you becoming the drinky bird pressing the Y key from that Simpsons episode. (Funnily enough, I wrote that in my lobste.rs comment in reply to someone else before I had even gotten to the point in the post where @mttaggart literally used that gif.)

And at that point, you're checked out. All that's left is vibes.

And unfortunately, these systems don't survive that point very well. And neither do you, in your skills and abilities.

There are a lot of other concerns, but since a lot of people on the fediverse are opposed to these tools, they might not be very familiar with where these tools currently are, ability-wise. @mttaggart provides a good description: they *are* capable of solving many problems you put in front of them... and that doesn't remove the other problems they generate or that are involved in their process.

The slop part isn't just the individual outputs, but the accumulation, and the effect on society itself.

Is that moving the goalposts? It may be. I think "slop" used to be easier to dismiss when it came to code because it was obviously bad. Now when it's bad, it's non-obviously bad, which is part of its own problem. And cognitive debt, deskilling, etc. don't get factored into the quality-of-output aspect.

But unfortunately, the immediate reward aspects of these things are going to make it hard for society to recognize.

@cwebber

It kind of feels like it's going to take something big happening in the press to get people to stop.

I was thinking an AI-caused Therac-25, but maybe a Copilot worm that wipes all Windows 11 computers might prompt some legislation outlawing AI code.

@alienghic @cwebber

The thing most likely to get people to stop is the end of the massive subsidies for its use that the VCs are currently pouring in.

Already firms are starting to panic a little about token use for things like Claude Code, and are putting limiters on their workers that really defeat the purpose of all the "YOU MUST USE THIS OR BE FIRED" diktats. But operating indefinitely at those prices will bankrupt Anthropic soon.

So at some point the private equity love affair with everything AI will dry up (possibly because of an Iran-war-induced financial crisis), and at that point it's going to be "my org can spend $50k annually on my personal Claude tokens to make me 20% more productive... or it could just hire a junior dev?"

There's a chance they manage to optimize this, or get it to work using a lighter-weight model. But I think it's unlikely.

@MichaelTBacon @alienghic @cwebber I don't see it stopping in the near term. PE hasn't done a lot of real AI deals; I think that's blocked on a lack of proven playbooks. VCs are making bets but the actual end-user value is pretty unclear. Having studied this area and its trajectory quite a bit over the last year, I think the unit economics of API serving are already approximately sustainable, and the models and hardware designs continue to get cheaper for a given level of performance. ...
@MichaelTBacon @alienghic @cwebber ... Right now the big firms are loading up on cash and I think working hard to cut the cost of their subscription products (e.g. ChatGPT or Claude) to the point where they'll be able to run unsubsidized in the near future at something not far from the current output quality and pricing. Their priority appears to be to sell more seats at low cost (Claude Enterprise starts at $20 a seat) and hope that they can get entrenched before starting to ramp prices up.
@MichaelTBacon @alienghic @cwebber I haven't seen any hard data, but spending enough time in tech industry circles it seems to be working.

@mirth @alienghic @cwebber

So far, from what I've seen, any time one of the subscription AI places has put up their prices to something resembling actual operating costs (never mind paying back gigantic sunk capital costs), users have screamed and then bolted.

Honestly, doing the really heavy duty Claude Code stuff that's getting pushed now will easily run to $50k per developer at current costs. And no, I don't see that as something that enterprises will ultimately be willing to swallow. Nor do I see a path for them to get the GPU cycle burn down easily.

@MichaelTBacon @alienghic @cwebber That math sounds way off. Assuming a monthly usage of 5M tokens for day to day developer usage, at the current Claude API costs, and billing them all at the highest rate ($25 per M), that's $125 per month at current pricing. It's a long way from there to $50k, and surveying the trajectory over the last couple years as well as models from some of the Chinese labs it's pretty clear that model size necessary to do these tasks is trending down.
@MichaelTBacon @alienghic @cwebber The other thing happening is there are many efforts to build special-purpose chips for these workloads, and some will eventually pan out. Big neural nets on GPUs are extremely wasteful in energy terms, and even though many people seem to think that approach is horribly wrong it's become "too big to fail" in a way that will encourage investment into new chips until something sticks.
@MichaelTBacon @alienghic @cwebber Combine a downward trend in average model complexity (by usage) and downward trend in energy consumption (on new hardware) on top of a typical usage that currently costs perhaps $100 - $1000 at the high end... I can easily see a world of $500/month/seat subscriptions without any structural changes. I'm not saying it's good or that I like it, but based on the best information I can find I don't think the "price explosion" scenario is plausible.
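The back-of-envelope math in that reply can be sketched out directly. This is just the thread's own assumed figures (5M tokens per month of heavy developer use, everything billed at the highest $25-per-million rate), not actual Anthropic pricing:

```python
# Back-of-envelope token cost model using the thread's assumed figures.
# These numbers are illustrative, not official pricing.
TOKENS_PER_MONTH = 5_000_000   # assumed heavy day-to-day developer usage
PRICE_PER_MILLION = 25.0       # bill everything at the highest quoted rate, in USD

monthly_cost = TOKENS_PER_MONTH / 1_000_000 * PRICE_PER_MILLION
annual_cost = monthly_cost * 12

print(f"monthly: ${monthly_cost:,.2f}")  # monthly: $125.00
print(f"annual:  ${annual_cost:,.2f}")   # annual:  $1,500.00
```

Under these assumptions you'd need roughly a 33x increase in either usage or unit price before a single seat approaches the $50k/year figure, which is the gap the reply is pointing at.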

@mirth @alienghic @cwebber

What downward trend in average model complexity? What downward trend in energy consumption? They're both going up! Nobody can get the cost of inference to go down outside of going with discount models like Deepseek, which are okay for spouting text but can't get anywhere near the code quality of something like Claude Code (and even with CC, as the OP link says, quality only holds up in certain languages and in certain situations, with lots and lots of guard rails).

Ed Zitron isn't everyone's cup of tea, but he's been watching the finances of this for a while and there's absolutely no sign of the burn rate slowing down or the cost of inference dropping.

https://www.wheresyoured.at/the-subprime-ai-crisis-is-here/

The Subprime AI Crisis Is Here

Ed Zitron's Where's Your Ed At

@mirth @alienghic @cwebber

Anthropic gets some credit for getting Claude Code to actual usability and decent code, if you spend enough time scolding and cajoling the model and manually forcing it through various code quality assurances. But they're not doing it on cheap models; they're doing it on the biggest, most expensive models, which require the biggest and most expensive GPUs. You can't get those results out of Deepseek or Ollama or any of the smaller, cheaper models. The code quality goes right back into the toilet, no matter what guard rails you put on it.

Given the horrific mess that is the Claude Code source code (see this megathread for a walk through the chaos fractal that is Claude Code https://neuromatch.social/@jonny/116324676116121930) it's possible that they could tighten the hell out of it and clean up some of the immense noise in it to get some efficiency. But then what does that say about Claude Code's code quality?

@mirth @alienghic @cwebber

As for the custom chips, I'm not sure how much more customized you can make a chip for ML models than what NVIDIA is cranking out, but at the very least here's what's going on with Microsoft's attempts to get Azure to work on smaller hardware. This is a really sobering read from a former MS system engineer.

Certainly, the capability of ARM chips to really change cloud computing if someone can get the ultra-efficient ones to scale up shouldn't be overlooked. And someone else who isn't Microsoft will probably figure it out (although AWS in particular is also staggering under its immense technical debt right now).

But there is just one titanic mess after another under the hoods of the major tech firms burning hundreds of billions of VC dollars right now.

https://isolveproblems.substack.com/p/how-microsoft-vaporized-a-trillion

How Microsoft Vaporized a Trillion Dollars

Inside the complacency and decisions that eroded trust in Azure—from a former Azure Core engineer.

Axel’s Substack

@mirth @alienghic @cwebber

The point is that current pricing is paying for about 10% of the actual operating costs of running the services, and many customers are already finding the token fees onerous or the monthly limits too low to use the tools to their potential on a daily basis. There is no AI product outside of NVIDIA right now where revenues are more than about 30% of operating costs, and most are well below 15%.

At some point, that VC/PE subsidy is going to dry up, and either subscription costs will have to go up or they will have to find a way to get the same level of quality out of smaller models, cheaper hardware, or some other cut. And despite that being the very strong goal of most of the big AI model holders, and hundreds of billions in R&D costs, nobody has managed it yet.