Mastodawn

samizdis Mar 31

Anthropic: Claude Code users hitting usage limits 'way faster than expected'

https://www.theregister.com/2026/03/31/anthropic_claude_code_limits/

Anthropic admits Claude Code users hitting usage limits 'way faster than expected': Unexpected quota drain prompts complaints, breaks automated workflows (The Register)

jdefr89 Mar 31

Overreliance on LLMs is going to become a disaster in a way no one would have thought possible. Not sure exactly what, who, when, or where, just that having your entire product or repo dependent on a single entity is going to lead to some bad times…

xnx

> on a single entity

Contrary to the popular opinion here, there are other services beyond Claude Code. These usage limits might even prompt (har har) people to notice that Gemini is cheaper and often better.

bigbinary Mar 31

On-premises LLMs are also getting better and likely won’t stop; as costs go up with the technical improvements, I would imagine cost-saving methods will also improve.

horsawlarway Mar 31

I still think it's basically unavoidable that most people who might pay for API access will end up on-prem.

Fixed costs, exact model pinning, outage-resistant, enshittification-resistant, better security, better privacy, etc...

There are just so many compelling reasons to be on-prem instead of dependent on a 3rd party hoovering up all your data and prompts and selling you overpriced tokens (which eventually they MUST be, because these companies have to make a profit at some point).

If the only counterbalance is "well the API is cheaper than buying my own hardware"...

That's a short-term problem. Hardware costs are going to drop over time, and capabilities are going to continue improving. It's already pretty insane how good a model I can run on two old RTX 3090s locally.

Is it as good as modern Claude? No. Is it as good as Claude was 18 months ago? Yes.

Give it a decade to see companies really push into the "diminishing returns" of scaling and new models... combined with new hardware built with these workloads in mind... and I think on-prem is the pretty clear winner.

bigbinary Mar 31

These big players don’t have as big of a moat as they like to advertise, but as long as VC wants to subsidize my agents, I’ll keep paying for the $20 plan until they inevitably cut it off

earlyriser Mar 31

Gemini is not better on the quotas: https://discuss.ai.google.dev/t/quota-limit-for-pro-plan/130...

Quota limit for Pro plan?

I think this is one of the worst decisions the Antigravity team has ever made. The AI credits don’t even work lol. I have 1000 AI credits but get the below error

Google AI Developers Forum

ikidd Mar 31

Last time I used Gemini I watched it burn tokens at three times the rate of any other model, arguing with itself, and it rarely produced a result. This was around Christmas or shortly after.

Has that BS stopped?

DefineOutside Mar 31

It's still not uncommon for it to escape its thinking block accidentally and be unable to end its response, or for it to call the same tool repeatedly. I've watched it burn 50 million tokens in a loop before killing the chat.

kakugawa Mar 31

gemini-cli has not been usable for weeks. The API endpoint it uses for subscription users is so heavily rate-limited that the CLI is non-functional. There are many reports of this issue on GitHub. [1]

[1] https://github.com/google-gemini/gemini-cli/issues?q=is%3Ais...