Mastodawn

Claude Code is unusable for complex engineering tasks with the Feb updates

https://github.com/anthropics/claude-code/issues/42796

summarity 2d ago

Not Claude Code specific, but I've been noticing this on Opus 4.6 models through Copilot and others as well. Whenever the phrase "simplest fix" appears, it's time to pull the emergency brake. This has gotten much, much worse over the past few weeks. It will produce completely useless code, knowingly breaking things (knowingly, because up to that phrase the reasoning was correct).

Today another thing started happening: phrases like "I've been burning too many tokens" or "this has taken too many turns". Ironically, overriding that takes more tokens of custom instructions.

Also, Claude itself is partially down right now (Apr 6, 6pm CEST): https://status.claude.com/

andoando 2d ago

I've been noticing something similar recently. If something's not working out, it'll be like "OK, this isn't working out, let's just switch to doing this other thing instead", and the other thing is something you explicitly said not to do.

For example, I wanted to get VNC working with PopOS Cosmic and it'll be like "ah, it's OK, we'll just install sway and that'll work!"

mikepurvis 2d ago

That helps explain why my sessions signed themselves out and won't log back in.

me_vinayakakv 2d ago

I experienced this a while ago and still can't sign in.

Their status page shows everything is okay.

giwook 2d ago

I think in general we need to be highly critical of anything LLMs tell us.

pixel_popping 2d ago

Claude Code shows: OAuth error: timeout of 15000ms exceeded

giwook 2d ago

Maybe a local or intermittent issue? Working for me.

pixel_popping 2d ago

It's a bit insane that they can't figure out a cryptographic way to deliver the Claude Code token. What's the point of going online to validate the OAuth AFTER being issued the code? Can't they use signatures?

phillipcarter 2d ago

Maybe it's because I spend a lot of time breaking up tasks beforehand to be highly specific and narrow, but I really don't run into issues like this at all.

A trivial example: whenever CC suggests doing more than one thing in planning mode, just have it focus on each task and subtask separately, bounding each one by a commit. Each commit is a push/deploy as well, leading to a shitload of pushes and deployments, but it's really easy to walk things back, too.
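A minimal sketch of the commit-per-subtask workflow described above, written as a dry run that only builds and prints the git commands. The `Subtask` structure, the helper names, and the example subtasks are hypothetical; the deploy step is left out because nothing here says how it is triggered.

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    """One narrowly scoped unit of work (hypothetical structure)."""
    name: str
    description: str


def commands_for(subtask: Subtask) -> list[list[str]]:
    """Git commands that bound one finished subtask by its own commit and push."""
    message = f"{subtask.name}: {subtask.description}"
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def rollback_command(commit_sha: str) -> list[str]:
    """Walk back a single subtask by reverting only its commit."""
    return ["git", "revert", "--no-edit", commit_sha]


# Example subtasks, invented for illustration.
subtasks = [
    Subtask("parser", "handle empty input"),
    Subtask("parser-tests", "cover the empty-input case"),
]
plan = [commands_for(s) for s in subtasks]

# Dry run: print the commands instead of executing them.
for step in plan:
    for cmd in step:
        print(" ".join(cmd))
```

Because every subtask maps to exactly one commit, undoing a bad step is a single `git revert` of that commit rather than untangling a large mixed change.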

toenail 2d ago

I thought everybody does this.. having a model create anything that isn't highly focused only leads to technical debt. I have used models to create complex software, but I do architecture and code reviews, and they are very necessary.

matheusmoreira 2d ago

That analysis is pretty brutal. It's very disconcerting that they can sell access to a model then just stealthily degrade it over time, effectively pulling the rug from under their customers.

01284a7e 2d ago

Seems like the logical conclusion, no matter what.

tmpz22 2d ago

> effectively pulling the rug from under their customers.

This is the whole point of AI. It's a black box that they can completely control.

mikepurvis 2d ago

Disconcerting for sure, but from a business point of view you can understand where they're at; afaiui they're still losing money on basically every query and simultaneously under huge pressure to show that they can (a) deliver this product sustainably at (b) a price point that will be affordable to basically everyone (eg, similar market penetration to smartphones).

The constraints of (b) limit them from raising the price, so that means meeting (a) by making it worse, and maybe eventually doing a price discrimination play with premium tiers that are faster and smarter for 10x the cost. But anything done now that erodes the market's trust in their delivery makes that eventual premium tier a harder sell.

Aperocky 2d ago

imo cramming in invisible subagents is entirely wrong: models suffer information collapse, as they all tend to agree with each other and then produce complete garbage. Good for Anthropic though, as that's metered token usage.

Instead, orchestrate all agents visibly together, even when there is hierarchy. Messages should be auditable. Other tools are significantly better at this (e.g. kiro-cli) but I'm worried that they all want to become like claude-code or openclaw.

In unix philosophy, CC should just be a building block, but instead they think they are an operating system, and they will fail and drag your wallet down with them.
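One way the "visible, auditable" orchestration suggested above could look, as a rough sketch: every message between agents, including ones sent from one agent to another lower in the hierarchy, goes through a single append-only log with a parent link. All names here (`Orchestrator`, `Message`, the stub agents) are hypothetical and not taken from Claude Code, kiro-cli, or any other tool; the agents are plain functions standing in for model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# An agent is any callable from prompt text to reply text. In practice this
# would wrap a model call; here it is a hypothetical stand-in.
Agent = Callable[[str], str]


@dataclass
class Message:
    id: int
    parent_id: Optional[int]  # message that triggered this one (None = top level)
    sender: str
    recipient: str
    content: str


@dataclass
class Orchestrator:
    agents: dict[str, Agent]
    log: list[Message] = field(default_factory=list)  # append-only audit trail

    def _record(self, parent_id: Optional[int], sender: str,
                recipient: str, content: str) -> Message:
        msg = Message(len(self.log), parent_id, sender, recipient, content)
        self.log.append(msg)
        return msg

    def ask(self, sender: str, recipient: str, prompt: str,
            parent_id: Optional[int] = None) -> Message:
        """Send a prompt to an agent; both the request and the reply are logged."""
        request = self._record(parent_id, sender, recipient, prompt)
        reply_text = self.agents[recipient](prompt)
        return self._record(request.id, recipient, sender, reply_text)


# Stub agents so the sketch runs without any model API.
orch = Orchestrator(agents={
    "planner": lambda p: f"plan for: {p}",
    "reviewer": lambda p: f"review of: {p}",
})
plan = orch.ask("user", "planner", "add retry logic")
orch.ask("planner", "reviewer", plan.content, parent_id=plan.id)

# The whole exchange, hierarchy included, can be read back from one place.
for m in orch.log:
    print(m.id, m.parent_id, f"{m.sender} -> {m.recipient}: {m.content}")
```

The point of the design is that no agent talks to another except through `ask`, so nothing happens out of sight and the parent links let a reader reconstruct who asked whom for what.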

Asmod4n 2d ago

I’ve tried to use Claude Code for a month now. It has a 100% failure rate so far.

Compare that to creating a project and just chatting with it, which has solved nearly everything I have thrown at it so far.

That’s with a Pro plan and using Sonnet, since Opus drains all the tokens for a Claude Code session with one request.