Claude Opus 4.6 has definitely been nerfed in the last few days, especially in the Claude app.

Here's a comparison with Claude Code in Telegram, where I manually set Max reasoning effort.

(I hate that Anthropic does this regularly and keeps pretending it doesn't.)

Also saw some folks suggest switching back to Opus 4.5 and yep, whatever it is they've done to Opus 4.6, the older version isn't affected.
@viticci Damn, seriously? Fingers crossed code isn’t affected.
@oliverames @viticci Code is definitely affected
@viticci Maybe it assumed you work at the car wash… 😋
@viticci is this like when Apple slows down my iPhone before a new one comes out

@viticci Both make assumptions as to why you're going to the car wash. It also gave me a different response. Not that it matters, I have no car so the drive option is always wrong (I have gone to a car wash despite this too).

If you ask if the reason you're going matters, then it says you obviously need to drive if you're washing your car so that seems fairly reasonable, imo.

@viticci At this point, I only use Opus 4.6 when I need to search the web (since its web search tool, and generally its tool calling, are markedly improved). Opus 4.5 is such a great model *and* doesn’t get nerfed by Anthropic whenever they want.
@viticci it’s a confusing question tbh. If a user is asking the question it presupposes that the car is not a necessity for the activity. Kinda like asking “should I walk or take my bike to the velodrome”
@mattbrowndev @viticci it could certainly make an actually intelligent being ask back, “do you need to wash your car or just go there?” But that is not what the LLM does, because it does not think.
@viticci I think this test is too simple to be evidence of nerfing. I tried the car wash experiment a few months ago and got drive 60% of the time and walk 40% of the time; the results are just not deterministic.
@mergesort @viticci Exactly. In this case, I ended up in the bracket where it suggests both:
@arnoudb @mergesort @viticci that answer is actually the correct one. The question doesn’t specify *why* you want to go to the car wash. It’s probably to wash the car, but it might be just to meet with someone who works there, or meet someone else who is washing their car.

@mergesort @viticci Exactly. LLMs are non-deterministic.

1. ML engineers don't bother getting all the math to be identical across all runtime scenarios.
2. They intentionally add 'temperature' to make it a bit more random.
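
A minimal sketch of both points, under loud assumptions: the two logits are made-up numbers chosen so that "drive" gets roughly a 60% share at temperature 1 (echoing the split mentioned upthread), and `sample_with_temperature` and `answer_counts` are hypothetical helpers, not anything from Anthropic's actual serving stack.

```python
import math
import random

# Point 1: floating-point addition is not associative, so summing the same
# numbers in a different order (as different hardware or batching might do)
# can give slightly different results.
LEFT_TO_RIGHT = (0.1 + 0.2) + 0.3
RIGHT_TO_LEFT = 0.1 + (0.2 + 0.3)

# Point 2: temperature sampling. These scores are invented for illustration.
ANSWERS = ["drive", "walk"]
LOGITS = [0.4, 0.0]


def sample_with_temperature(logits, temperature, rng):
    """Pick an index by sampling from softmax(logits / temperature)."""
    if temperature <= 0:
        # Treat zero temperature as greedy decoding: always take the top score.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def answer_counts(temperature, trials=1000, seed=0):
    """Ask the same 'question' many times and tally the answers."""
    rng = random.Random(seed)
    counts = {answer: 0 for answer in ANSWERS}
    for _ in range(trials):
        index = sample_with_temperature(LOGITS, temperature, rng)
        counts[ANSWERS[index]] += 1
    return counts


if __name__ == "__main__":
    print("same sum, different order:", LEFT_TO_RIGHT == RIGHT_TO_LEFT)
    print("temperature 0:", answer_counts(0.0))
    print("temperature 1:", answer_counts(1.0))
```

At temperature 0 the toy model always says "drive"; at temperature 1 it flips between the two, so a single screenshot can land on either answer without anything having changed.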

@viticci companies and individuals are, once again, putting shackles on themselves to become slaves to the tech companies. I guess we don’t learn
@viticci read this post while listening to Sacred Realms talk about TotK 🤯