Claude Opus 4.6 has definitely been nerfed in the last few days, especially in the Claude app.

Here's a comparison with Claude Code in Telegram, where I manually set Max reasoning effort.

(I hate that Anthropic does this regularly and keeps pretending it doesn't.)

@viticci I think this test is too simple to be evidence of nerfing. I tried the car wash experiment a few months ago and got "drive" 60% of the time and "walk" 40% of the time; the results are just not deterministic.
@mergesort @viticci Exactly. In this case, I ended up in the bracket where it suggests both:
@arnoudb @mergesort @viticci That answer is actually the correct one. The question doesn’t specify *why* you want to go to the car wash. It’s probably to wash the car, but it might just be to meet someone who works there, or someone else who is washing their car.