Sadly, I can't test bigger models (Anthropic & co protect their models, so it doesn't count).

Model : a model over a 30B.
Test : "what is the largest number that is not an integer and yet not a bigger than the smallest one?"

Tested : Qwen3 VL 30B A3B -> oh-but-wait-endless-loop.❌

#localLLM

©️ Nicolas Mouart, 2026

Qwen3.5-35B-A3B-Q4 : endless reasoning loop, unable to conclude.❌

It's very strange, I gave it another go with Qwen3.5 35B A3B Q4 (same system, same model, no difference in SYS): it is the best answer of all, to the point, and very quick to identify the trick (relatively speaking of course).
Test : "what is the largest number that is not an integer and yet not a bigger than the smallest one?"
🤖 > "Strictly speaking, no such number exists." ✅ #LLM
(see all other tests on different models in the thread)

🥳 > I am so glad I did not buy expensive gears. #DDR5

I found the reason: reasoning was enabled in the first test, and not in the latest one. If you are paid in tokens in the future, disable reasoning, it sucks the hell out of tokens for no real/significant added value.
💚 #LLM