On the limits of LLMs (Large Language Models) and LRMs (Large Reasoning Models). The TL;DR: "Our findings reveal fundamental limitations in current models: despite sophisticated self-reflection mechanisms, these models fail to develop generalizable reasoning capabilities beyond certain complexity thresholds." Meaning: accuracy collapse.

Interesting paper from Apple. https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf

#AI #LLM #LRM

Further proof that "AI" behaves just like a sealioning reply guy ;) Even if told what the working solution is, it simply ignores it.

"We uncover surprising limitations in LRMs’ ability to perform exact computation, including their failure to benefit from explicit algorithms and their inconsistent reasoning across puzzle types."

Idem, Page 3
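For context on "explicit algorithms": Tower of Hanoi is one of the paper's puzzles, and the authors report that even handing models a complete solution procedure didn't raise their collapse point. Here's a minimal sketch of that kind of exact algorithm (my own illustration, not code from the paper), which is trivial to execute mechanically:

```python
# Exact recursive solution to Tower of Hanoi: move n disks from src
# to dst using aux as spare. A purely mechanical procedure -- the sort
# of explicit algorithm the paper says models failed to benefit from.

def hanoi(n, src, aux, dst, moves):
    if n == 0:
        return
    hanoi(n - 1, src, dst, aux, moves)   # park n-1 disks on the spare peg
    moves.append((src, dst))             # move the largest remaining disk
    hanoi(n - 1, aux, src, dst, moves)   # bring the n-1 disks back on top

moves = []
hanoi(3, "A", "B", "C", moves)
# Solving n disks always takes exactly 2**n - 1 moves.
```

Executing this faithfully is a bookkeeping exercise, which is what makes the reported failure to follow it notable.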

@jwildeboer this was the most important finding for me. I’ve seen it have a lot of trouble following instructions, and like many, I blame myself for not “prompting right.” Seeing it fail to follow instructions even in very controlled situations really helps set expectations better.

@cocoaphony @jwildeboer But... of course it can't follow instructions; it isn't "learning" anything while in operation. The P in ChatGPT stands for "Pre-trained" - once it's running nothing new is added to the system; it just keeps running on the same recipe, no matter what input it receives.

Which of course means that even when it seems to do so, it's gaslighting. Pretending. Simulating. Lying, as we used to call it.
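The "pre-trained, frozen in operation" point can be sketched in a few lines of Python (a toy stand-in for illustration, not any real model's API): the parameters are fixed after training, so no amount of "correcting" it at inference changes the recipe.

```python
# Toy illustration: a "pre-trained" model whose parameters are frozen.
# Inference only *reads* the weights; nothing the user says updates them.

class PretrainedToyModel:
    def __init__(self, weights):
        # Fixed after "pre-training" -- inference never modifies these.
        self.weights = dict(weights)

    def generate(self, prompt):
        # Stand-in for a forward pass: a pure lookup over frozen weights.
        return self.weights.get(prompt, "<unk>")

model = PretrainedToyModel({"2+2=": "4"})
snapshot = dict(model.weights)

# Feed it a "correction" -- there is no mechanism to absorb it.
model.generate("Actually, the answer is 5. Remember that.")
model.generate("2+2=")

# Same recipe as before, no matter what input it received.
assert model.weights == snapshot
```

(Real deployments can bolt on retrieval or fine-tuning, but within a single inference run the weights stay put, which is the point being made above.)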

@jwildeboer @tonyarnold I'm sure that had nothing at all to do with Apple's complete inability to be even remotely competitive in the LLM space.