On the limits of LLMs (Large Language Models) and LRMs (Large Reasoning Models). The TL;DR: "Our findings reveal fundamental limitations in current models: despite sophisticated self-reflection mechanisms, these models fail to develop generalizable reasoning capabilities beyond certain complexity thresholds." Meaning: accuracy collapse.

Interesting paper from Apple. https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf

#AI #LLM #LRM

Further proof that "AI" behaves just like a sealioning reply guy ;) Even if told what the working solution is, it simply ignores it.

"We uncover surprising limitations in LRMs’ ability to perform exact computation, including their failure to benefit from explicit algorithms and their inconsistent reasoning across puzzle types."

Idem, Page 3
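For context on "explicit algorithms": Tower of Hanoi is one of the paper's puzzles, and the authors report that even handing models a complete solution procedure didn't raise their collapse point. Here's a minimal sketch of that kind of exact algorithm (my own illustration, not code from the paper), which is trivial to execute mechanically:

```python
# Exact recursive solution to Tower of Hanoi: move n disks from src
# to dst using aux as spare. A purely mechanical procedure -- the sort
# of explicit algorithm the paper says models failed to benefit from.

def hanoi(n, src, aux, dst, moves):
    if n == 0:
        return
    hanoi(n - 1, src, dst, aux, moves)   # park n-1 disks on the spare peg
    moves.append((src, dst))             # move the largest remaining disk
    hanoi(n - 1, aux, src, dst, moves)   # bring the n-1 disks back on top

moves = []
hanoi(3, "A", "B", "C", moves)
# Solving n disks always takes exactly 2**n - 1 moves.
```

Executing this faithfully is a bookkeeping exercise, which is what makes the reported failure to follow it notable.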

@jwildeboer this was the most important finding for me. I’ve seen it have a lot of trouble following instructions, and like many, I blame myself for not “prompting right.” Seeing it fail to follow instructions even in very controlled situations really helps set expectations better.

@cocoaphony @jwildeboer But... of course it can't follow instructions; it isn't "learning" anything while in operation. The P in ChatGPT stands for "Pre-trained" - once it's running nothing new is added to the system; it just keeps running on the same recipe, no matter what input it receives.

Which of course means that even when it seems to do so, it's gaslighting. Pretending. Simulating. Lying, as we used to call it.
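The "pre-trained, frozen in operation" point can be sketched in a few lines of Python (a toy stand-in for illustration, not any real model's API): the parameters are fixed after training, so no amount of "correcting" it at inference changes the recipe.

```python
# Toy illustration: a "pre-trained" model whose parameters are frozen.
# Inference only *reads* the weights; nothing the user says updates them.

class PretrainedToyModel:
    def __init__(self, weights):
        # Fixed after "pre-training" -- inference never modifies these.
        self.weights = dict(weights)

    def generate(self, prompt):
        # Stand-in for a forward pass: a pure lookup over frozen weights.
        return self.weights.get(prompt, "<unk>")

model = PretrainedToyModel({"2+2=": "4"})
snapshot = dict(model.weights)

# Feed it a "correction" -- there is no mechanism to absorb it.
model.generate("Actually, the answer is 5. Remember that.")
model.generate("2+2=")

# Same recipe as before, no matter what input it received.
assert model.weights == snapshot
```

(Real deployments can bolt on retrieval or fine-tuning, but within a single inference run the weights stay put, which is the point being made above.)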

@jwildeboer @tonyarnold I'm sure that had nothing at all to do with Apple's complete inability to be even remotely competitive in the LLM space.