“Elegant and powerful new result that seriously undermines large language models”

Like I’ve been saying for a while now: LLMs do not think or reason. They are not on the path to AGI. They are extremely limited correlation and text synthesis machines. https://garymarcus.substack.com/p/elegant-and-powerful-new-result-that


Wowed by a new paper I just read and wish I had thought to write myself. Lukas Berglund and others, led by Owain Evans, asked a simple, powerful, elegant question: can LLMs trained on "A is B" automatically infer that "B is A"? The shocking (yet, in historical context, see below, unsurprising) answer is no:
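The failure is less mysterious if you think in terms of raw next-token statistics. A toy bigram counter (purely illustrative, not the paper's method or anything resembling a real LLM) trained only on "alice is bob" accumulates evidence for the forward direction but literally none for the reverse:

```python
from collections import defaultdict

def train_bigrams(corpus):
    """Count, for each token, how often each other token follows it."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

model = train_bigrams(["alice is bob"])

print(model["alice"]["is"])  # forward direction: observed once
print(dict(model["bob"]))    # reverse direction: no evidence at all
```

Real transformers are vastly more capable than a bigram table, but the paper's result suggests that, for fine-tuned facts, the learned association is similarly one-directional.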

@baldur reminds me of an experiment comparing the intelligence of human toddlers against chimpanzees. One crucial difference was the sophistication of their internal model of the world. You give an L-shaped wooden block to both and ask them to balance it on the long end. No problem. Then you cheat and hide a weight in the short end, so that balancing becomes impossible. The chimpanzee will just keep trying indefinitely. The human tries once and then starts examining the block.

@BuschnicK @baldur

The belief that humans are special and fundamentally different from everything else in the universe, rather than just a more powerful, more complex version of the things that already exist in the universe, is this millennium's version of the geocentric model of the world. We aren't special. We are an incredible demonstration of the power of complexity and of the amazing effects that can arise from simple rules in complex systems. But at the end of the day, we are just remarkable collections of unremarkable star stuff...

Same as a chimpanzee, same as an NVIDIA A400 processor.

@danbrotherston @baldur I don't disagree. I do believe that AGI is possible and that there is no magic sauce that makes us special. However, working with the current generation of LLMs on a daily basis also drives home the point that we are still a long, long distance away from matching even chimpanzees with our silicon "brains".

@BuschnicK @baldur

I mean...that's reasonable.

I wouldn't make a specific claim about how far we are from AGI without an actual definition of AGI.

But I do think that a lot of people dismiss LLMs as "not intelligent" because they "simply regurgitate rearrangements of things they've heard before" without considering the nature of human intelligence. In my opinion we simply don't know enough about how human minds actually work to say that isn't how we function. (Not that you, Soren, said that, although Baldur seems to.)

That said, we know LLMs differ from human intelligence in a number of ways. Specifically, they lack a physical experience of the world, and they lack continuity of experience and a self-narrative. But I rarely see these arguments given as reasons LLMs are different from humans, and I also don't know that they'd be required for AGI... again, that's a very poorly specified concept.

@danbrotherston @baldur well, at minimum it requires:

- a way of interacting with the environment to run experiments

- planning

- a model of the world and way of identifying and dealing with conflicting information

- a notion of how confident they are about statements

- not always going with the first/most likely response

All they currently are is probabilistic text completion engines. That's already useful, but it falls short of what I'd consider AGI.

@danbrotherston @baldur Deepmind does interesting research into these questions by putting their agents into virtual worlds / games. I think that's a good approach to address many of these shortcomings. But again, a long way to go yet.

One interesting litmus test: when do we overcome the curse of recursion, so that LLMs improve the training data instead of tainting it? That is, LLMs learning from LLMs in genuinely positive feedback loops. It works for some limited use cases, but not generally yet.