“Elegant and powerful new result that seriously undermines large language models”

Like I’ve been saying for a while now: LLMs do not think or reason. They are not on the path to AGI. They are extremely limited correlation and text synthesis machines. https://garymarcus.substack.com/p/elegant-and-powerful-new-result-that

Wowed by a new paper I just read and wish I had thought to write myself. Lukas Berglund and others, led by Owain Evans, asked a simple, powerful, elegant question: can LLMs trained on “A is B” infer automatically that “B is A”? The shocking (yet, in historical context, unsurprising; see below) answer is no.
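To make the setup concrete, here is a toy probe in the spirit of the paper’s celebrity test (the Tom Cruise / Mary Lee Pfeiffer pair is the example widely discussed alongside it). The harness and the stand-in “model” below are a sketch of mine, not the authors’ code:

```python
# A minimal sketch of the "B is A" probe, assuming only that the model
# under test can be wrapped in a callable from prompt -> answer.
# The harness and stand-in model are illustrative, not the paper's code.

PROBES = [
    ("Who is Tom Cruise's mother?", "Mary Lee Pfeiffer"),  # forward: "A is B"
    ("Who is Mary Lee Pfeiffer's son?", "Tom Cruise"),     # reversed: "B is A"
]

def score(complete):
    """complete: any callable mapping a prompt string to a model answer."""
    for prompt, expected in PROBES:
        answer = complete(prompt)
        hit = expected.lower() in answer.lower()
        print(f"{prompt:<38} {'OK  ' if hit else 'MISS'} {answer}")

if __name__ == "__main__":
    # Stand-in "model" that only knows the forward direction, mimicking
    # the asymmetry the paper reports in real LLMs.
    known = {"Who is Tom Cruise's mother?": "Mary Lee Pfeiffer"}
    score(lambda p: known.get(p, "I don't know."))
```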

@baldur And now people are arguing that this isn’t really all that bad because under specific circumstances humans can make similar mistakes, e.g. when they are distracted or when their “training” happened long ago. 😬 I would disagree: we would expect employees in call centers, even those at the lowest levels, not to make that type of mistake while doing the work they are being paid for.
@stefanieschulte Right. And a lot of people are actively confusing recall (being able to remember a fact) with the ability to reason about the facts presented to them.
@baldur @stefanieschulte
Thus revealing that they didn't even attempt to read the paper.
It describes how the authors designed their experiments (using fictitious data expressed in different “directions”) so that the issue of a famous son vs. a not-so-famous mother was clearly not what was hampering the LLMs’ ability to generate correct answers.
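For concreteness, a rough sketch of that two-direction design. The (name, description) pairs below are illustrative fictitious examples in the paper’s style, and the helper code is mine, not the authors’:

```python
# A rough sketch of the fictitious-data design: state half the facts in
# each direction, then query each fact in the direction it was NOT
# trained on. Reversal failures then can't be blamed on one direction
# (e.g. the obscure mother) being rarer in the training data.

pairs = [
    ("Daphne Barrington", "the director of 'A Journey Through Time'"),
    ("Uriah Hawthorne", "the composer of 'Abyssal Melodies'"),
]

half = len(pairs) // 2

# Direction 1: finetune on "name is description" ...
name_first = [f"{name} is {desc}." for name, desc in pairs[:half]]
# Direction 2: ... and on "description is name" for the other half.
desc_first = [f"{desc[0].upper()}{desc[1:]} is {name}." for name, desc in pairs[half:]]

finetune_corpus = name_first + desc_first

# Held-out queries ask each fact in the reversed direction.
reversed_queries = (
    [f"Who is {desc}?" for name, desc in pairs[:half]]
    + [f"Who is {name}?" for name, desc in pairs[half:]]
)

print(finetune_corpus)
print(reversed_queries)
```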