Apple did the research: LLMs cannot do formal reasoning. Results change by as much as 10% if something as basic as the names in a problem change.

https://garymarcus.substack.com/p/llms-dont-do-formal-reasoning-and

LLMs don’t do formal reasoning - and that is a HUGE problem

Important new study from Apple

Marcus on AI
@ShadowJonathan not to sound anti-intellectual, but isn't it kinda obvious that a *text* generator, no matter how complex, can't do abstract reasoning?
@halva @ShadowJonathan yeah, I appreciate the demonstrations, but this feels a little like, "New study confirms bicycles cannot fly."
@graue @halva @ShadowJonathan The record for human-powered flight was set on what is basically a bicycle with wings and a propeller attached. Some AI researchers believe that they can add the equivalent of wings and a propeller to an LLM and accomplish the equivalent.
The technical term is a multi-agent model.

@MartyFouts @graue @halva @ShadowJonathan

You got the story wrong, exactly the way an LLM gets things wrong.

They had wings to fly, but needed speed, so they added the bicycle.

With LLMs we have text generation; when we have a reasoning AI, we will add an LLM so it can talk to us.

Like the bicycle, which can't fly but can produce speed, an LLM can't reason but can talk.

@Aedius @graue @halva @ShadowJonathan I didn’t say anything about how the device evolved; I only described its eventual state. So no, I didn’t get the story wrong.

But I see you do understand the underlying point: there are researchers who are taking the bicycle-with-wings approach, on the assumption that multi-agent methods will work around LLM limitations.