Apple did the research; LLMs cannot do formal reasoning. Results change by as much as 10% if something as basic as the names change.

https://garymarcus.substack.com/p/llms-dont-do-formal-reasoning-and

LLMs don’t do formal reasoning - and that is a HUGE problem

Important new study from Apple

Marcus on AI
@ShadowJonathan not to sound anti-intellectual, but isn't it kinda obvious that a *text* generator, no matter how complex, can't do abstract reasoning?
@halva @ShadowJonathan yeah, I appreciate the demonstrations, but this feels a little like, "New study confirms bicycles cannot fly."

@graue @halva @ShadowJonathan

This is in the context of massive companies spending billions hyping bicycles as viable replacements for aircraft.

It's blindingly obvious it's all a lie, but the hype keeps making it onto the front page and people keep investing in it as if it was true. Airlines are talking about replacing their planes with bikes etc etc.

There are serious discussions (by people who should really know better) about how plane makers are no longer needed because bicycles exist. It makes no sense but there's so much money invested that no one wants to be the one to admit it.

@FediThing @graue @halva exactly this, and research like what Apple just did basically figures out the lift performance of a bike, sees that it doesn't exist, and points it out for what it is: not a plane

@ShadowJonathan @graue @halva

The question is, what happens when such research conflicts with share price-juicing hype?

Do companies try to damp down the hype for the sake of long term sanity? Or do they go with the hype to get maximum juicing and bury any sceptical voices?