Apple did the research: LLMs cannot do formal reasoning. Results change by as much as 10% when something as basic as the names in a problem is changed.

https://garymarcus.substack.com/p/llms-dont-do-formal-reasoning-and

LLMs don’t do formal reasoning - and that is a HUGE problem

Important new study from Apple

Marcus on AI
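Rough sketch of the kind of name-perturbation check the study describes (this is not the Apple/GSM-Symbolic code): hold the arithmetic of a word problem fixed, swap only the proper names, and see whether a model's answers stay consistent. The `ask_model` callable and the template are hypothetical stand-ins for whatever LLM and problem you want to probe.

```python
from typing import Callable

# A GSM8K-style word problem where only the name varies; the math never changes.
TEMPLATE = (
    "{name} picked {a} apples. {name}'s friend gave {name} {b} more. "
    "How many apples does {name} have now?"
)

def name_sensitivity(ask_model: Callable[[str], str],
                     names=("Sophie", "Ravi", "Kenji", "Amara"),
                     a: int = 7, b: int = 5) -> float:
    """Fraction of name variants for which the model's answer contains the correct sum."""
    expected = str(a + b)
    correct = 0
    for name in names:
        question = TEMPLATE.format(name=name, a=a, b=b)
        answer = ask_model(question)
        correct += expected in answer
    return correct / len(names)

if __name__ == "__main__":
    # Stand-in "model" that always answers correctly, just to show the harness runs;
    # with a real LLM this is where the reported variance shows up.
    print(name_sensitivity(lambda q: "12"))  # -> 1.0
```

If the model were doing formal reasoning, this score should not depend on which name appears in the template; the study's point is that in practice it does.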
@ShadowJonathan Why would we judge LLMs on their ability to solve complex tasks? The interesting thing is whether they can solve simple tasks well enough to be useful.
@anderspuck @ShadowJonathan Which they also can't do.
@dalias @ShadowJonathan They can absolutely do certain things well enough to be useful. Create a fairly accurate transcript of a podcast, for example.

@dalias @ShadowJonathan @anderspuck No, they are never reliable enough. This stems from how they are designed.

For example, they are incapable of asking for help when they don't understand a passage; instead they write down something hallucinated*.

*) I’m aware that this is not a good term to use for this, but I don’t have a better one handy before coffee.