Apple did the research; LLMs cannot do formal reasoning. Results change by as much as 10% if something as basic as the names change.
https://garymarcus.substack.com/p/llms-dont-do-formal-reasoning-and
@jwcph @dalias @ShadowJonathan I used an LLM to create a first draft of the transcript here, for example. Without that help there just wouldn’t be any transcript because it would take too much time. So that for me is definitely in the category of “useful”.
https://www.logicofwar.com/why-did-experts-fail-to-predict-russias-invasion-of-ukraine/
@anderspuck @dalias @ShadowJonathan Sure - now all you have to figure out is how much you'd pay for that usefulness, because this is only happening so that it can become an extremely lucrative business for somebody.
(no, that's not a different topic; the problem complex here is functionality + usefulness + environmental impact + business model)
@anderspuck @dalias @ShadowJonathan
LLMs are NOT doing *speech-to-text* transcription -- that is, producing transcripts from audio (such as a podcast). That's a different set of AI technologies.
The industry has been developing "AI" technologies since before I was born. Some are quite useful.
It's the "Generative AI" subset (which includes LLMs, chatbots) that is so misleading, mostly useless, and incredibly wasteful.
@dalias @ShadowJonathan @anderspuck no, never reliable enough. This stems from how they are designed.
For example, they are incapable of asking for help when they don’t understand a passage; they write down something hallucinated* instead.
*) I’m aware that this is not a good term to use for this but I don’t have a better one handy before coffee.