Apple did the research; LLMs cannot do formal reasoning. Results change by as much as 10% when something as basic as the names in a problem changes.
https://garymarcus.substack.com/p/llms-dont-do-formal-reasoning-and
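The perturbation the paper describes can be illustrated with a toy sketch (the template, names, and numbers here are hypothetical, not taken from the study): generate several variants of the same word problem that differ only in the proper names. A system doing formal reasoning should give identical answers across all of them.

```python
import random

# A toy word-problem template; only the surface names vary between variants.
TEMPLATE = ("{name} has {n} apples and buys {m} more. "
            "How many apples does {name} have now?")

NAMES = ["Sophie", "Ravi", "Elena", "Marcus"]

def make_variants(n, m, k=3):
    """Generate k name-swapped variants of the same problem.

    Every variant has the same arithmetic content (answer n + m),
    so any change in a model's answer is a failure of invariance.
    """
    picks = random.sample(NAMES, k)
    return [TEMPLATE.format(name=name, n=n, m=m) for name in picks]

variants = make_variants(5, 3)
```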
@elexia @ShadowJonathan Because OpenAI and co. marketed them as it. And many, many people swallowed it raw.
The technological reality of LLMs has always been secondary in this AI rush; it's all been about marketing, and oh boy, has it worked.
It will take a long time and a lot of demonstrations, including the simplest ones, for the world to understand its mistake and pull out of this impasse. Or the next "revolutionary, disruptive" technology in the Silicon Valley cycle will hijack all the funding.
Now it's a matter of proving that the ad was misleading, and that is harder than crafting a beautiful lie.
This is not surprising at all and I don't understand why anyone had to waste time and resources on demonstrating a self-evident fact that was known before the research even started.
Yes, the problems posed to the #LLMs in this study are mathematical or logic problems: why are systems that are trained to produce text expected to produce any meaningful results here?
Yes. That's why so many people called it out for the lie that it is. LLMs are nothing like their marketing. They are not even AI. It's nothing but autocorrect powered by stolen intellectual property and enough energy to destroy our planet. So yeah, very advanced autocorrect (for an unacceptably high price) but not even slightly resembling AI.
@ShadowJonathan Now that a *big tech company*, not *independent, well-respected researchers*, has told companies what they need to know, maybe they'll actually listen?
Maybe?
@ShadowJonathan it's really weird that some people are pushing LLMs as something that can reason, while the architecture is essentially a key-value store with a sophisticated probabilistic query and value-encoding mechanism.
LLMs just don't have enough layers for anything beyond lookups, so they can't have the relational capabilities that would allow multi-step decisions.
Also, tokenization hides a lot of the language's structure from the encoding process, which adds another source of errors.
I'm sure we can build something that can reason at some point, but it will require a very different and more complex architecture.
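The "key-value storage with probabilistic queries" description alludes to the attention mechanism inside transformers. A minimal NumPy sketch of scaled dot-product attention (simplified: one head, no learned projection matrices) shows the soft lookup: each query scores every key, and the output is a probability-weighted average of the stored values.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: a soft key-value lookup.

    Each query row scores every key row; softmax turns the scores
    into probabilities; the output mixes the values accordingly.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted value mix

# One query against three stored key-value pairs.
Q = np.array([[1.0, 0.0]])
K = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
V = np.array([[10.0], [20.0], [30.0]])
out = attention(Q, K, V)  # pulled toward 10, the value of the closest key
```

The lookup is "soft": even the worst-matching key contributes a little, which is why the result lands between the stored values rather than on one of them exactly.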
@anderspuck @ShadowJonathan because they're being sold as if they can solve complex tasks
LLMs can use a prompt to generate text based on a huge pile of content produced by other people. Sometimes that text is an exact copy of the original text. They may "solve" a problem if the solution is contained in their training data and your prompt is able to retrieve it.
They're a (very) improved version of a Markov chain. Not a problem solver of any sort.
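The Markov-chain analogy can be made concrete with a tiny word-level generator (an illustration of the analogy, not of how LLMs are actually implemented): record which words follow which in a corpus, then sample one next word at a time.

```python
import random
from collections import defaultdict

def train(text):
    """Build a table mapping each word to the words observed after it."""
    words = text.split()
    table = defaultdict(list)
    for a, b in zip(words, words[1:]):
        table[a].append(b)
    return table

def generate(table, start, length=8):
    """Sample a chain of next words, one at a time, from the table."""
    out = [start]
    for _ in range(length):
        followers = table.get(out[-1])
        if not followers:
            break                      # dead end: no observed successor
        out.append(random.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ran"
table = train(corpus)
sample = generate(table, "the")
```

Like an LLM, it only ever emits continuations seen in (or statistically consistent with) its training data; unlike an LLM, it conditions on just one preceding word.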
@jwcph @dalias @ShadowJonathan I used an LLM to create a first draft of the transcript here, for example. Without that help there just wouldn't be any transcript, because it would take too much time. So that, for me, is definitely in the category of "useful".
https://www.logicofwar.com/why-did-experts-fail-to-predict-russias-invasion-of-ukraine/
Hello, in this video, I discuss why so many experts failed to accurately predict the Russian invasion of Ukraine in 2022. Most experts at the time were saying that it was very unlikely that Russia would invade Ukraine. Of those who did foresee an invasion, many dramatically overestimated the capabilities…
@anderspuck @dalias @ShadowJonathan Sure - now all you have to figure out is how much you'd pay for that usefulness, because this is only happening so it can become an extremely lucrative business for somebody.
(no, that's not a different topic; the problem complex here is functionality + usefulness + environmental impact + business model)
@anderspuck @dalias @ShadowJonathan
LLMs are NOT doing *speech-to-text* transcription from audio (podcasts). That's a different set of AI technologies.
The industry has been developing "AI" technologies since before I was born. Some are quite useful.
It's the "Generative AI" subset (which includes LLMs, chatbots) that is so misleading, mostly useless, and incredibly wasteful.
@dalias @ShadowJonathan @anderspuck no, never reliable enough. This stems from how they are designed.
They are incapable of asking for help if they don't understand a passage, for example; they write down something hallucinated* instead.
*) I'm aware that this is not a good term to use for this, but I don't have a better one handy before coffee.
@anderspuck because they're expected to solve complex tasks, they're being sold as if they can solve complex tasks, and they have a failure and error rate high enough that they're not safe.
They want these things to drive cars and make decisions that involve human lives.
@anderspuck @ShadowJonathan No, it doesn't matter what kind of energy they're consuming, because energy always has a cost to produce, and again the cost-to-benefit ratio isn't there. LLMs are creating scarcity for relatively little actual positive benefit.
It's also not strictly about power; the same argument applies to water consumption as well.
@kasperd @anderspuck @ShadowJonathan
Dittos! I was about to post the same thing.
The industry has been developing "AI" technologies since before I was born. Many work quite well, and are useful. Some save money. Some save lives.
You probably interact with "traditional" AI systems far more often than you realize.
Each has to be evaluated based on its costs and benefits and risks.
Generative AI / LLM chatbots are a dangerous, wasteful SCAM.
Self-driving cars are still "iffy."
Self-driving cars are iffy, but human-driven cars are dangerous. A self-driving car might already be safer than one driven by a human.
The hard question is what will people choose if they are given the choice between two accidents that can be blamed on human drivers or one accident with a self-driving car where there isn't anyone to blame.
Companies like OpenAI and their defenders claim generative AI can reason, learn, etc. We know it's nonsense, but it's still extremely important it gets called out.
@nf3xn @rubenerd @graue @halva @ShadowJonathan I doubt Hinton is lying, although he's probably wrong. There's a problem in philosophy: is the mind separate from the body? If it's not, then it should be possible to model the brain well enough to simulate thought processes (at least in principle).
Computational physics tells us that there is a function that could perform the simulation, and Hinton's career is looking for it.
@dalias @graue @halva @ShadowJonathan You'd think that people who own a bicycle could just check…
On a tangentially related note, flying bicycles are invented by future humanity in "The Dark Forest": personal flying vehicles in the form of helicopter backpacks. They're "bicycles" in the sense that they're two counter-rotating, coaxially-mounted propellers. That's actually not a bad idea. If only we poured billions of dollars into making that work.
@enoch_exe_inc @dalias @graue @halva
> You'd think that people who own a bicycle could just check…
does the emperor have no clothes? would people call him out on it?