If the reporting below holds up, #hallucinations cast serious doubt on whether the end goal of #AGI can be achieved with today’s #LLM architectures and training methods.

While ongoing research explores mitigations such as #RAG, hybrid models, and inference-time techniques, no implementation to date has fully eliminated hallucinations. (A toy sketch of the RAG idea follows for anyone unfamiliar with it.)
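For context, here is a minimal, illustrative sketch of the RAG pattern in Python. The keyword retriever, the DOCUMENTS store, and the placeholder generate() are all assumptions for the sake of the example -- a real system would use a vector database and an actual LLM call; the point is only to show how retrieved text gets prepended to the prompt so the model can ground its answer.

# Minimal, hypothetical sketch of retrieval-augmented generation (RAG).
# The retriever and generate() below are toy stand-ins, not a real product's API.

from collections import Counter

# A tiny "document store" -- in practice this would be a vector database.
DOCUMENTS = [
    "Mata v. Avianca: a federal judge fined lawyers $5,000 for citing fake cases generated by ChatGPT.",
    "Retrieval-augmented generation grounds model output in retrieved source text.",
    "Recent leaderboards suggest newer reasoning models hallucinate more often.",
]

def score(query: str, doc: str) -> int:
    """Crude relevance score: number of lowercase words shared by query and doc."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents with the highest word overlap with the query."""
    return sorted(DOCUMENTS, key=lambda doc: score(query, doc), reverse=True)[:k]

def generate(prompt: str) -> str:
    """Placeholder for an LLM call; a real system would send `prompt` to a model."""
    return f"[model answer conditioned on:\n{prompt}]"

def rag_answer(question: str) -> str:
    # Grounding step: stuff the retrieved passages into the prompt.
    context = "\n".join(f"- {doc}" for doc in retrieve(question))
    prompt = (
        "Answer using ONLY the context below; say 'I don't know' if it is insufficient.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )
    return generate(prompt)

print(rag_answer("What happened to the lawyers who cited fake ChatGPT cases?"))

Note the limitation this sketch makes visible: grounding constrains the prompt, but nothing forces the model to actually use the retrieved text -- it can still misread or ignore the context, which is why RAG reduces rather than eliminates hallucination.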

What consumer would trust mission-critical decisions to an AGI that is known to confidently state falsehoods?

https://www.newscientist.com/article/2479545-ai-hallucinations-are-getting-worse-and-theyre-here-to-stay/

#GenAI #AI

AI hallucinations are getting worse – and they're here to stay

An AI leaderboard suggests the newest reasoning models used in chatbots are producing less accurate results because of higher hallucination rates. Experts say the problem is bigger than that.

New Scientist
@chrisvitalos haven't read the article yet - but don't actual humans also do this all the time? don't we kinda 'hallucinate' a model of reality?

@bazkie

I think we humans have biases and cognitive limitations that influence our reasoning and the conclusions we draw.

@bazkie

Reflecting on this more, humans can weigh the benefits and risks of hallucinating within our unique social context (the impact on one's credibility, for example).

Consider the embarrassed and discredited lawyers who submitted legal briefs riddled with fabricated citations.

AI, in its current state, doesn't weigh outcomes in that context, because it incurs no gain or cost either way.

I'd imagine AGI would need this human-like ability.

https://apnews.com/article/artificial-intelligence-chatgpt-fake-case-lawyers-d6ae9fa79d0542db9e1455397aef381c

Lawyers submitted bogus case law created by ChatGPT. A judge fined them $5,000

A federal judge has imposed $5,000 fines on a group of lawyers after ChatGPT was blamed for their submission of fictitious legal research to support an aviation injury claim. Judge P. Kevin Castel said the lawyers acted in bad faith but credited their apologies in a written ruling Thursday. The lawyers testified earlier this month that they thought references to past cases in a document they submitted to Castel were real. They actually were made up by the artificial intelligence-powered chatbot. Separately, the judge tossed out the aviation claim, saying the statute of limitations had expired.

AP News
@chrisvitalos I wanted to read the articles and think about it some more but I'm really tired, so I don't have anything useful to say about it now, sorry! thanks for the insights tho :)