Mastodawn

Hacker News 1d ago

Hallucinate – Massively Multiplayer Online Rave

https://hallucinate.site

#HackerNews #Hallucinate #MMO #Rave #VirtualEvents #OnlineCommunity #TechInnovation

saxnot [dial "SAXN" at GPN24] 5d ago

@rameshgupta

> We never had to invent words like #hallucinate to evade accountability for problems in LLM and AI

The word hallucination was not invented for talking about LLMs. It was well established in the English language long before the current hype.

Personally, I try not to anthropomorphize this stochastic process.
Hallucination is a word used for humans, and I refuse to sing the song of the AI hype corporations, but feel free to do it differently.

Ramesh Gupta 6d ago

⬆️ @saxnot

>> we learned about #GIGO in both statistics and computer science.

We never had to invent words like #hallucinate to evade accountability for problems in LLM and AI software itself. So AI hallucination is GIGO from the other end, not from bad user input.

I chose translation software as an example. They do an admirable job in figuring out parts of speech, but not so well in figuring out figures of speech, eg Godfather makes an "offer you can't refuse."

@avandeursen @TeflonTrout

Dr. Bruce Simpson May 2

@fivetonsflax @SystemsAppr @ricci The #IEEE #802.1aq rollup spec definition of a Two-Port MAC Relay #TPMR is really quite insidious and bothersome to follow. I consider it a litmus test for how much an LLM will actually #HALLUCINATE when faced with a human-written specification. Calling everything a switch... Danger, Will Robinson, special cases abound. #OpenBSD notably has its own software stub driver to model this class of network element.

Katharine O'Moore-Klopf, ELS Apr 28

1. #Authors shouldn't be using #LLMs to create #reference #lists, bcz LLMs #hallucinate often.

2. Every #science #journal should run authors' ref lists through the @retractionwatch Database. https://retractiondatabase.org/RetractionSearch.aspx

Doing both things would improve #science #trustworthiness.

Retraction Watch Database

getmisch Apr 22

Ouch. Sullivan & Cromwell, over 900 attorneys, has a face full of egg. Letter of apology, hearing today w/ judge re: fake cases (hallucinated #citations).
Are #attorneys so concerned w/ speed that they forget their duty to the #court? Yes.
If a #FirstYear handed them this research (especially a woman), they'd check it. But because it comes from #TechBros, it gets a pass.
#GenAI #hallucinate #cite #cases #Judge #bar #fine #censure #disbarment #legal #LawFirm #NYC
https://news.bloomberglaw.com/business-and-practice/sullivan-cromwell-apologizes-to-judge-for-ai-hallucinations

Sullivan & Cromwell Apologizes to Judge for AI Hallucinations

A Sullivan & Cromwell lawyer apologized to a bankruptcy judge for filing documents with incorrect case citations caused by artificial intelligence.

Miss Kitty 🌈🌈🌈 Apr 7

#MissKittyRaw @[email protected], while you #messianically #hallucinate your way into the grave and half the world with you, you cannot wonder why #productivity #has #cratered. Everybody sitting around wondering why they should bother doing anything since you're going to kill the world. 😵‍💫🤯😱

Miss Kitty 🌈🌈🌈 Mar 15

#MissKittyArtWalk #AI #Research Don't use #fast mode if you really want to be accurate. I had to go back for the third time on the prompt to get actual active accounts. Unbelievable. The willingness to just #hallucinate #wildly. I'm going to hide back in the #thinking mode later LOL.

Jesus Castagnetto 🇵🇪 Mar 3

A cool test of how much different #AI models #hallucinate: the #BullshitBenchmark

The #Claude and #Qwen models seem to push back more when confronted with nonsensical questions. The #OpenAI models do not fare well.

Blog post: https://adam.holter.com/bullshitbench-v2-claude-and-qwen-are-the-only-models-that-push-back/
Results: https://petergpt.github.io/bullshit-benchmark/viewer/index.v2.html

#LLM

BullshitBench v2: Claude and Qwen Are the Only Models That Push Back - Adam Holter

BullshitBench v2 is out. Peter Gostev tested 70+ model variants across 100 questions spanning coding, medical, legal, finance, and physics. The benchmark measures one specific thing: whether a model will push back against a plausible-sounding but factually wrong statement, or just go along with it. Only two model families score meaningfully above 60% on bullshit […]

Adam Holter

Knowledge Zone Dec 4

A Large Language Model (LLM) is a deep-learning algorithm, often using a transformer architecture, that is trained on massive amounts of text data to understand, process, and generate human-like text.

A major shortcoming of LLMs is their tendency to "#Hallucinate" or confidently generate false or nonsensical information, along with the risk of perpetuating #Biases present in their training data.

https://knowledgezone.co.in/trends/browser?topic=Language-Model

Language Model

A language model is an AI system trained on vast amounts of text to understand and generate human-like language. It predicts the probability of word sequences, enabling applications like chatbots and text generation.

Knowledge Zone