@fivetonsflax @SystemsAppr @ricci The #IEEE #802.1aq rollup spec definition of a Two-Port MAC Relay #TPMR is really quite insidious and bothersome to follow. I consider it a litmus test for how much an LLM will actually #HALLUCINATE when faced with a human-written specification. Calling everything a switch... Danger, Will Robinson, special cases abound. #OpenBSD notably has its own software stub driver to model this class of network element.

1. #Authors shouldn't be using #LLMs to create #reference #lists, bcz LLMs #hallucinate often.

2. Every #science #journal should run authors' ref lists through the @retractionwatch Database. https://retractiondatabase.org/RetractionSearch.aspx

Doing both things would improve #science #trustworthiness.

Retraction Watch Database

Ouch. Sullivan & Cromwell, over 900 attorneys, has a face full of egg. Letter of apology, hearing today w/ judge re: fake cases (hallucinated #citations).
Are #attorneys so concerned w/ speed that they forget their duty to the #court? Yes.
If a #FirstYear handed them this research (especially a woman), they'd check it. But because it comes from #TechBros, it gets a pass.
#GenAI #hallucinate #cite #cases #Judge #bar #fine #censure #disbarment #legal #LawFirm #NYC
https://news.bloomberglaw.com/business-and-practice/sullivan-cromwell-apologizes-to-judge-for-ai-hallucinations
Sullivan & Cromwell Apologizes to Judge for AI Hallucinations

A Sullivan & Cromwell lawyer apologized to a bankruptcy judge for filing documents with incorrect case citations caused by artificial intelligence.

#MissKittyRaw @[email protected], while you #messianically #hallucinate your way into the grave and half the world with you, you cannot wonder why #productivity #has #cratered. Everybody sitting around wondering why they should bother doing anything since you're going to kill the world. 😵‍💫🤯😱
#MissKittyArtWalk #AI #Research Don't use #fast mode if you really want to be accurate. I had to go back for the third time on the prompt to get actual active accounts. Unbelievable. The willingness to just #hallucinate #wildly. I'm going to hide back in the #thinking mode later LOL.

A cool test of how much different #AI models #hallucinate: the #BullshitBenchmark

The #Claude and #Qwen models seem to push back more when confronted with nonsensical questions. The #OpenAI models do not fare well.

Blog post: https://adam.holter.com/bullshitbench-v2-claude-and-qwen-are-the-only-models-that-push-back/
Results: https://petergpt.github.io/bullshit-benchmark/viewer/index.v2.html

#LLM

BullshitBench v2: Claude and Qwen Are the Only Models That Push Back - Adam Holter

BullshitBench v2 is out. Peter Gostev tested 70+ model variants across 100 questions spanning coding, medical, legal, finance, and physics. The benchmark measures one specific thing: whether a model will push back against a plausible-sounding but factually wrong statement, or just go along with it. Only two model families score meaningfully above 60% on bullshit […]

A Large Language Model (LLM) is a deep-learning algorithm, often using a transformer architecture, that is trained on massive amounts of text data to understand, process, and generate human-like text.

A major shortcoming of LLMs is their tendency to "#Hallucinate" or confidently generate false or nonsensical information, along with the risk of perpetuating #Biases present in their training data.

https://knowledgezone.co.in/trends/browser?topic=Language-Model

Language Model

A language model is an AI system trained on vast amounts of text to understand and generate human-like language. It predicts the probability of word sequences, enabling applications like chatbots and text generation.

Knowledge Zone

You know you're doomed when your operating system vendor is selling their "#AI" fetish to you with a text like this.

»Agentic AI has powerful capabilities today - for example, it can complete many complex tasks in response to user prompts, transforming how users interact with their PCs. As these capabilities are introduced, AI models still face functional limitations in terms of how they behave and occasionally may #hallucinate and produce unexpected outputs. Additionally, agentic AI applications introduce novel #security #risks, such as cross-prompt injection (XPIA), where malicious content embedded in UI elements or documents can override agent instructions, leading to unintended actions like data #exfiltration or #malware installation.«

https://support.microsoft.com/en-us/windows/experimental-agentic-features-a25ede8a-e4c2-4841-85a8-44839191dfb3

Experimental Agentic Features - Microsoft Support

#OpenAI Says Models Programmed To Make Stuff Up Instead of Admitting Ignorance - Slashdot

#AI models often produce false outputs, or "hallucinations." Now OpenAI has admitted they may result from fundamental mistakes it makes when #training its models. The Register:
The admission came in a paper [PDF] published in early September, titled "Why Language Models #Hallucinate,"
#llm #hallucinations #artificialintelligence

https://slashdot.org/story/25/09/17/1724241/openai-says-models-programmed-to-make-stuff-up-instead-of-admitting-ignorance?utm_source=rss1.0mainlinkanon&utm_medium=feed
