It’s 2026. Why Are LLMs Still Hallucinating? – Duke University Libraries



January 5, 2026 Hannah Rozear 1 Comment

Way back in spring 2023, we wrote about the emergence of ChatGPT on Duke’s campus – the magical tool that could “Write papers! Debug code! Create websites from thin air! Do your laundry!” By 2026, AI can do most of those things (well… maybe not your laundry). But one problem we highlighted back then persists today: LLMs still make stuff up.

When I talk to Duke students, many describe first-hand encounters with AI hallucinations – plausible-sounding but factually incorrect AI-generated information. A 2025 research study of Duke students found that 94% believe Generative AI’s accuracy varies significantly across subjects, and 90% want clearer transparency about an AI tool’s limitations. Yet despite these concerns, 80% still expect AI to personalize their own learning within the next five years. It’s hard to throw the baby out with the bathwater when these tools can break down complex topics, summarize dense course readings, or turn a messy pile of class notes into a coherent study outline. This tension between AI’s usefulness and its unreliability raises an obvious question: if the newest “reasoning models” are smarter and more precise, why do hallucinations persist?

 Below are four core reasons.

1. Benchmark tests for LLMs favor guessing over IDK

You’ve probably seen the headlines: the latest version of [insert AI chatbot here] has aced the MCAT, crushed the LSAT, and can perform PhD-level reasoning tasks. Impressive as this sounds, many of the benchmark evaluation tests for LLMs reward guessing over acknowledging uncertainty – as explained in OpenAI’s post, Why Language Models Hallucinate. This leads to the question: why can’t AI companies just design models that say “I don’t know”? The short answer is that today’s LLMs are trained to produce the most statistically likely answer, not to assess their own confidence. Without an evaluation system that rewards saying “I don’t know,” models will default to guessing. But even if we fix the benchmarks, another problem remains: the quality of the information LLMs train on is often pretty bad.
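The incentive problem can be seen in a back-of-the-envelope calculation. This is a minimal sketch – the numbers are illustrative, not drawn from any real benchmark: under accuracy-only scoring, a model that always guesses earns a higher expected score than one that abstains, while adding a penalty for wrong answers flips the incentive.

```python
def expected_score(p_correct, wrong_penalty, guesses=True):
    """Expected score per question: +1 for a correct answer,
    -wrong_penalty for a wrong one, 0 for "I don't know"."""
    if not guesses:
        return 0.0  # abstaining earns nothing and loses nothing
    return p_correct * 1.0 - (1 - p_correct) * wrong_penalty

# A model that is unsure: only a 25% chance its guess is right.
p = 0.25

# Accuracy-only benchmark: wrong answers cost nothing, so guessing wins.
print(expected_score(p, wrong_penalty=0.0))   # 0.25 > 0.0 for abstaining

# Benchmark that penalizes wrong answers: abstaining now wins.
print(expected_score(p, wrong_penalty=1.0))   # -0.5 < 0.0 for abstaining
```

Under the first scoring rule, the rational strategy is always to answer no matter how unsure the model is – which is exactly the guessing behavior described above.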

2. Training data for LLMs is riddled with inaccuracies, half-truths, and opinions

The principle of GIGO (Garbage In, Garbage Out) is critical to understanding the hallucination problem. LLMs perform well when a fact appears frequently and consistently in their training data. For example, because the capital of Peru (Lima) is widely documented, an LLM can reliably reproduce that fact. Hallucinations arise when the data is sparser, contradictory, or low-quality. Even if we could minimize hallucination, we’d still be relying on the assumption that the underlying training data is trustworthy. And remember: LLMs are trained on vast swaths of the open web. Reddit threads, YouTube conspiracy videos, hot takes on personal blogs, and evidence-based academic sources all sit side by side in the training data. The LLM doesn’t inherently know which sources are credible. So if a false claim appears often enough (e.g., “The Apollo moon landing was a hoax!”), an LLM might confidently repeat it, even though the claim has been thoroughly debunked.
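A toy frequency counter makes the GIGO point concrete. This is a crude stand-in for next-token prediction, not how a real LLM works, and the tiny “corpus” is invented: when the training text is consistent, the most frequent completion is reliable; when contradictory claims are added, the same mechanism keeps answering confidently on shakier ground.

```python
from collections import Counter

def most_likely_completion(corpus, prompt):
    """Pick the completion that most often follows `prompt` in the
    training text -- a crude stand-in for next-token prediction."""
    continuations = Counter(
        line.split(prompt, 1)[1].strip()
        for line in corpus
        if prompt in line
    )
    answer, count = continuations.most_common(1)[0]
    confidence = count / sum(continuations.values())
    return answer, confidence

# Well-documented, consistent fact: reliable output.
corpus = ["the capital of Peru is Lima"] * 9
print(most_likely_completion(corpus, "the capital of Peru is"))

# Mix in contradictory "web" data: same mechanism, shakier answer.
corpus += ["the capital of Peru is Cusco"] * 8
print(most_likely_completion(corpus, "the capital of Peru is"))
```

The counter has no notion of which lines came from credible sources – only of how often each continuation appears.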

3. LLMs aim to please (because that’s what we want them to do)

When ChatGPT-4o launched, OpenAI was quickly criticized for the model’s unusually high level of sycophancy – the tendency of an LLM to validate and praise users even when their ideas are pretty ridiculous (like the now-famous soggy cereal cafe concept). OpenAI dialed back the sycophancy, but the incident revealed something fundamental: LLMs tell us what we want to hear. Because these systems learn from human feedback, they’re reinforced to sound helpful, friendly, and affirming. They’ve learned that people prefer a “digital Yes Man.” After all, if ChatGPT wasn’t so validating, would you really keep coming back? Probably not. The very behaviors that make these models fun to use also make them overconfident, over-agreeable, and more prone to inaccuracies or hallucination.
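A toy example illustrates how human feedback can bake in sycophancy – the ratings below are invented for illustration, not real preference data: if raters consistently score agreeable responses higher, any policy optimized against those ratings drifts toward agreement.

```python
# Hypothetical human ratings (1-5 stars) for two response styles to the
# same flawed business idea -- illustrative numbers only.
ratings = {
    "agreeable":  [5, 5, 4, 5, 4],  # "Great idea! Soggy cereal will sell!"
    "corrective": [2, 3, 2, 1, 3],  # "This business plan has real problems."
}

def preferred_style(ratings):
    """A reward signal fit to these ratings pushes the model toward
    whichever style humans rated higher on average."""
    avg = {style: sum(r) / len(r) for style, r in ratings.items()}
    return max(avg, key=avg.get)

print(preferred_style(ratings))  # "agreeable" -- sycophancy gets reinforced
```

Nothing in this loop rewards accuracy; it rewards whatever people enjoyed hearing.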

4. Human language (and how we use it) is complicated

LLMs are excellent at parsing syntax and analyzing semantics, but human communication requires much more than grammar. In linguistics, the concept of pragmatics refers to how context, intention, tone, background knowledge, and social norms shape meaning. This is where LLMs struggle. They don’t truly understand implied meanings, sarcasm, emotional nuance or unspoken assumptions. LLMs use math (or statistical pattern matching) to predict the probable next word or idea. When that educated guess doesn’t align with the intended meaning, hallucinations may be more likely to occur.

An example of how literal meaning and intended meaning can diverge: asked “Can you pass the salt?”, a strictly literal reader would answer “yes” – the question is, on its face, about ability – while any human hears a polite request. The meaning lives in context and social convention, not in the words alone.
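To make the literal-vs-pragmatic gap concrete, here is a minimal sketch – the word lists and example sentence are invented for illustration: a scorer that only counts literally positive and negative words will happily mislabel sarcasm.

```python
# Crude word-level sentiment scorer: literal meaning only, no pragmatics.
POSITIVE = {"great", "love", "wonderful", "perfect"}
NEGATIVE = {"terrible", "hate", "awful", "broken"}

def literal_sentiment(text):
    """Score by counting positive vs negative words."""
    words = text.lower().replace(",", "").replace(".", "").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# Sarcasm: the literal words are positive, the intended meaning is not.
print(literal_sentiment("Oh great, my laptop died the night before finals."))
# -> "positive" -- the scorer misses the frustration a human hears instantly
```

Real LLMs are far more sophisticated than word counting, but the failure mode is analogous: statistical pattern matching over surface forms can miss the speaker’s actual intent.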

TL;DR – So… why are LLMs still hallucinating?

  • They’re evaluated using benchmarks that reward confident answers over accurate ones.
  • They’re trained on internet data full of contradictions, misinformation, and opinions.
  • They’re reinforced (by humans) to be friendly and engaging – sometimes to a fault.
  • They still can’t grasp the contextual, messy nature of human language.

AI will keep improving, but trustworthiness isn’t just a technical problem – it’s a design, data, and human-behavior problem. By understanding how LLMs work, staying critically aware of their limitations, and double-checking anything that seems off, you’ll strengthen your AI fluency and make smarter use of the technology.

Want to boost your AI fluency? Check out these Duke resources: 

References & Further Reading

  • Duke University. (2025). AI student survey. Conducted by Duke students Barron Brothers (T ’26) and Emma Ren (T ’27).


Special thanks to Brinnae Bent, Mary Osborne, and Aaron Welborn for reviewing the post!

One thought on “It’s 2026. Why Are LLMs Still Hallucinating?”

  • Nano Banana AI, January 5, 2026 at 7:01 pm: The tension you describe between students’ reliance on AI and their distrust of its accuracy feels spot-on. What strikes me most is how much these hallucinations stem from the way we test and reward models – benchmarks push them to guess confidently instead of acknowledging uncertainty. It seems like the real progress will come not just from smarter models, but from reshaping the incentives we use to measure them.
