Apparently there is a study that confirms exactly what I've been saying this whole time, but with a twist.

Namely, that LLM hallucination is a fundamental problem that cannot be fixed no matter how advanced the models get. The best bet would be to make the model respond with something like "I don't know" when it doesn't have an answer. But apparently they train their models so that they are unable to say things like "I don't know the answer".

This is because they fear that, if the models admitted ignorance, people would find AI useless and stop using it.

I still firmly believe the problem is in the nature of the technology itself, not just the fact that they are trained to lie. These algorithms were designed to predict things, not to know things. They predict an output from an input. Give them a question and they will predict an answer that sounds like an average human answer to that question. If your question is an average question, you will likely get an accurate, average answer. But the more your question deviates from that average, the more likely the answer will also deviate from the accurate, average one.
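A toy way to see what I mean (purely hypothetical, nothing to do with how any real LLM is built): a predictor that has only ever seen a handful of question/answer pairs will answer common questions fine, and will still confidently produce *something* for a question it has never seen, by falling back on whatever is statistically most common.

```python
from collections import Counter, defaultdict

# Tiny "training set": one question is very common, one is rare,
# and some questions never appear at all.
training_pairs = [
    ("capital of france?", "paris"),
    ("capital of france?", "paris"),
    ("capital of france?", "paris"),
    ("capital of france?", "lyon"),  # one noisy label in the data
    ("capital of burkina faso?", "ouagadougou"),
]

answers = defaultdict(Counter)
for question, answer in training_pairs:
    answers[question][answer] += 1

def predict(question):
    """Return the most frequent answer seen for this question.
    For a question never seen in training, fall back to the
    globally most frequent answer: a plausible-sounding guess,
    delivered with exactly the same confidence."""
    if question in answers:
        return answers[question].most_common(1)[0][0]
    global_counts = Counter()
    for counts in answers.values():
        global_counts.update(counts)
    return global_counts.most_common(1)[0][0]

print(predict("capital of france?"))  # common question: "paris"
print(predict("capital of italy?"))   # unseen question: "paris", confidently wrong
```

The point being: the fallback never says "I don't know", it just emits the statistically safest-looking output.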

https://arxiv.org/abs/2509.04664
Yeah, this other article sums up exactly what I was saying.

"In a landmark study, OpenAI researchers reveal that large language models will always produce plausible but false outputs, even with perfect data, due to fundamental statistical and computational limits."

"The researchers demonstrated that hallucinations stemmed from statistical properties of language model training rather than implementation flaws. The study established that “the generative error rate is at least twice the IIV misclassification rate,” where IIV referred to “Is-It-Valid” and demonstrated mathematical lower bounds that prove AI systems will always make a certain percentage of mistakes, no matter how much the technology improves."
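If I'm reading that right, "IIV" is just a binary classification question ("is this output valid?"), and the quoted bound (my paraphrase, check the paper itself) is simply:

```latex
\text{generative error rate} \;\geq\; 2 \times \text{IIV misclassification rate}
```

So as long as a model can't perfectly classify valid vs. invalid outputs, some nonzero rate of generated mistakes is guaranteed.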

But of course, you read, "Market is already adapting". WHY!? Why does the market have to adapt to the biggest scam ever created!? If it's unreliable, STOP USING AI. PERIOD.

https://www.computerworld.com/article/4059383/openai-admits-ai-hallucinations-are-mathematically-inevitable-not-just-engineering-flaws.html
@enigmatico the worst part is looking under the hood
Cheerleaders: "(My) LLM is a truth engine"
Reality: The LLM is a rhetoric engine, trained on both Reddit and 4chan posts (yes... trained on 4chan!). Imagine trusting a bunch of tiny Ben Shapiros in your computer...
@enigmatico i find it very strange that confidence is almost never surfaced in any way. an LLM bullshitting because RL hammered in that it's better to say absolute nonsense than to admit ignorance is virtually impossible to tell apart from an LLM stating obvious, common truths
@enigmatico i wonder if it could be inferred from how likely certain tokens were to be picked. hallucinations often have the model providing a different answer with each new seed, while truthful answers are more or less stable
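That per-seed stability idea can be sketched like this: sample the same prompt several times with different seeds and use answer agreement as a rough confidence score (this is basically self-consistency sampling; `sample_answer` here is a stand-in, not any real model API).

```python
import random
from collections import Counter

def sample_answer(prompt, seed, answer_pool):
    """Stand-in for one seeded LLM sample: picks from a pool of
    candidate answers. A one-element pool models a stable, grounded
    answer; a large pool models a model that hallucinates a
    different answer on each seed."""
    rng = random.Random(seed)
    return rng.choice(answer_pool)

def agreement_score(prompt, answer_pool, n_samples=20):
    """Fraction of samples agreeing with the most common answer.
    Near 1.0 = stable across seeds; much lower = unstable, which
    per @halva's observation often correlates with hallucination."""
    samples = [sample_answer(prompt, seed, answer_pool)
               for seed in range(n_samples)]
    top_answer, top_count = Counter(samples).most_common(1)[0]
    return top_answer, top_count / len(samples)

print(agreement_score("2+2?", ["4"]))  # ('4', 1.0): fully stable
print(agreement_score("obscure trivia?", ["1842", "1911", "1790", "1866"]))
```

A real version would need a semantic comparison (two phrasings of the same fact should count as agreeing), which is where this gets hard.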
@halva I don't know exactly how the LLMs work but at their core, I think they use a very large stochastic matrix. Of course there is more around them and they are complicated machines, but yes. It's just pretty much that. A very big matrix with words and weights.

So yeah, the more common an input is, the more weight goes to the words that form a valid output. And the less common an input is, the less likely you are to get an accurate response. It's not a pure stochastic matrix, because that would behave more like a Markov chain, and there is way more to it than that, but the core of it is one.
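For anyone who hasn't seen that "big matrix" picture: here is the pure-Markov-chain version of it, a row-stochastic transition matrix over a toy vocabulary, where generating text is just repeatedly sampling the next word from the current word's row (real transformers condition on the whole context, not just the last word, so this is only the crudest core of the idea):

```python
import random

# Toy row-stochastic matrix: each row is a word, each entry is the
# probability of the next word. Every row must sum to 1.
vocab = ["the", "cat", "sat", "mat", "."]
transitions = {
    "the": {"cat": 0.6, "mat": 0.4},
    "cat": {"sat": 1.0},
    "sat": {"the": 0.5, ".": 0.5},
    "mat": {".": 1.0},
    ".":   {"the": 1.0},
}

# Sanity check: rows are probability distributions (row-stochastic).
for word, row in transitions.items():
    assert abs(sum(row.values()) - 1.0) < 1e-9

def generate(start, length, seed=0):
    """Walk the chain: at each step, sample the next word from the
    probability row of the current word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        row = transitions[out[-1]]
        words = list(row)
        weights = list(row.values())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)

print(generate("the", 6))  # e.g. "the cat sat . the mat ."
```

Heavily weighted transitions ("cat" → "sat") always come out right; the output only gets erratic where the probability mass is spread thin, which is the same intuition as common vs. uncommon questions.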
@enigmatico
me when i make my project worse because im afraid people will stop using it