Apparently there is a study that confirms exactly what I've been saying this whole time, but with a twist.
That LLM hallucination is a fundamental problem and cannot be fixed no matter how advanced the models get. Their best bet would be to make the model respond with something like "I don't know" when it doesn't have an answer. But apparently they train their models so that they are unable to say things like "I don't know the answer".
This is because they fear that if the models admitted ignorance, people would find AI useless and stop using it.
I still firmly believe the problem lies in the nature of the technology itself, not in the fact that the models are trained to lie. These algorithms were designed to predict things, not to know things. They predict an output based on an input: give them a question and they will predict an answer that sounds like an average human answer to that question. If your question is an average question, you will likely get an accurate, average answer. But the more your question deviates from that average, the more likely the answer will also deviate from the average, accurate answer.
https://arxiv.org/abs/2509.04664
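The "predict, don't know" point above can be sketched with a toy frequency-based predictor. This is a deliberately crude illustration, not how an LLM actually works: the questions, answers, and the `predict` function are all made up for this example. The key property it shares with a likelihood-trained model is that it always emits its most probable answer and has no native "I don't know" output, even for questions it has never seen.

```python
from collections import Counter, defaultdict

# Hypothetical training "corpus" of (question, answer) pairs.
corpus = [
    ("capital of France?", "Paris"),
    ("capital of France?", "Paris"),
    ("capital of France?", "Lyon"),   # a noisy training example
    ("2+2?", "4"),
]

# Count answer frequencies per question.
counts = defaultdict(Counter)
for question, answer in corpus:
    counts[question][answer] += 1

# Global fallback: overall answer frequencies (a crude "prior").
prior = Counter(answer for _, answer in corpus)

def predict(question):
    # A pure predictor has no abstain option: it always returns the
    # highest-frequency answer it has seen, falling back to the global
    # prior for questions it knows nothing about.
    dist = counts.get(question) or prior
    return dist.most_common(1)[0][0]

print(predict("capital of France?"))  # in-distribution: the "average" answer
print(predict("capital of Mars?"))    # never seen: still answers confidently
```

For the familiar question the predictor returns the majority answer, which happens to be accurate; for the out-of-distribution question it confidently returns *something* anyway, because "return the most likely output" is all it can do. Getting an "I don't know" out of such a system requires explicitly training it as an output, which is exactly the point the study makes.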


