@oneloop Most ML looks for patterns in the data that map to a truth function (true/false), where the truth may be objective or based on expert opinion. A well-known example would be looking at patterns in X-rays to detect whether a fracture is or is not present. It still won't get the answer correct every time, but it can also sometimes detect patterns not obvious to us.
An LLM, however, has no truth function. It just looks to autocomplete a prompt you give it (which you may think is a question with a true/false answer) based on the frequency with which words follow each other in its training data.
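To illustrate that "autocomplete from frequency" point, here's a deliberately crude toy sketch with made-up data. It is a caricature (real LLMs use neural networks over tokens, not raw bigram counts), but it shows the basic idea of picking the most frequent continuation with no notion of truth:

```python
# Toy sketch (not any real LLM): predict the next word purely from
# how often words follow each other in a tiny made-up "corpus".
from collections import Counter, defaultdict

corpus = "the fracture is present . the fracture is not present . the bone is intact .".split()

# Count bigram frequencies: how often each word follows the previous one.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def autocomplete(prompt_word, length=4):
    """Greedily append the most frequent next word; no notion of truth involved."""
    out = [prompt_word]
    for _ in range(length):
        candidates = follows[out[-1]]
        if not candidates:
            break
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

print(autocomplete("the"))  # -> "the fracture is present ." (frequency, not truth)
```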
At best, the LLM trainers will get sweated labour in the South to label sources for accuracy, so it may give a higher probability to strings of words from a Wikipedia or Reddit article than to ones from The Onion or conspiracy theories on Facebook or X.
Add to this the important 'chat' feature, where it is programmed to respond in a way that fakes being in a conversation such as you would have with a human respondent.
And it never responds "I don't know". It always comes up with what looks like an answer, however poor the source or combination of sources may be.
> An LLM however has no truth function
> LLM trainers will get sweated labour in the South to label sources for accuracy
Ok, if you agree that they're training for accuracy, then they do have a "truth function".
You're mixing up technicalities of the technology with the impact of the technology.
> And it never responds "I don't know".
You're mixing up properties of the implementations that you've seen with properties of the technology. Furthermore, the premise that it never responds "I don't know" isn't even true. I'm afraid you're just repeating things that you've heard without taking a minute to consider whether they're true.
Here's ChatGPT saying it doesn't know