The constant mental vigilance required in a generative world is exhausting.

"I asked Claude to do $thing and it did this!"

No it didn't. No you didn't. Probably none of that happened.

And somehow, the annoying and unnecessary thing is the unwillingness to admit the model is just making stuff up, not the damn model itself.

@mttaggart

I can't get people to understand that the "hallucination" problem is unsolvable, because "hallucination" is how it works. That's all it does: generate next tokens based on the whole previous series of tokens, the ones representing "the conversation" between prompts and responses, combined with the hidden prompts that give the thing its flavor. Whether the output is "right" isn't part of the process. That's why they never say "I don't know." They don't know anything. They are literally making it up every single time.

It's also why they are so expensive and why they are ruining the environment. There is no recall, no memory, no "knowing". As I've seen it said elsewhere, "there is no 'there' there." It's worse than the Chinese Room thought experiment, because at least the Chinese Room produces correct responses; this only creates the illusion of a correct response.

We are killing the earth and building an inescapable surveillance state around technology that will never get any better than it is right now.
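As an aside, the next-token loop being described here is easy to sketch concretely. This is a minimal, illustrative version only, assuming the Hugging Face transformers library with GPT-2 as a stand-in model; the point is that nothing in the loop ever evaluates whether the output is true.

```python
# Illustrative sketch of the bare autoregressive loop.
# Assumes the Hugging Face transformers library; GPT-2 is a stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tokenizer.encode("The capital of Australia is", return_tensors="pt")
with torch.no_grad():
    for _ in range(10):
        logits = model(ids).logits[0, -1]       # scores for every candidate next token
        probs = torch.softmax(logits, dim=-1)   # scores -> probability distribution
        next_id = torch.multinomial(probs, 1)   # sample one token; no truth check anywhere
        ids = torch.cat([ids, next_id.unsqueeze(0)], dim=1)

print(tokenizer.decode(ids[0]))
```

Whether the sampled continuation happens to be true is incidental to the loop; "rightness" never enters it.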

@jrdepriest @mttaggart Y'all really think it's *that* different from how humans work? People make overconfident statements about shit they don't really have 'knowledge' of all the time (see this toot/this thread/social media et al.). Sure, it's nOn-DetERMinIsTic; so are we. That's kind of the point.

Doesn't mean it's not extremely useful in certain contexts.

@zero_gravitas @jrdepriest You're arguing a point about utility I did not make. There's no doubt generative text can be "useful" in some contexts.

But the process matters here.

If you think the model's generation process is indistinguishable from human thought, we won't agree, and that's that; I think it's quite different from how humans work. But if you do see a distinction, that distinction is material when users impute the properties of one to the other.

Does the model evaluate the truth of a statement before it returns the output? Can it? If the answer is no, we have a divergence.

@mttaggart @jrdepriest Fair. For the record, I don't think it's indistinguishable from human thought. People seem more focused, though, on how it diverges from programmatic input/output, which I think is the wrong framing altogether. I agree that it's a bad idea to anthropomorphize, though that's certainly not a problem limited to how people talk about large language models.