Please, please do not ever respond to a post with an answer like “ChatGPT says…” or “Claude told me…”. It is very rude.

It is wrong. These tools can’t answer questions; they mash words around and will make up nonsense. When the machine does it, it’s just gibberish, but by posting it, you’re turning it into a lie, and anyone who posts or repeats it without attribution turns it into disinformation.

It wastes time. Now everyone has to fact-check you instead of researching the question.

If you want, you can ask an LLM and then do your *own* fact-checking of its answer, using clues you find in its output. This can be helpful: sometimes the phrasing you think of for a search won’t be the right magic words, and an LLM can help you find them. What these tools are doing is more like “automatic free association” than “answering questions”. They can be a useful *tool* for answering questions, but the way you use them is critically important.
Using an LLM to discover search terms or get inspiration for research strategy which you can *independently verify* is like using a frying pan to cook a delicious frittata of facts. Posting the answers that it gives you as useful information that is true is like heating up the frying pan and then sticking your tongue directly on the pan in an attempt to eat it. The LLM output is the heat, not the food. Do not eat it.
@glyph I think "automatic free association" is one of the best descriptions of LLMs I've read. Thanks for that.
@kingrat thanks, this is a phrase I’d love to popularize :-). The difference between “brainstorming” and “bullshit” is largely a matter of context and intent rather than content. More than half of the problems with these things are dishonest marketing and the attendant incorrect user expectations of what the tools are doing.
@glyph somehow we've discovered the only response more annoying than "let me google that for you"

@ldmoray LMGTFY is at least understandable when people are clogging up a support channel with requests for easily-discovered information that more or less proves they’re just asking volunteers to do their homework for them. Still rude, still a bad idea, and it’s definitely metastasized into being an obnoxious quip from people who have no idea what the results are or whether they answer the question. But in some cases it’s an understandable frustration.

“I asked an LLM” is always wrong.

@glyph @wwahammy Hey now, no need to take it out on people named Claude. They can answer questions from time to time 😅
@philip @glyph sounds like something a Claude would say! 😂
@glyph ChatGPT agrees with your point here.
@glyph
I prefer to know a response comes from a language model. So let people say that.

@glyph
the way i think of LLMs is as pattern translation programs. they might be able to see patterns and do a very impressive job at it.

but this doesn't mean that they can know things, only change one thing into another

@glyph honestly, I'd prefer people do that and tag their nonsense up front.
@wwahammy it had not even occurred to me that people would do this and lie about it too, but I guess if those are the choices, sure.
@glyph he doesn't deserve this
@glyph "chatgpt said" has about as much weight to me as "some random dude in my city said"

@glyph I agree that it's rude and bad to do this, but GPT-4 has a high enough hit rate IME that this part seems like a stretch:
> These tools can’t answer questions; they mash words around, and will make up nonsense.

They definitely can answer questions. With RLHF, that is specifically what they're designed/trained to do, and they're pretty good at it in many domains. But, posting the answer without checking it is, as you say, either lying or bullshit.

@objectObject also, while the marketing claims it’s more factual and reliable, the academic literature does not seem to bear that out, as far as I’ve seen. For a recent example, https://arxiv.org/pdf/2307.09009.pdf

I’ve seen it do okay at *parsing* tasks, where it’s only responsible for interpreting input rather than producing output. Still not 100% reliable, but if you can check its work, it doesn’t seem too bad. A “calculator for words”, if you can structure your expectations appropriately.

@glyph I'm not claiming it's highly accurate. But it is a tool that is specifically trained for question answering, and IME works well for many domains (though you should only use it if you can check the answer). Using them effectively is a matter of trading off time spent checking vs time spent searching/synthesizing.

The claims that LLMs "just" autocomplete text ignore most of the work that went into productionizing them.

@glyph for that paper specifically, my guess is that OpenAI is optimizing against several objectives (speeding up inference and reducing offensive output) and that accuracy has suffered as a result.
@glyph another possibility that matches my experience is that its output is becoming more hesitant and uncertain over time, leading to fewer answers (correct or incorrect).