With #Galactica and #ChatGPT I'm seeing people again getting excited about the prospect of using language models to "access knowledge" (i.e. instead of search engines). They are not fit for that purpose --- both because they are designed to just make shit up and because they don't support information literacy. Chirag Shah and I lay this out in detail in our CHIIR 2022 paper:

https://dl.acm.org/doi/10.1145/3498366.3505816


Situating Search | Proceedings of the 2022 Conference on Human Information Interaction and Retrieval

@emilymbender Discovering that a chatbot's output is inconsistent with its sources is often trivial. I've had fun asking GPT-3 questions about the rules of Scrabble, especially around how many tiles there are for each letter, and whether you can spell out certain words with and without blank tiles. The answers are internally inconsistent, don't correspond to the actual Scrabble rules, and don't even correspond to the source material that GPT-3 pointed me to when I asked it how it "knew".
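That kind of check is easy to automate, since the official English Scrabble tile distribution is fixed (100 tiles, including 2 blanks): any counts a model claims can be compared letter-by-letter against the real set. A minimal sketch (the "claimed" counts below are illustrative, not actual GPT-3 output):

```python
# Official English Scrabble tile distribution: 100 tiles, including 2 blanks.
OFFICIAL_TILES = {
    "A": 9, "B": 2, "C": 2, "D": 4, "E": 12, "F": 2, "G": 3, "H": 2,
    "I": 9, "J": 1, "K": 1, "L": 4, "M": 2, "N": 6, "O": 8, "P": 2,
    "Q": 1, "R": 6, "S": 4, "T": 6, "U": 4, "V": 2, "W": 2, "X": 1,
    "Y": 2, "Z": 1, "blank": 2,
}

def check_tile_claims(claims: dict) -> dict:
    """Compare claimed tile counts against the official distribution.

    Returns {letter: (claimed, official)} for every mismatch;
    an empty dict means the claims match the real tile set.
    """
    return {
        letter: (claims.get(letter), official)
        for letter, official in OFFICIAL_TILES.items()
        if claims.get(letter) != official
    }

# Made-up "model" claims for illustration: E and Q counts are wrong.
claimed = dict(OFFICIAL_TILES, E=11, Q=2)
print(check_tile_claims(claimed))  # {'E': (11, 12), 'Q': (2, 1)}
```

The point isn't this particular game: any domain with a small, fixed ground truth makes it cheap to show that generated answers drift from the facts.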
@markproxy @emilymbender So there isn't any setup for direct citation of sources in these models, but you can ask it what its sources are… though it's possibly making things up there, too?!
@mlncn @emilymbender It's not fabricating sources entirely, but the content of the sources I've seen often doesn't match GPT-3's so-called knowledge.
@markproxy @emilymbender Thanks! This more clearly asked the question in my head, and the answer—the model is not designed to retain the source material—is both illuminating and worrying https://fosstodon.org/@miklo/109462959890443389
@[email protected] Is this #ChatGPT able to return, as part of its reply, links to the sources on which it based its reply?