Super frustrated with all the cheerleading over chatbots for search, so here's a thread of presentations of my work with Chirag Shah on why this is a bad idea. Follow threaded replies for:

op-ed
media coverage
original paper
conference presentation

Please boost whichever (if any) speak to you.

Chatbots are not a good replacement for search engines

https://iai.tv/articles/all-knowing-machines-are-a-fantasy-auid-2334

All-knowing machines are a fantasy | Emily M. Bender and Chirag Shah

The idea of an all-knowing computer program comes from science fiction and should stay there. Despite the seductive fluency of ChatGPT and other language models, they remain unsuitable as sources of knowledge. We must fight against the instinct to trust a human-sounding machine, argue Emily M. Bender & Chirag Shah.

IAI TV - Changing how the world thinks
Chatbots could one day replace search engines. Here’s why that’s a terrible idea.

Language models are mindless mimics that do not understand what they are saying—so why do we pretend they’re experts?

MIT Technology Review

Chatbots-as-search is an idea based on optimizing for convenience. But convenience is often at odds with what we need to be doing as we access and assess information.

https://www.washington.edu/news/2022/03/14/qa-preserving-context-and-user-intent-in-the-future-of-web-search/

Q&A: Preserving context and user intent in the future of web search

In a new perspective paper, University of Washington professors Emily M. Bender and Chirag Shah respond to proposals that reimagine web search as an application for large language model-driven...

UW News

Using chatbots/large language models for search was a bad idea when Google proposed it, and it's still a bad idea when it comes from Meta, OpenAI, or You.com.

https://dl.acm.org/doi/10.1145/3498366.3505816

Situating Search | Proceedings of the 2022 Conference on Human Information Interaction and Retrieval

ACM Conferences

Language models/automated BS generators only have information about word distributions. If they happen to create sentences that make sense, it's because we make sense of them. But disconnected "information" inhibits the broader project of sense-making. (A toy sketch below makes the first point concrete.)

https://www.youtube.com/watch?v=VY1GHbU_FYs&list=PLn0nrSd4xjjY3E1qxXpWDoF7q-Q3d6g_A&index=17

Situating Search

YouTube
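
To make the word-distributions point above concrete, here is a minimal toy sketch (my own illustration, not code from the paper): a bigram sampler whose entire "knowledge" is which word followed which in its tiny training text. Any sense its output makes is sense we read into it.

```python
# Toy bigram "language model": its entire knowledge is a table of
# word-following-word counts. No facts, no grounding, just distributions.
import random
from collections import defaultdict

corpus = (
    "search engines return documents . "
    "language models return plausible words . "
    "plausible words are not grounded answers ."
).split()

# Count which word follows which; this table is the whole "model".
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start="language", max_words=12):
    word, out = start, [start]
    for _ in range(max_words):
        if word not in follows:
            break
        word = random.choice(follows[word])  # sample next word from the distribution
        out.append(word)
    return " ".join(out)

# e.g. "language models return plausible words are not grounded answers ."
print(generate())
```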

We must not mistake a convenient plot device — a means to ensure that characters always have the information the writer needs them to have — for a roadmap to how technology could and should be created in the real world.

https://mindmatters.ai/2022/12/why-we-should-not-trust-chatbots-as-sources-of-information/

Why We Should Not Trust Chatbots As Sources of Information

On a deeper note, they say, the pursuit of absolutely certain Correct Information suffers from a fundamental flaw — it doesn’t exist.

Mind Matters

@emilymbender thanks for sharing your insight. An anecdote from my toying around with #chatGPT: I asked it to show me an example of a program written in an imaginary combination of the best from the programming languages #Python, #Julia, #golang, and #Rust.
It wrote me a nice piece of pseudo-code that made sense. Furthermore, it could explain to me which traits represented characteristics of each language. Although it's probably not, it gave me an impression of creativity.

@arildsen @emilymbender Well, the thing with chatbots like ChatGPT is that they are very good at exactly that: giving you an IMPRESSION that they are good at something.

But they will absolutely lie through their teeth to do it, and it will be believable lies.

@WAHa_06x36 @emilymbender that sounds like a good point, but are you actually lying if you don't KNOW that you are lying?

@arildsen @emilymbender It doesn't really matter; the end result is the same: you get fed believable bullshit, and you either come away from the interaction less informed than you were before, or you spend a long time combing through the result trying to carefully separate the truth from the fiction.

@WAHa_06x36 @emilymbender @arildsen The problem begins with the linguistically fuzzy insinuation of lying. A chatbot can only produce results in response to a prompt; it cannot lie, because it lacks reflection, morality, and any intention as a consciously acting agent. The result can seem like a lie to us because nonsense may be presented as if it were fact.

@cognisize @emilymbender @arildsen Not the point being discussed, though, is it? The question isn't whether it is moral for an AI to lie. It is that an AI will act in a manner indistinguishable from a human lying, which means it is less than useless, and actively harmful.

@WAHa_06x36 @arildsen @emilymbender This is kind of a category error, isn't it? As well argued here, language models are incapable of producing factual statements, correct or incorrect. They can only produce poetry.

Unfortunately, we lack the language and metaphors to talk about statistical text generators, and the human tendency to see peopleness everywhere doesn't help.

Language models can only write poetry

But only a person can write a poem

Allison Posts
@RAOF @WAHa_06x36 @arildsen You're referring to **Gwern**? They openly promote eugenics. Please stay out of my feed with any pointers to them.

@emilymbender @WAHa_06x36 @arildsen Urgh, sorry.

Thanks for the heads up. It's sad that some people have made AI a gateway to that cluster of terrible thinking.

That blog post only refers to a piece of Gwern's work in the opening couple of paragraphs, as framing.

The author doesn't seem to be in the SSC/Rationalist/scientific racism orbit. Maybe they don't know? (I'll try to contact them)

Thanks again for the heads up.

@RAOF @emilymbender @arildsen Those circles are absolutely packed with eugenicists and scientific racists; there's no way they will care.

@WAHa_06x36 there has to be someone in AI research who isn't marinated in longtermism 😬

@WAHa_06x36 @RAOF @arildsen AI is rife with it, true, but also lots of folks come across Gwern's stuff and cite it while being unaware of the rest, and do appreciate the heads up.

@emilymbender @RAOF @arildsen Oh, I slightly misread the comment I was responding to anyway. Entirely agreed.

@RAOF That is an entirely uninteresting distinction, isn't it? Language models speak to you like a person, and they act like a person who is lying. The fact that this isn't a conscious choice is irrelevant to the actual outcome.

@WAHa_06x36 I think it's quite an important distinction? It's fundamental to how you should interpret text generated by a language model.

If you paint two dots and a downward-facing semicircle on a rock, people immediately interpret the rock as being sad :(

But we all know rocks can't be sad.

Similarly, language models are a really complicated pattern painted on a rock. The text they generate isn't true or false statements; it's randomly generated truthy text. Many of the texts they generate will be interpreted as true statements, because lots of truthy strings are representations of true statements.

But saying GPT-3 lies suggests that you could make a language model that doesn't lie, or that isn't cavalier with the truth, and that's the wrong way to think about them.

Everyone knows rocks can't be sad; they don't know that language models can't tell the truth, but it's the same human cognitive failing that generates both.
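
To put the rock analogy in code terms, here is a hypothetical toy scorer (the bigram weights are made up, standing in for statistics a language model would learn from text). It rates strings by fluency alone, so a fluent falsehood scores exactly as well as a fluent truth: "true" is simply not a dimension the model has.

```python
# Hypothetical fluency scorer: made-up bigram weights stand in for
# the distributional statistics a real language model would learn.
NGRAM_SCORE = {
    ("the", "capital"): 0.9, ("capital", "of"): 0.9,
    ("of", "france"): 0.5, ("france", "is"): 0.8,
    ("is", "paris"): 0.7, ("is", "madrid"): 0.7,
}

def fluency(sentence):
    words = sentence.lower().split()
    # Sum bigram weights; unseen pairs get a small default.
    return sum(NGRAM_SCORE.get(p, 0.01) for p in zip(words, words[1:]))

print(fluency("The capital of France is Paris"))   # 3.8 -- true
print(fluency("The capital of France is Madrid"))  # 3.8 -- false, same score
```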

@WAHa_06x36 I guess a simpler, but incorrectly anthropomorphic, way of saying that is that language models don't lie, they bullshit.
@RAOF I definitely never say that "GPT-3 lies"; I say that language models lie. All of them, without exception.
@WAHa_06x36 @RAOF I do think it’s pretty important because “I am interacting with a person who lies to me and I may have to cajole the truth out of them” and “I am generating text with a model, but the model may generate things that are not true” leave me with very different conclusions as to how to interact with the model. For example, there’s no real point in trying to get the model to “slip up” like a suspect in a criminal investigation might. You can certainly shape the interaction like that, but then you’re just kind of hamstringing yourself.
@WAHa_06x36 @RAOF If it has to be anthropomorphic, the best I heard when I asked a while back was "It generates an answer that is very much like what a random person on the internet might answer". That is both true and useful: as whoever wrote it pointed out, it elicits about the right level of source criticism.