At this point it will surprise no one, but I asked #ChatGPT to define bullshit and to cite its sources.

It provided definitions from the Cambridge English Dictionary and the Merriam-Webster Dictionary.

The definitions it provided were entirely reasonable, but they were decidedly not from the sources it claimed.

This highlights the fact that ChatGPT and other LLMs are not knowledge models; they are engines trained to produce convincing bullshit.

Below: screenshots of ChatGPT's response, the Cambridge English Dictionary entry, and the Merriam-Webster entry.

@goondocks:
@ct_bergstrom It would be interesting to see what it could do if you could also feed it the sources. For example, if you loaded it with all the textbooks and other sources used in a classroom, would that improve its abilities when you ask about those subjects? Or does it require larger datasets for its sources to better parse the questions asked?
@encthenet:
@goondocks @ct_bergstrom Are you sure they didn't already do that? I mean, it's confident enough to think that it's read those dictionaries, yet clearly "didn't" since it misquoted them.
@goondocks:
@encthenet @ct_bergstrom I meant providing it with a more limited scope to work with. Obviously the fake citation is concerning, but I'm trying to think of some good practical uses. For example, imagine if I could feed a system JUST the texts for my economics class and then ask it questions about economics. Would I get "better" results (as in more closely aligned with what is discussed and taught in those books) when I asked it questions? Right now it's drawing on a very broad data source.
@encthenet:
@goondocks @ct_bergstrom Ahh, that makes better sense. The problem, from my understanding, is that it's unlikely to be able to "understand" the language used as well without the context from the rest of its training data. That is, economics texts alone probably aren't enough context.
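What @goondocks describes is essentially retrieval-augmented generation: retrieve passages from a fixed corpus and require the model to answer from them, citing source IDs that can be verified afterward. Below is a minimal sketch of that idea in Python; the corpus passages, source IDs, and the downstream "send to the model" step are illustrative assumptions, not anything the thread's participants built.

```python
# Minimal sketch of grounding a model in a fixed corpus (retrieval-augmented
# generation). The corpus passages and source IDs below are hypothetical.
import math
from collections import Counter

# Stand-in for "JUST the texts for my economics class."
corpus = {
    "econ_text_ch1": "Economics studies how society manages its scarce resources.",
    "econ_text_ch2": "Opportunity cost is whatever you give up to obtain an item.",
    "econ_text_ch4": "Supply and demand determine price and quantity in a market.",
}

def tokenize(text: str) -> Counter:
    # Bag-of-words representation; real systems would use learned embeddings.
    return Counter(word.strip(".,?!").lower() for word in text.split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, k: int = 2) -> list[tuple[str, str]]:
    # Rank corpus passages by similarity to the question and keep the top k.
    q = tokenize(question)
    ranked = sorted(corpus.items(), key=lambda kv: cosine(q, tokenize(kv[1])), reverse=True)
    return ranked[:k]

def grounded_prompt(question: str) -> str:
    # Prepend the retrieved passages so the model answers from them and
    # cites source IDs that can be checked against the corpus afterward.
    passages = "\n".join(f"[{src}] {text}" for src, text in retrieve(question))
    return (
        "Answer using ONLY the passages below, citing the bracketed IDs.\n"
        f"{passages}\n\nQuestion: {question}"
    )

print(grounded_prompt("What is opportunity cost?"))
# The resulting prompt would then be sent to the model; because every
# citable ID maps to a real passage, fabricated citations become checkable.
```

As @encthenet notes, this doesn't make the model "understand" economics; it only constrains what the model can plausibly cite, which is what makes fabricated citations like the dictionary ones detectable.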