Earlier I posted about using ChatGPT's propensity to fabricate citations out of whole cloth as a short-term strategy for detecting journal submissions and classroom assignments that had been written by machine.

I've been playing with the system for the last couple of hours, and as best I can tell, ChatGPT now does a much better job than it did when first released at citing only papers that actually exist.

The citations aren't perfect (DOIs can be wrong, and some references are still fabricated outright), but most now point to real papers.
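For anyone who wants to automate this kind of check, here's a minimal sketch, assuming Python with the requests library, of screening a reference list against the public Crossref REST API, which returns 404 for DOIs it has no record of. The function name and the example DOIs below are mine, purely illustrative, not anything from the thread.

```python
import requests

CROSSREF = "https://api.crossref.org/works/"

def doi_exists(doi: str, timeout: float = 10.0) -> bool:
    """Return True if Crossref has a record for this DOI.

    A 200 response only rules out outright fabrication: a DOI can
    exist and still be attached to the wrong paper.
    """
    resp = requests.get(CROSSREF + doi, timeout=timeout)
    return resp.status_code == 200

# Illustrative only: replace with DOIs extracted from the suspect text.
candidates = ["10.1038/s41586-020-2649-2", "10.9999/not.a.real.doi"]
for doi in candidates:
    status = "resolves" if doi_exists(doi) else "NOT FOUND (possibly fabricated)"
    print(doi, status)
```

Of course this catches only the nonexistent-DOI case, not the subtler failure where a real DOI is attached to the wrong claim.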

If I'm not just imagining things, this raises an interesting question.

While this constitutes an "improvement" in the technology in some manner of speaking, it's unclear whether it is a desirable development. It strikes me as a step that makes detection more difficult without conferring any significant epistemic improvement.

In other words, if the system has really been adjusted to avoid citing fake papers, this constitutes a deliberate choice to create more persuasive bullshit.

Also, to be clear, the system is most definitely bullshitting, if not outright lying.

Take, for example, this methods section ChatGPT just generated in response to my prompt "Write a scientific review paper with references about whether sunscreen ingredients are carcinogenic."

Unlike many of the factual claims in the background material, this one is straight-up false.

The system decidedly DID NOT conduct a comprehensive literature search of those specific sources using those specific terms.

What I'm seeing here is bullshit in the precise technical sense: content generated to be persuasive, produced without regard for truth or logical coherence.

Moreover, this bullshit generation is clearly aided and abetted by ongoing development at OpenAI.

In general I've been pretty pleased with OpenAI's approach compared to e.g. what we saw with Galactica, but this is disturbing.

If they are making piecemeal changes to features such as citation behavior, designed to help the system cover its "tells", then documenting these changes publicly, as @emilymbender has requested, seems like the bare minimum as far as social responsibility is concerned.

@ct_bergstrom @emilymbender

Stuff like this really makes the "hallucination" description seem flawed. It's more like we're talking about a corrupt bureaucracy that's simply papering over evidence of transgressions rather than addressing the core issue

(that AIs have no intrinsic tether to reality the way humans do, so nothing anchors them to what's *real* any more than to, say, complete and utter bullshit)

@ali @ct_bergstrom @emilymbender "Corrupt bureaucracy motivated only by the forms looking right" is a really excellent thought tool for working with opaque models trained on obscure-to-us data.

"What is the least Real Work such a model would have to do to present the appropriate papers? Assume the bureaucrat will do that."

@ali @ct_bergstrom @emilymbender

The CYA games that OpenAI is playing, like penalizing imaginary citations and recognizing transparently racist leading questions, are really just attempts to nudge the Path of Least Work away from the most obvious shortcuts.

But it will almost always be easier to bullshit, or, in the context of this analogy, to make a minor error on internal paperwork that lets the bureaucrat shortcut doing real work.

@ali @ct_bergstrom @emilymbender

It's a little weird to see the Chinese Room reappear, but with a different question entirely -- we don't care if the bureaucracy "is thinking"; we care if it's "serving the people".

Functionaries who follow all the rules with the least possible effort _might_ be "serving the people", but a corrupt bureaucracy, like a bullshitter, is optimized towards "least effort" instead.