This is infuriating and also was totally predictable. Thank you @daveyalba for the reporting.

https://www.bloomberg.com/news/articles/2023-05-01/ai-chatbots-have-been-used-to-create-dozens-of-news-content-farms

A few reactions in thread:

AI Chatbots Have Been Used to Create Dozens of News Content Farms

A new report documents 49 new websites populated by AI tools like ChatGPT and posing as news outlets 

Bloomberg

There is actually no way to "train" an LLM, run as a text synthesis machine, to not generate fake news -- short of watermarking the output (and it's not yet clear whether this is possible).

>>

Google is being particularly squirrely here. It's like they want to leave open the possibility that synthetic text could EVER be a reasonable source of information.

Really bad look for a company ostensibly "organizing the world's information" -- surely that project is only hampered when the information ecosystem is polluted with unending supplies of synthetic text.

@emilymbender Great profile in NYT today by Cade Metz on Geoffrey Hinton, a lead AI developer at Google. It seems Mr. Hinton has doubts about his life’s work. Working on the problem since the 1970s, he now states that bad actors can easily be imagined using this technology for nefarious ends. Enjoy it, these are the good old days.
@Csosorchid @emilymbender Illustrates my thoughts about techies getting so absorbed in and besotted with the technology that they don't consider the negative consequences. This guy took 40 years to see what could go wrong.
@anne_twain @emilymbender Imagine how bad it must be. He worked in this area from its very start. A company he started was bought by Google, so he must have cashed in. Then he works for Google with one of his colleagues leading the project, and him retiring from some sort of corporate emeritus position.
That guy now thinks it possible his life’s work will lead to something very bad. That guy is smart, he is probably right.
@Csosorchid @emilymbender There are different kinds of smart. Some people have less ability to see the social consequences of things.
@emilymbender what an indirect and weird way for Crovitz to say that publishers should continue to rely on humans to produce news stories
@emilymbender I feel like we're going to end up where "we" (writers, creators, journalists, researchers) have to escrow or validate our work-in-progress because it's impossible to compel watermarking the generative stuff.
@GavinChait
YyyyyyyuuuuuuuuuP. As always it becomes a problem for the 'victim'.
@emilymbender

@GavinChait @emilymbender having a version history works very well, at least for technical writing.

Generating a plausible proof of work with an LLM would not be very difficult, though.

You would need an hours-long video of yourself sitting and writing. Generating a deepfake video (or even making one with clever editing) would not be too difficult, either...

The intersection of pulp writing, Mechanical Turk ghostwriters, and language models is not small.

Claiming authorship is not trivial.

@janvenetor @emilymbender Which is where escrow comes in. We'll need a whole new industry of trusted proof-of-work auditors.

@emilymbender
YyyyyyyuuuuuuuuuP. How does one "watermark" plain text?

Short answer: one doesn't.

@notroot @emilymbender I read about a scheme where the LLM deliberately uses unlikely words at regular intervals. The text is still “sensical” but a human would be unlikely to do that, so it serves as a watermark. (As long as you don’t edit the text.) I don’t know if this is actually a good scheme because I’m not an expert, but it seemed possible?
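[Editor's note: the scheme described above resembles published "green list" watermark proposals, in which the previous token seeds a pseudorandom split of the vocabulary and generation is biased toward one half; a detector then counts how many tokens land in their "green" half. A toy sketch of that idea, with an invented VOCAB and a uniform stand-in for a real model, not any vendor's actual implementation:]

```python
import hashlib
import random

# Tiny stand-in vocabulary; a real LLM has tens of thousands of tokens.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug",
         "quickly", "slowly", "big", "small", "red", "blue"]

def green_list(prev_token: str, fraction: float = 0.5) -> set:
    """Deterministically partition the vocabulary, seeded by the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = sorted(VOCAB)
    rng.shuffle(shuffled)
    return set(shuffled[:int(len(VOCAB) * fraction)])

def generate_watermarked(length: int, seed: int = 0) -> list:
    """Toy 'model': sample uniformly, but only from the current green list."""
    rng = random.Random(seed)
    tokens = ["the"]
    for _ in range(length):
        tokens.append(rng.choice(sorted(green_list(tokens[-1]))))
    return tokens

def green_fraction(tokens: list) -> float:
    """Detector: fraction of tokens that fall in their green list.

    Watermarked text scores near 1.0; human text should hover near 0.5,
    since each green list covers half the vocabulary.
    """
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:])
               if tok in green_list(prev))
    return hits / (len(tokens) - 1)
```

This also illustrates the thread's objection: paraphrasing or word-swapping the output reshuffles which tokens are "green" and washes the signal out.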

@paulmather007 @emilymbender I mean... I'm sure there's some grifters using unaltered chatbot output that could be detected that way, but the minute that's implemented, some Macedonian teen will write a script to strip the "se Socal" and we're back to undetectable plain text.

The problem is "how to watermark plain text", and there's just no solution to that. Period. At all. It can always be stripped.

@notroot @emilymbender Well, the scheme requires LLMs to, in good faith, implement the watermark. (Meant to say “sensical,” ironically autocorrect messed me up.) Yes, of course you could strip the watermark. But since people are already turning in essays without reading them that start with “as an AI, I can’t write an essay,” I suppose there’s a low bar and a watermark would be helpful in some circumstances.
@notroot @emilymbender I mean the thing about any kind of watermark is it can be stripped. They’re a speed bump.

@paulmather007 @emilymbender You're absolutely right IMO... a speed bump is better than nothing. And, yup... the Genie and the Lamp have forever parted company. Now that "we" know how to make LLMs, "we" will make them until "we're" bored.

We really do live in a state of existential anarchy. Human "laws" are just patterns emerging in the chaos, no more binding than a speed limit less than C.

@paulmather007 @emilymbender If I were a philosopher, I'd probably say "Hobbes Was Right". We'll go to any lengths to rationalize away the incontrovertible fact that we're all -- the Earth is -- insignificant specks in an unimaginably vast cosmos. We'll do anything to escape existential angst. Any social order that puts humanity at the center is better than facing the fact that *we have ALWAYS lived in anarchy*. That's why we make governments. We don't like it.

@emilymbender if I understand LLMs correctly (I am not an AI person), they can in theory be used as a steganographic channel. (This understanding hinges on them essentially being an elaborate expander from some entropy source.)

If that's the case, then it should also be possible to embed a watermark in the output.

What's questionable is whether the bitrate of the steganographic channel is high enough to embed enough bits in a typical-sized output to be meaningful.
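[Editor's note: the steganographic-channel idea above can be sketched concretely. If the encoder and decoder share the same deterministic top-k candidate list at each step, the choice among k candidates carries log2(k) bits per token. Everything below (VOCAB, the candidates function) is an invented toy standing in for a real model; it is an illustration of the channel, not a real watermarking system:]

```python
import hashlib
import random

# Toy vocabulary; a real model would supply its own top-k candidates.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug",
         "quickly", "slowly", "big", "small", "red", "blue"]

def candidates(prev_token: str, k: int = 4) -> list:
    """Stand-in for a model's top-k next-token list (deterministic toy)."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    return random.Random(seed).sample(VOCAB, k)

def embed(bits: str, k: int = 4) -> list:
    """Encode a bit string into token choices: log2(k) bits per token."""
    step = k.bit_length() - 1  # k must be a power of two
    tokens = ["the"]
    for i in range(0, len(bits), step):
        idx = int(bits[i:i + step].ljust(step, "0"), 2)
        tokens.append(candidates(tokens[-1], k)[idx])
    return tokens

def extract(tokens: list, k: int = 4) -> str:
    """Recover the bits by replaying the same candidate lists."""
    step = k.bit_length() - 1
    return "".join(format(candidates(prev, k).index(tok), f"0{step}b")
                   for prev, tok in zip(tokens, tokens[1:]))
```

At k=4 this is 2 bits per token, which is exactly the bitrate question raised above: a short email-sized output carries only a few hundred bits, and any paraphrase or edit desynchronizes the decoder.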