So OpenAI just released a detector of AI-generated text, I assume because of concerns in education / homework.

https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text/

Maybe this is good?

No, it's very bad.

They claim a 26% true-positive rate and a 9% false-positive rate. Assume 10% of submitted homework is ChatGPT-generated, and you get the classic counterintuitive outcome of poor predictive power: if a homework is flagged, the odds are about 3:1 that it's *human*-written.

This is going to cause a lot of harm. It should be immediately recalled.
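The 3:1 odds above can be checked with a quick Bayes calculation. The 26% and 9% rates come from OpenAI's announcement; the 10% base rate is the post's assumption:

```python
# Bayes check of the flagged-homework odds.
tpr = 0.26   # true positive rate: AI text correctly flagged (OpenAI's figure)
fpr = 0.09   # false positive rate: human text wrongly flagged (OpenAI's figure)
p_ai = 0.10  # assumed share of homework that is AI-generated

p_flag_and_ai = p_ai * tpr              # P(flagged and AI)    = 0.026
p_flag_and_human = (1 - p_ai) * fpr     # P(flagged and human) = 0.081
odds_human = p_flag_and_human / p_flag_and_ai

print(f"odds a flagged homework is human-written: {odds_human:.1f}:1")  # ≈ 3.1:1
```

So even under a fairly high assumed cheating rate, most flagged students are innocent.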

I should clarify that I find ChatGPT very interesting, and I'm not in the camp of those who dismiss it. I think it has great value and will continue to grow in value.

But where it gets very tricky is in the area of facts and truth. Not so good. And when plagiarism detection is the task, it's all facts and truth. And that's precisely this technology's Achilles' heel at the moment.

@ben prefer catGPT myself 😊
@ben No problem, really. Just convince more students to use "AI" generated text. Get that number up to 90%, problem solved!
@ben Exactly correct, and they simply will not care. It's not surprising. I mean... the real world utility functions for educational users of the technology have to factor in the costs of investigating a false positive, and the consequences of an incorrect penalty applied to an honest student. But that's different to the utility function for OpenAI. They don't actually have to solve the problem in the wild, but they get the PR value of *appearing* to care about responsible AI. Sigh.
@ben just AI correcting AI harm 🤣
@ben so this is the business model of AI? Throw molotovs around and then sell fire extinguishers?
@ben we can put out this petrol fire with more petrol.
@channingwalton @ben an AI detector of bad AI detectors should do the trick.

@ben I suspect there are good business reasons for OpenAI to make this. If they can reliably detect AI-generated text then they can exclude it from future training of GPT iterations. Otherwise future GPT iterations will become increasingly trained on AI-generated text.

Expect this early functionality to improve rapidly

@ben Whole new reason to explain the base rate fallacy?
I don't understand. If the detector is also available to students, what's stopping them from regenerating (or manually rewording) until the detector doesn't flag it anymore?

@ben to be fair, they do not recommend using it to make decisions.

Should they be forced to offer a solution for the problem they created? Yes, sure!

@ben Sometimes I like to imagine what our educational system would look like if it took all the energy it spends on policing students and trying to catch cheating and plagiarism and put it toward designing new methods of instruction and assessment that embraced technology rather than spurning it.
@MadMadMadMadRN @ben Nah, we should just freak out. It's what we do best. 😉

@ben Ok, I agree this is bad and such and the whole AI thing and its interaction with capitalism is eventually gonna be responsible for some serious shit.

... But can we talk about how fucking FUNNY it is that an AI corpo had to release an AI Detector to Detect AIs because their AIs are designed to pass as human and actually it turns out humans are better at distinguishing AI from humans than AI is?

@ben This could also be used to help train GPT to appear more human, as @Mer__edith pointed out.

Not that it was their main goal, but a completely random happy coincidence.

@ben 👆
Also a solid illustration of false positives and false negatives having wildly different costs.

@ben,

It goes on, and on, and on...

Extreme reactions on both ends of the spectrum that always leave unsuspecting innocents paying the price.

What happens to false positive victims?

SeeAlso, regarding the fundamental human problem:
https://people.well.com/user/doctorow/metacrap.htm#2.1

#Technology #Society


@ben Agreed.

This is precisely the sort of thing AI is bad at too. This is definitely a move on OpenAI's part to try to defuse public fear of their products for the sake of profit. It is a deeply unethical act.

@ben It's easy to tell ChatGPT from students. ChatGPT has faultless spelling and grammar, and a decidedly quirkless style.
@ben isn’t anything below 50% worse than just guessing?
@alexthepres it's not *as* bad, but it's not a lot better.
@alexthepres @ben They're using 5 categories ("very unlikely," "unlikely," "unclear if it is," "possibly" or "likely" AI).

@alexthepres For a binary yes/no function, you're generally correct. If your hit rate is worse than 50%, taking the inverse is, on average, better.

The difficulty is when false positives have a very large impact and you need to avoid those. In this case, false positives could mean serious consequences for a human writer, so limiting those is a reasonable compromise to make.

Now, whether or not it's reasonable to need an AI detector in the first place is an *entirely* different debate.
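The inversion point above can be illustrated with a toy sketch (the labels and predictions here are made up purely for demonstration): on a balanced yes/no task, a classifier with below-50% accuracy becomes an above-50% classifier if you simply negate its output.

```python
# Illustration: inverting a worse-than-chance binary classifier.
# Labels and predictions are hypothetical, chosen only for the example.
labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
preds  = [0, 1, 1, 0, 1, 0, 0, 1, 0, 1]  # a bad classifier: mostly wrong

def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, truth)) / len(truth)

inverted = [1 - p for p in preds]  # flip every 0/1 prediction

print(accuracy(preds, labels))     # 0.2 — worse than guessing
print(accuracy(inverted, labels))  # 0.8 — the flipped version is good
```

Note this trick only speaks to overall accuracy; it says nothing about the asymmetric cost of false positives, which is the real issue with a cheating detector.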

@ben The fact that people can't do stats feels like a bad reason to recall a product quite frankly. Don't calculators suffer from the same problem...

@ben This is a fundamental problem in AI research. We need to make sure dangerous, half-baked tools see daylight so folks can understand, critique, and improve them. We also need to prevent those tools from being deployed on real people before we understand them and can sufficiently mitigate the harms.

How in the world do we do that?

@ben can not trust these grifting freaks to honestly analyze or track the models they've spun up
(while generative large language models are interesting, OpenAI is an entirely malignant organization that should be stopped from monetizing any products)
@ben seems we can always come up with a worse idea to cover up the previous bad one

@ben Just to show how bad this is: the classifier isn't even sure the OpenAI blog post announcing it was written by a human.

I submitted the whole blog post, which was flagged as human, but the limitations section on its own is flagged as uncertain.

@ben It's even worse. Sooner or later there will be more companies, and they will compete, and you'll have to pay for the ones that can beat the checkers, so students with rich parents can cheat better. My bet? We'll have to abandon the concept of homework! I just see no way out.
@ben Subtitle: Our solution(j+1) to the novel problem(j) we created with our solution(j).