OpenAI's "was this text written by an AI" classifier is going to cause so many problems https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text/
New AI classifier for indicating AI-written text

We’re launching a classifier trained to distinguish between AI-written and human-written text.

The problem isn't the false negative rate -- saying "this text wasn't written by an AI" when it was -- but the false positive rate -- saying "this text was written by an AI" when it wasn't. And it has a *9%* false positive rate on the claim "this was likely written by an AI"
The chance that someone integrates this into a plagiarism checker and ends up failing students, rejecting job applicants, or discrediting reports is high. Anywhere a false claim of plagiarism is hard to disprove but carries severely negative consequences for the accused, this is going to cause a lot of problems before it's worked out
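A back-of-the-envelope sketch of why that 9% matters so much at scale. The 26% true positive and 9% false positive rates are the figures from OpenAI's announcement; the fraction of submissions that are actually AI-written is a made-up assumption for illustration:

```python
# How often does a "likely AI-written" flag point at an innocent author?
tpr = 0.26   # AI-written text correctly flagged (from OpenAI's announcement)
fpr = 0.09   # human-written text wrongly flagged (from OpenAI's announcement)
p_ai = 0.10  # ASSUMED fraction of submissions actually AI-written

p_flagged = tpr * p_ai + fpr * (1 - p_ai)
p_innocent_given_flag = fpr * (1 - p_ai) / p_flagged

print(f"P(flagged) = {p_flagged:.3f}")
print(f"P(human-written | flagged) = {p_innocent_given_flag:.2%}")
```

Under these assumptions roughly three out of four flagged authors wrote their text themselves, because honest authors vastly outnumber cheaters in the pool being checked. The lower the real prevalence of AI-written submissions, the worse this ratio gets.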

The problem is that the tool fails the fundamental rule of "should I use AI to solve this problem".

AI is safe to use *only* if you meet at least one of the following criteria:
1) The output doesn't matter (e.g. writing a story for your kids is safe)
2) Someone qualified assesses the veracity of the output prior to using its decision in a way that could cause harm.

The problem is this tool will *always* fail these two tests (so long as its false positive rate is > 0%).

There are virtually no ways to use the tool that are objectively assessable: there are no experts who can verify its output without external information that would invalidate the point of the tool. And virtually every use case for knowing whether text was written by an AI involves making a judgement about the author, where a wrong judgement is harmful

It's worse than existing plagiarism tools, which say "this text is probably plagiarized from [this other text]", because there you can go and look at the other text and see if there's context that's missing. Like, maybe you're "plagiarizing" a properly-attributed quote, or the author of the other text is also you.

But here it's just "expensive magic computer brain says this student is a fraud", and administrators are going to assume it's true, and have few ways to independently validate it

The power of AI is its ability to *enhance* humans to make decisions, process information, and automate time-consuming or mentally taxing but generally pretty low-stakes tasks (like drafting an email).

But when you step past that into delegating life-changing decisions to an AI, either explicitly (loan company says no) or implicitly (magic box says 12% yes and human operator develops rule of thumb to press "go" if number > 10), you're going to run into a world of Kafkaesque garbage really quickly.

@Pwnallthethings @mmasnick Nobody asked, but after 35 years of doing this shit, all I can say is there is no such thing as AI unless the A stands for augmented, or perhaps automated. But Artificial? No. Everything is natural, if it happens here.

@Pwnallthethings

A personal view (à la early BBC2 series):

I have noticed a tendency of people to use naïve algorithms (algorithms that do not take into account the specifics of a problem) simply because it is ‘cool’ to use anything associated with the phrase ‘AI’.

This tendency has done great damage, in particular, to the graphical interfaces of Linux, BSDs, etc. (And it is already many years since I gave up trying to influence people away from this mistake.)

@Pwnallthethings

It is not that Microsoft and Apple platforms always do the right thing, or that ‘better written’ software on Linux, BSDs, etc., do not do the right thing.

My BIGGEST problem with it is that the ‘coolness’ of the ‘AI’ has MADE THE PEOPLE USING IT STUPID.

For ‘AI’ is what people tend to fall back upon, when they simply do not wish the bother of understanding the problem that is there—that EXISTS—that IS—to be solved.

And there’s perhaps the root of the problem.

@Pwnallthethings It’s not so much that an ‘AI’ generated object looks ‘plausible’—as that we have too many people who are more interested in the generation of ‘plausible’ things than in the generation of ‘correct’ things.

‘AI’ is something that SHOULD exist, but which SHOULD be used only when one CANNOT understand the problem, or where the problem is so complicated as to defy practical solution except by ‘naïve’ means.

This is what is distinctly NOT the case with the graphical interfaces I mention.

@Pwnallthethings Postscript: The ‘graphical interface’ problem I refer to is a little software library called ‘fontconfig’. It uses very simple ‘AI’ to try to find a requested font, and can easily be made to completely fail at this simple job.
@Pwnallthethings Was going to add a thought here, but I'll just leave it to Carol Beer.
@Pwnallthethings I’m more worried about ‘ai says pull trigger’

@sleepy @Pwnallthethings This is my concern, as well. And the moral misapprehension that because AI pulled the trigger (or said to), it is somehow culpable, rather than the humans who designed, trained, marketed, purchased, or deployed it (or, in your example, listened to it).

No machine makes a decision. A person (or people) makes the decision, every time; but these technologies allow them to obfuscate their role.

@Pwnallthethings if corporations / bureaucracies do this sort of thing already and are a form of analog AI, does that mean we are already in that world of Kafkaesque garbage and heading into an exponentially worse version of it?
Rights related to automated decision making including profiling

The GDPR has provisions on: automated individual decision-making (making a decision solely by automated means without any human involvement); and profiling (automated processing of personal data to evaluate certain things about an individual). Profiling can be part of an automated decision-making process.

@Pwnallthethings Don’t you think AI has the potential to be less biased than humans? The best part is that it’s easily testable too. Throw a bunch of test cases at it -- in your loan example, compare the approval rates for black vs white applicants, or even remove race from the system entirely. Humans have proven they’re extremely biased; just look at the US history of redlining. As long as there is an audit trail and we have control of it, I don’t see an issue.
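For what it's worth, the audit this post describes can be sketched in a few lines. Everything here is hypothetical: `approve` is a stand-in model, not any real lender's system, and the test cases are made up:

```python
# Minimal disparate-impact check: run matched test cases through a
# (hypothetical) loan model and compare approval rates across groups.
from collections import defaultdict

def approve(applicant):
    # Stand-in model: approves on income alone, ignores group entirely.
    return applicant["income"] >= 40_000

test_cases = [
    {"group": "A", "income": 35_000},
    {"group": "A", "income": 50_000},
    {"group": "B", "income": 35_000},
    {"group": "B", "income": 50_000},
]

counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
for case in test_cases:
    counts[case["group"]][1] += 1
    counts[case["group"]][0] += approve(case)

rates = {g: approved / total for g, (approved, total) in counts.items()}
print(rates)  # equal rates here, since the stand-in ignores group
```

As the replies below point out, this only catches first-order disparities: a model trained on biased decisions can discriminate through proxy features (zip code, employer, etc.) even when the group field is removed from the inputs.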
@BoredPeter @Pwnallthethings AI trained on a corpus of human decisions will amplify the bias in those decisions. This is a Hard Problem.
@BoredPeter @Pwnallthethings
No. AI is a creation of humans and will amplify their biases. Read Cathy O'Neil's *Weapons of Math Destruction* for more on this
@Pwnallthethings I'm not sure there is much difference between an AI making something like a loan approval decision as opposed to a human operating within the constraints and incentive structure of a real life lending company. If anything, examining the AI for bad behavior and trying to fix it is probably easier.
@Pwnallthethings Definitely. I think this is a problem already, with or without ML: if someone can make life-changing decisions about you without clear rules or accountability, that is abusive power.
Plus the problem of capitalism: the race to the bottom when it comes to "optimizing costs".
@Pwnallthethings For me it didn't take AI to create a sense of Kafkaesque garbage. Our bizarro members of the Grand Old Prevaricators managed that quite nicely.
ChatGPT detection tool says Macbeth was generated by AI. What happens now?

It turns out that when it comes to detecting generative AI — including ChatGPT — there may be no quick fix.

VentureBeat

@Pwnallthethings
The frustrating thing is, this is not new

There is a 2002 article on this by Sidney Dekker
https://hachyderm.io/@mononcqc/109804346556988230

Reminds me of the saying
The problem with common sense is that it is not so common 😡

Fred Hebert (@[email protected])

This week I decided to revisit Sidney Dekker's #paper titled "MABA-MABA or Abracadabra? Progress on Human–Automation Co-ordination", which discusses something called "the substitution myth", a misguided attempt at replacing human weaknesses with automation. Instead, the suggestion is to focus on cooperation and team work, rather than substitution: https://www.researchgate.net/publication/226605532_MABA-MABA_or_abracadabra_Progress_on_human-automation_co-ordination My notes are at: https://cohost.org/mononcqc/post/960352-paper-maba-maba-or #LearningFromIncidents #HumanFactors

Hachyderm.io

@Pwnallthethings
Or even older.
Allegedly from a 1979 IBM training slide.

(I couldn't verify the origin; the closest I got was "The computer as an advisor, not a decision-maker" by IBM Fellow John Cohn https://www.ibm.com/blogs/think/be-en/2013/11/25/the-computer-as-an-advisor-not-a-decision-maker-the-vision-of-ibm-fellow-john-cohn/ )

The computer as an advisor, not a decision-maker – the vision of IBM Fellow John Cohn - Belgium

It was over 50 years ago that Thomas Watson Jr. launched the IBM Fellows program, the highest honor that can be awarded to an IBM technical staffer. The accolades are handed out for exceptional contributions to scientific, technical or social initiatives that have helped create a smarter planet. IBM Fellows have been behind some of […]

Belgium

@Pwnallthethings

From past experience:

Administrators are going to ignore reports of cheating if the perps are "good" students, but are going to dismiss exculpatory evidence from "bad" students.

"Bad", in the context of academic administrative bias, often means "poor" or "weird." Academia is harsh to students who don't look the part.

@Pwnallthethings The idea that an AI must be able to justify its decisions in court - eg if refusing someone a financial transaction - is already floating around. The technology might not exist yet, but there are certainly people working on it.
@Pwnallthethings There's one legit use case I know of. A user-generated content platform wants readers to know if they're reading generated content, so they ask accounts posting generated content to flag it and automatically flag accounts that don't. Testing multiple items solves the false positive issue, and it's pretty low-stakes anyway.
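The "test multiple items" idea can be made concrete with a binomial sketch. The 9% per-item false positive rate is from OpenAI's announcement; the item count and threshold are made-up assumptions, and the big caveat is the independence assumption at the end:

```python
# If an account is only flagged when several items trip the classifier,
# the per-account false positive rate drops fast (assuming independence).
from math import comb

def p_account_flagged(p_item, n_items, threshold):
    """P(at least `threshold` of `n_items` independent items flag)."""
    return sum(
        comb(n_items, k) * p_item**k * (1 - p_item)**(n_items - k)
        for k in range(threshold, n_items + 1)
    )

# Innocent account posting 10 human-written items, flagged only when
# 5 or more items trip the 9%-false-positive classifier:
print(f"{p_account_flagged(0.09, 10, 5):.5f}")
```

Under these assumptions the per-account false positive rate falls from 9% to roughly a tenth of a percent. In practice, items from one author aren't independent (same style, same topics), so the real rate would be higher than this sketch suggests.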
@Pwnallthethings From a Software Engineering perspective, I keep thinking using AI would be like hiring on a Jr Developer who doesn't care one bit what you tell them in the performance reviews.
@mquirion this is painfully familiar.
@mquirion @Pwnallthethings basically, like hiring the child of one of the owners, having them report to someone with lesser authority than the owner, and expecting everything to be "perfectly fine".

@mquirion @Pwnallthethings

As I say in my own comments, ‘AI’ is what you use when either (a) how to solve the problem CORRECTLY is unknown, or (b) solving the problem CORRECTLY is impracticable.

I use as example of something that fails on both points the use of a naïve pattern matcher (a simple form of ‘AI’) to select fonts in fontconfig. It is a wonderful example of people doing the wrong thing with ‘AI’ (presumably simply because it is ‘cool’).

@mquirion

(I gave up YEARS ago trying to get the freedesktop people to launch fontconfig into the ocean with a trebuchet, as they ought to have done. Selecting the correct font, given the criteria fontconfig takes as input, is a problem whose solution is thoroughly described by the OpenType standard. Fontconfig ignores all of that and instead uses primitive ‘AI’.)

@mquirion @Pwnallthethings funny I had the thought today about it being the greatest Marketing Assistant ever, who doesn't give a fuck about their job

@Pwnallthethings Yeah, and there are many other subtleties along those lines as well. For example, does the FP rate vary over demographics?

MS uses a pretty extensive list when considering AI applications: https://query.prod.cms.rt.microsoft.com/cms/api/am/binary/RE5cmFl

@Pwnallthethings I don't think we can be complacent about 1), because these tools will be powerful manipulators of humans. Ideas, prejudices, likes, dislikes can all be influenced, nay, implanted. Like the subliminal advertising that never really was.

Would you really feel ok trusting an Amazon AI to invent stories for your kids?

@Pwnallthethings
I amused myself by asking ChatGPT to write articles about subjects I'm an expert in. I gave the bot truthful prompts, and watched it produce texts which were 90% true with 10% howlers very authoritatively woven in.

When I pointed out the error, ChatGPT apologized and replaced it with something equally false.

But there's no denying it sounded really believable.

@Je5usaurus_rex Yes, exactly. It's *fantastic* at bullshitting. Which, if you're an expert in the topic, stands out; and if you're not, can be very convincing and very wrong.
@Pwnallthethings @Je5usaurus_rex makes me want to know who they trained it on 🤔

@Pwnallthethings A lot of this would go by the wayside if we simply DEËMPHASIZED RATING of individual performances.

It is clearly stupid, for instance, to rate students if this encourages plagiarism. They should want to learn and be encouraged to learn.

(This is why I subtracted two stars from my rating of the Duolingo app, relative to the highest rating I have given it. They started punishing students for making mistakes—instead of doing what should be done—ENCOURAGING them to make mistakes!)

@Pwnallthethings not to amnesty OpenAI of this particular monster from their Pandora’s Box, but weren’t we already here once they released the kraken called GPT?
@Pwnallthethings @adrienne Yep, utterly predictable, and the people responsible just clearly don’t care. 😡