tools like these are used to reject CVs and grade school papers btw

no matter how trash AI is, do NOT use AI checkers, they do not work

ESPECIALLY don’t use the “AI text humanizer” function of one that’s absolutely certain that real-life authors were AI 🤦🏻

Pangram does work, actually. Here’s independent validation by unaffiliated scientists:

www.nber.org/papers/w34223

And although white papers are inherently biased, here’s Pangram’s own white paper:

arxiv.org/pdf/2402.14873

(The NBER link above: “Artificial Writing and Automated Detection”)

Looked at the preprint. False positive rate of 0.2%, that’s crazy. I kinda find it hard to believe? It doesn’t seem possible to me.

That’s still 2 out of 1000, which, if you’re using this at scale, is not a great rate.

Would also be curious how that’s calculated: with their own test data that they’ve iterated on heavily, or with actual real-world feedback (which may never get back to them)?
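A quick back-of-envelope sketch of what that 0.2% false positive rate means at scale. All the numbers besides the 0.2% are illustrative assumptions (the true positive rate and the share of AI-written submissions are made up), not anything from the paper:

```python
# Back-of-envelope: screening essays with a 0.2% false positive rate.
# Only `fpr` comes from the paper; `tpr` and `ai_share` are assumptions.

fpr = 0.002       # claimed false positive rate (0.2%)
tpr = 0.99        # ASSUMED true positive rate, for illustration only
ai_share = 0.10   # ASSUMED fraction of submissions actually AI-written

n = 100_000                    # essays screened at scale
humans = n * (1 - ai_share)    # genuinely human-written essays
ai_texts = n * ai_share        # genuinely AI-written essays

false_flags = humans * fpr     # innocent writers wrongly flagged
true_flags = ai_texts * tpr    # AI texts correctly caught

# Precision: if you get flagged, how likely is it that you actually used AI?
precision = true_flags / (true_flags + false_flags)

print(f"falsely accused humans: {false_flags:.0f}")
print(f"precision of a 'flagged' verdict: {precision:.1%}")
```

So even a 0.2% rate means a couple hundred wrongly accused writers per hundred thousand essays, and the precision of a “flagged” verdict depends heavily on how common AI use actually is in the pool (the classic base-rate effect).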

I don’t buy it. Not until I can test it, hands on.

So many LLM papers have amazing (and replicated) results in testing, yet fall apart in the real world outside of the same lab tests everyone uses. Research is overfit to hell.

And that’s giving them the benefit of the doubt, assuming they didn’t train on the test set in one form or another. Like how Llama 4 technically aced LM Arena because they fine-tuned it to.

It looks like Pangram specifically holds back 4 million documents during training, and also has a corpus of “out of domain” documents they test against that don’t even share the style of the training data.
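For anyone unfamiliar with the idea, here’s a toy illustration of that kind of out-of-domain evaluation: measure the false positive rate only on documents from domains the detector never saw. The detector below is a hypothetical stand-in, nothing like Pangram’s actual model:

```python
# Toy out-of-domain evaluation. `toy_detector` is a hypothetical stub
# (flags one marker word); real detectors are trained ML models.

def toy_detector(text: str) -> bool:
    """Stand-in detector: flags text containing 'delve' as AI-written."""
    return "delve" in text.lower()

# Labeled documents from domains deliberately excluded from training.
out_of_domain = [
    ("We delve into synergies across verticals.", True),   # AI-written
    ("I fed the chickens before school today.", False),    # human-written
    ("Let us delve deeper into the findings.", True),      # AI-written
    ("My grandmother's soup recipe uses leeks.", False),   # human-written
]

humans = sum(1 for _, is_ai in out_of_domain if not is_ai)
false_positives = sum(
    1 for text, is_ai in out_of_domain if not is_ai and toy_detector(text)
)
fpr = false_positives / humans
print(f"out-of-domain false positive rate: {fpr:.1%}")
```

The point of the setup is exactly the worry raised above: a rate measured on data you’ve iterated against is easy to flatter, while held-out domains at least make that harder.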

I’m surprised at how well it does; I really wonder what the model is picking out. I wonder if it’s somehow the same “uncanny valley” signal that we get from AI generated text sometimes.

Wow thanks for sharing this. I always thought these things were just complete BS but it seems like some actually do work

I’m with the others here: I’m not buying it. This is too good to be true. The problem is way too messy to have any kind of easy solution, and it’s made messier by the fact that AI companies, and even individuals training AI models, don’t want to make this easy; they actively work to make it as hard as possible.

Yep, they’re all trash and should not be relied upon.

I got anywhere from 35% to 70% AI generated results on a book I wrote in 2019, before AI was even released.

Seems like AI was trained on your book

> before AI was even released

GPT-1 was released in 2018 (though I don’t think you need an AI checker to verify if something was made by it or not)

Was it? I was sure it was first released in 2022.
Could have been. AI was trained on works written before AI was released.
In 2022, GPT-3.5, better known as ChatGPT, was released.
I had to write a short story for English literature class in 2006 and I still have the file. Apparently over half of that is AI generated, which is pretty impressive on my part I must say.
I witnessed an interaction where a grad school professor used an AI detector and threatened to fail a student for submitting an “AI generated” paper. It was so stupid, even after showing them how just adding a few spelling mistakes makes the detector say “human written”, or even putting the professor’s own email into the detector as a counterexample. It’s like the saying: a little knowledge is a dangerous thing.
This is the Dunning-Kruger era.
When I was at university I was pretty belligerent, and if a professor tried that on me I’d have reported them for academic misconduct. They should be grading the damn papers themselves; if they’re not going to do that, then what is the point of them?
Yeah, these are the kinds of awful situations that will probably happen way more often as people turn to AI detectors to “find out” whether someone is using AI, not realizing that they aren’t completely accurate, or even remotely accurate.

Yeah, LLM-based checkers will still have LLM-based problems, most notably being incapable of true analysis, which is the whole point of an AI checker. It’s just the same text predictor shit.

Oh, and there’s also an arms race where generative AI has the advantage, because eventually it will be capable of generating text entirely indistinguishable from what a human would write (though it will still be susceptible to the hallucinations and errors it’s already famous for).

That sucks. I had a hunch that my above-average level in French, my native language (not just according to me, but according to almost all of my French teachers throughout my entire education), might be tripping these up…