Slop detectives be like rule
tools like these are used to reject CVs and grade school papers btw
no matter how much ai is trash do NOT use ai checkers, they do not work
Pangram does work, actually. Here’s independent validation by unaffiliated scientists:
Although white papers are biased, here’s pangram’s white paper:
That’s still 2 out of 1000 which if you’re using this at scale is not a great rate.
I'd also be curious how that's calculated: whether it's done with their test data, which they've iterated on heavily, or with actual feedback (which may never get back to them).
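To put the "2 out of 1000" figure in perspective, here's a quick back-of-the-envelope sketch. It assumes that number is a false positive rate (human-written text wrongly flagged as AI) and that it holds uniformly. The document volumes and the function name are made up for illustration, not taken from Pangram.

```python
def expected_false_flags(num_human_docs: int, false_positive_rate: float) -> float:
    """Expected number of human-written documents wrongly flagged as AI."""
    return num_human_docs * false_positive_rate


# Assumption: "2 out of 1000" is read as a false positive rate.
FPR = 2 / 1000

# Hypothetical volumes, e.g. a class, a large employer, a whole platform.
for n in (1_000, 50_000, 1_000_000):
    flagged = expected_false_flags(n, FPR)
    print(f"{n:>9,} human-written docs -> ~{flagged:,.0f} wrongly flagged")
```

The rate stays the same, but the absolute number of people wrongly accused grows linearly with volume, which is why it matters when these tools are run over CVs or school papers in bulk.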
I don’t buy it. Not until I can test it, hands on.
So many LLM papers have amazing (and replicated) results in testing, yet fall apart in the real world outside of the same lab tests everyone uses. Research is overfit to hell.
And that’s giving them the benefit of the doubt, assuming they didn’t train on the test set in one form or another. Like how Llama 4 technically aced LM Arena because they finetuned it to.
It looks like Pangram specifically holds back 4 million documents during training and has a corpus of “out of domain” documents that they test against that didn’t even have the same style as the testing data.
I’m surprised at how well it does; I really wonder what the model is picking out. I wonder if it’s somehow the same “uncanny valley” signal that we get from AI generated text sometimes.
Yep, they’re all trash and should not be relied upon.
I got anywhere from 35% to 70% AI generated results on a book I wrote in 2019, before AI was even released.
“before AI was even released”
GPT-1 was released in 2018 (though I don’t think you need an AI checker to verify whether something was made by it).
Yeah, LLM-based checkers will still have LLM-based problems, most notably being incapable of true analysis, which is the whole point of an AI checker. It’s just the same text predictor shit.
Oh and also there’s an arms race where generative AI has the advantage because eventually it will be capable of generating things entirely indistinguishable from what a human would make (though it will still be susceptible to the hallucinations and errors it’s already famous for).