I've been experimenting with GPTZero, which claims to be able to identify whether text was produced by an LLM or a human.

Everything I know about how LLMs work and how humans produce language (I know a non-trivial amount about both, given my background in computational linguistics and psycholinguistics) tells me that you will never, ever be able to build something that can reliably distinguish between the two. And sure enough, GPTZero fails miserably. So many false positives.

My brain wouldn't let go of this, so I now have a nearly 5k-word essayish thing expanding on this thread, and I don't know what to do with it - do I post it to Medium? Do I rework it to submit to a journal, and if so, which one? Make it an OER somehow? I dunno, but if anyone wants to read almost 5k words about why AI-detection is a fool's errand likely to disproportionately harm marginalized students, here you go: https://docs.google.com/document/d/1CXgsYD-Sk-FqrbiaK2XeRwgRK56RG7-1rX2DDMV0JAA/edit?usp=sharing

@writerethink I don't understand how it's not obvious that an LLM which learnt by copying texts & a student who learnt by paraphrasing texts are going to produce text that contains similarities.

Pointing at *good* writing and saying "this is good" incentivises both ChatGPT and students to lean into that style. It seems like a no-brainer that the OG source, ChatGPT's version, and a student's can all be flagged "AI-generated" when two of them are trying to sound as good as the first.
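To make the point above concrete: detectors in this family reportedly score text by its predictability (perplexity) under a language model, on the theory that LLM output is unusually predictable. A toy sketch (this is NOT GPTZero's actual model; the unigram "language model", corpus, and example sentences are all made up for illustration) shows why that logic backfires on students who have learned the expected register:

```python
import math
from collections import Counter

def unigram_model(corpus_words):
    """Build a toy add-one-smoothed unigram model standing in for an LLM."""
    counts = Counter(corpus_words)
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 slot for unseen words
    return lambda w: (counts[w] + 1) / (total + vocab)

def perplexity(text, prob):
    """Average unpredictability of the words in `text` under the model."""
    words = text.lower().split()
    logp = sum(math.log(prob(w)) for w in words)
    return math.exp(-logp / len(words))

# Hypothetical tiny "training corpus" of conventional academic prose.
corpus = ("the study shows that students who practice writing improve "
          "their skills over time the results suggest that practice matters").split()
prob = unigram_model(corpus)

polished_human = "the results suggest that practice improves writing"
quirky_human = "my cat despises Tuesdays and bureaucratic nonsense"

# The fluent, conventional sentence is MORE predictable (lower perplexity)
# than the idiosyncratic one -- a perplexity threshold flags exactly the
# writer who has mastered the register the detector was trained on.
print(perplexity(polished_human, prob) < perplexity(quirky_human, prob))  # True
```

The detail that matters: nothing in the score distinguishes "predictable because a machine generated it" from "predictable because a student learned to write the way the genre rewards" — which is the thread's point.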