ChatGPT detection and algorithmic bias:

This afternoon James Zou directed me to a recent pilot study from his group in which they looked at the performance of seven different GPT detectors that are sometimes used to flag cheating in educational settings.

They found that these detectors commonly misclassify text from non-native English speakers as being written by an AI. A primary driver appears to be the lower perplexity (the exponential of the model's average loss) of such text.
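To make the perplexity point concrete, here is a minimal sketch of how perplexity is computed from a language model's per-token log-probabilities, and why simpler, more predictable text scores lower. The log-probability values below are made-up illustrations, not output from any real model:

```python
import math

def perplexity(token_logprobs):
    # Perplexity is the exponential of the mean negative log-likelihood,
    # i.e., the exponent of the model's average per-token loss.
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# Hypothetical per-token log-probs: predictable text gets values
# close to 0 (high probability); less predictable text gets more
# negative values (low probability).
predictable_text = [-1.0, -1.2, -0.8, -1.0]
complex_text = [-3.5, -2.8, -4.0, -3.1]

print(perplexity(predictable_text))  # lower perplexity
print(perplexity(complex_text))      # higher perplexity
```

A perplexity-based detector flags low-perplexity text as AI-generated, which is why writing with constrained linguistic expression, common among non-native speakers, is disproportionately flagged.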

https://arxiv.org/abs/2304.02819

GPT detectors are biased against non-native English writers

The rapid adoption of generative language models has brought about substantial advancements in digital communication, while simultaneously raising concerns regarding the potential misuse of AI-generated content. Although numerous detection methods have been proposed to differentiate between AI and human-generated content, the fairness and robustness of these detectors remain underexplored. In this study, we evaluate the performance of several widely-used GPT detectors using writing samples from native and non-native English writers. Our findings reveal that these detectors consistently misclassify non-native English writing samples as AI-generated, whereas native writing samples are accurately identified. Furthermore, we demonstrate that simple prompting strategies can not only mitigate this bias but also effectively bypass GPT detectors, suggesting that GPT detectors may unintentionally penalize writers with constrained linguistic expressions. Our results call for a broader conversation about the ethical implications of deploying ChatGPT content detectors and caution against their use in evaluative or educational settings, particularly when they may inadvertently penalize or exclude non-native English speakers from the global discourse. The published version of this study can be accessed at: www.cell.com/patterns/fulltext/S2666-3899(23)00130-7


Ironically, these false positives are readily avoided by asking ChatGPT to rewrite the non-native English speaker's text to increase linguistic complexity.

In other words, the way for these speakers to avoid being accused of cheating is to actually cheat.

The take-home for higher ed is obvious and stark. Many (all?) current ChatGPT detectors have not been adequately assessed for issues of algorithmic bias and therefore should not be used to accuse students of misconduct in their written work.

@ct_bergstrom This is in the maths ecosystem, but the same conclusion: detectors are cr*p, will become more cr*p, LLMs will become much more embedded (eg in Word, etc), disadvantaging non-native English speakers. One suggested way out is to _embed_ LLMs into teaching rather than try to ban them. (Much like calculators.)
https://cesaregardito.substack.com/