We have spotted quite a few students using generative AI in their essays this summer and applied standard academic misconduct proceedings, though in most cases the work was so bad they would've failed anyway.

Today I learned of one whose use was sufficiently extensive that they will fail their degree.

I am wondering if this is *the first time a student has failed a whole degree for using AI*? Would love to hear about other cases. If you want to tell me in confidence, my Session ID is in my Bio

@tomstoneham @mhoye prediction: once the initial rush of interest is over, serial cheaters are gonna go back to online services that outsource essay-writing to humans in developing English-speaking countries. _They_ might use GPT as an automation tool but the value-add will still be in “make this seem plausibly the output of a human who understood the material.”
@tomstoneham I mean - fundamentally, are they failing for using AI? Or are they failing because they haven't demonstrated the competencies or learning objectives that their essays/portfolio were giving them a chance to demonstrate?

@kdnyhan

This is a really good question. They are failing for not meeting the academic standards we clearly lay out and spend a lot of time teaching them: attribution of sources, whether quoted or paraphrased; not passing off the work of others as your own; etc.

That it was done with AI rather than cut and paste from Wikipedia makes no difference to that (though the AI does some weird stuff making up references!)

@kdnyhan
We also give them written and oral opportunities to explain why the text they produced has the appearance of being plagiarism.
@kdnyhan @tomstoneham yes! Exactly! This is not (mostly) an AI issue. It is an academic integrity and (sometimes) assessment design issue.
@tomstoneham How are you detecting the LLM use? I've seen references that the tools used to detect them are not entirely reliable (maybe 30% false positive?). Of course, bad writing is still bad writing, as you said

@mikeg
By reading what they produced!

When you read >100,000 words of student-produced work in your specialist area every year, you can spot something fishy.

We also interview before making a final decision and when they cannot even answer the most basic questions on what they have written ...

It is no different from detecting any other form of plagiarism.

@mikeg @tomstoneham the detectors are terrible! But much like other plagiarism cases, there is often a disconnect between paragraphs or abrupt changes of tone and missed references.

@jnyrose @mikeg
Experienced academics are much more reliable than tech solutions.

We are highly trained pattern detectors for 'not written by a student' 😆

@tomstoneham @mikeg right?! And in the end, you can always ask them to explain their thinking. I imagine that would trip up those relying on AI for their thinking a bit.
@tomstoneham How can you be so sure of your detection? Machine learning can also incorrectly characterize submissions. Where is the openness and understanding for new technologies? If rote knowledge is really so important; why not move to in-class essays and oral exams? It sure seems like the academic freak out over new technologies is more of an indictment of inflexible educational policy than students violating an ancient honor code.
@tomstoneham I honestly wonder why we don’t see more real adaptations to change, instead of complaints and often inaccurate enforcement. Generative AI is here to stay; we really need better responses from academia.

@awaterma
You have made a lot of assumptions there!!

We don't give credit for rote knowledge. Our marking criteria only mention understanding of the material taught, argumentation, structure, writing and referencing.

The tells are (1) not drawing on the material taught but on other sources, (2) making up sources, (3) coherence over thousands of words, and (4) a writing style at a higher level than the student produces in other work.

We always have an oral to check before imposing a fail mark.

@awaterma
And for what it is worth, I am far from 'freaking out' - we have discussed this at length and are happy with the idea of using AI-generated text as a basis, so long as it is then edited in such a way that it meets our academic standards.

This student failed because they cut and pasted it rather than using it as a source in an appropriate way.

@tomstoneham @awaterma

From the context of this thread it ultimately sounds like the students are taking the output as-is with no attempt to engage with the source material. A generated piece of writing should be penalized regardless of whether it came from a writing service or an AI. The point of education is to take in knowledge so you may benefit from it, not to let machines or hired writers do everything.

@DukeCarge @tomstoneham maybe so, but how do you really know it’s generated? And how will people work with text generation in the future — it’s going to be important. If rote learning is really so necessary — test kids in ways they can’t “cheat.”

@awaterma @DukeCarge

I am puzzled ... who said rote learning was of any value at all in any sane educational system?

As to how do we know? Well, we know within the standards of proof required by our processes for detecting academic misconduct.

You might ask: how do we 'really know' anyone convicted of a crime was guilty? Balance of probabilities? Beyond reasonable doubt? These are the relevant concepts to be applying, depending upon the legal or procedural context.

@tomstoneham @awaterma
I don't use rote learning in my job at all, now that you mention it. Academia prepared me to read critical texts on the theory I apply as an engineer, and to make sense of the fundamentals of the systems I use.

I couldn't tell you off the top of my head what the effect of adding new pieces to a system would be without looking up the fundamentals again.

@DukeCarge @tomstoneham @awaterma yes! Plus, there's no real value to rote learning anymore anyway. I tell my junior mentees not to bother to intentionally memorize any fact. If you use it often enough to be worth memorizing, your brain is going to do it anyway without trying. What matters is the how and why. Learn those and you can put together any idea.

That's why the discussion of accuracy of "AI detectors" is silly. "Questioned student doesn't understand at level of essay" is very reliable.

@ATurnOfTheNut @DukeCarge @tomstoneham sure, but the trouble is with Universities and High Schools using inaccurate models to “find” essays created by A.I. That’s the inconvenient fact elided here. Even OpenAi’s latest classifier to detect “ai generated content” is only seeing “success rates” of 26% with 9% false positives. https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text

@awaterma @ATurnOfTheNut @tomstoneham
So you're agreeing that the manual methods by which students are caught not knowing the contents of their essay are effective.
No one in this thread has cited the use of these models but you.
@awaterma @ATurnOfTheNut @tomstoneham societal problems of academic dishonesty caused by new technology cannot be solved by tech.
Cheating always has and always will exist. It is a cultural issue that must be solved by educators and not by throwing more obscure and abstract technology stacks at them.
@DukeCarge @ATurnOfTheNut @tomstoneham definitely! Educators have a long list of existing tools to better check for this! I especially hope there’s a greater focus on in-class essays and oral exams — if rote knowledge is what needs to be measured. It’s those that aren’t interested in that type of hard work that scare me. https://www.washingtonpost.com/technology/2023/05/18/texas-professor-threatened-fail-class-chatgpt-cheating/
@awaterma @ATurnOfTheNut @tomstoneham why do you keep going back to rote? Everyone has already dismissed it.
You're speaking on a completely different subject and topic at this point.
@DukeCarge @ATurnOfTheNut @tomstoneham Feel free to start your own thread; I’m just replying after enjoying the holiday.
@awaterma @ATurnOfTheNut @tomstoneham no, my dude. That's your subject you want to talk about. Your points have been dismissed by the 3/4 people in this thread. No one wants to discuss these detection models with you here. I'm going back to work. Cheers.

@tomstoneham technologist turned graduate law+policy student here, can confirm.

we have weekly short essays to post online for class discussion; the word soup in a few recent posts is 🤦🏻‍♀️. you really don't need an edtech surveillance tool to see it. can only imagine their full papers...

that said, not sure it's only those who would turn in failing work anyway. i think some students -- like the general public -- probably use it like a search engine.

frustrating to see, but not unexpected.

@tomstoneham
For the student who failed their degree, was this an associates or some sort of 1-2yr certification? Doesn’t seem like such tools have been broadly available long enough to fabricate a 4yr degree. Or was the use so egregious they were booted from the program? Or maybe they were booted from a masters program, that would fit the timeline.
@josh
In the UK students have to pass each year of their degree before progressing to the next or graduating. There is a maximum number of fail marks they can carry in any year. This student hit that number.
@tomstoneham
Ouch. So quite pervasive with their cheating, eh? It almost seems like it would be worth offering them some redemption if they would submit to an interview regarding why they felt they could get away with it. Years ago I had a student who in desperation kept escalating the amount of plagiarism from web sources in their papers until it was unavoidably noticeable. Seems like there might be a shared mindset.

@josh
They were interviewed and gave a written response to the evidence presented. (We do take care in these matters!)

Denied it in general but couldn't say anything about specific points of evidence or explain the material in the essay.

@tomstoneham
Oh, I’m sorry if it felt implied that the process was careless. Not at all my intent. I just meant to gather information on misuse of an emergent tool for preventative measures. It seems that it would be hard to argue that they were ignorant of the impropriety, but the rationale which carried them through the violation might be illuminating.

@josh @tomstoneham Not exactly on topic, but this might interest you, https://social-epistemology.com/2023/03/29/24-philosophy-professors-react-to-chatgpts-arrival-part-i-ahmed-bouzid/.

A two part interview of two dozen American(?) philosophy professors about ChatGPT's arrival. Some serious caveats and considerations, but also a hint of maybe reckless techno-optimism. Outsourcing invaluable parts of deep learning "because we can" worries me in particular.

@josh
No offence taken. We are all in this together.
@tomstoneham I understand expulsion, but I don't understand what "fail a degree" means. Did the investigation go back and review previous courses and revoke grades?
@WordyAnchorite
The student ended up with more fail marks than they were allowed to have and still graduate.
@tomstoneham Thanks for the clarification
@tomstoneham @Dan_Blick my son is at a technical school, and described another student using ChatGPT (on their phone, on their lap) during an in-person final exam. It was that student's last exam of a 2-year program. They were caught cheating, and wound up getting booted out of the program. Turns out, they'd been doing it since ChatGPT showed up, so a whole semester's exams and assignments. They wasted 2 years of work to save some time studying.

@tomstoneham
I had great luck turning this problem into a strength: I asked students to generate a bit of their paper using an AI, then critique its output. Details:
https://hachyderm.io/@inthehands/109479808455388578

Among other things, implicitly saying “the AI’s output is pretty much guaranteed to suck” sent a useful message about using it to cheat.

Paul Cantrell (@[email protected])

Attached: 3 images OK, trying an experiment with my Programming Languages class! • Have an AI generate some of your writing assignment. • Critique its output. Call BS on its BS. Assignment details in screenshots below. I’ll let you know how it goes. (Here are the links from the screenshots:) Raw AI Text: https://gist.github.com/pcantrell/7b68ce7c5b2e329543e2dadd6853be21 Comments on AI Text: https://gist.github.com/pcantrell/d51bc2d4257027a6b4c64c9010d42c32 (Better) Human Text https://gist.github.com/pcantrell/f363734336e6063f61e451e2658b50a6 #ai #chatgpt #education #writing #highered #swift #proglang

Hachyderm.io

@inthehands
We thought about doing that but didn't get time.

Personally I am happy for AI to be used as a tool (we don't ban spellcheckers and autocorrect or even proofreaders) so long as the student is taking the final decision about what goes in and what does not. Editorial responsibility, as it were.

That is how law firms like Allen & Overy use it.

Of course, in many disciplines it is currently a pretty rubbish tool!

@tomstoneham I haven't heard of a case (I'm in physics, where essay-writing is not a big part of the curriculum), but I would be very interested to learn of other examples.

@tomstoneham @inthehands

LLMs will 100% be used. “Bans” will mean only some kids get caught, some kids get away with it, and others don’t get the benefit of using it. Probably better to adapt — how can LLMs make writing better? Where do you go from that zeroth-order draft?

If you really want something written BY the student & fair to other students, why not have essays ‘handwritten’ (or typed on a school machine) while students are supervised?

@tomstoneham I've not heard of anybody failing a degree for cheating via LLM, but I've certainly seen a huge increase in the use of LLMs to cheat. I see some of the other commenters are wondering how you could tell ... well in one case I encountered, a student's essay literally included the quote "in my experience as an artificially-intelligent chatbot..."

It seems as though many cheaters don't bother to proofread 😆

@tomstoneham how did you detect the generative AI usage?

@andrei_chiffa
By reading it. Most student cheating is obvious to a specialist who knows what they have (and have not) been taught and has read millions of words of student attempts to write about it.

But there is a more fundamental point I need to write up in detail:

University teaching is basically a form of intensively supervised reinforcement learning on a carefully curated, small data set aiming to produce a specific capability.

It is obvious the AI didn't go to class *here*

@tomstoneham

So it's not as much generative AI detection as it is detection of the fact the student did not attend or even familiarise themselves with the class, correct?

@andrei_chiffa
Yes. The AI is the best explanation of some of the details in what was produced. The process included an interview with the student.

@andrei_chiffa
It is easy to overlook how well trained on a large dataset academics are. We each read at least a million words of student work in our specialist area every year. Some of us have been doing that for decades.

Our pattern recognition for 'not produced by a student' is pretty good 😆

@tomstoneham @andrei_chiffa
huh, that's interesting ...
when I studied (political science) we of course had courses with preselected texts etc ... but in our work we were expected to go further than that. So in the end what we talked and read about in class was only a tiny amount of the work I would present.

It's interesting that none of you talks about the increasing workload students have to fulfil. Especially with inflation, where most students I know work 2 jobs or more.
How does this affect the decision to use tools like ChatGPT?

@generic @andrei_chiffa
Your two points are related. We don't penalise going *beyond* the set texts but we do require that the essay demonstrates understanding of what was actually taught.

We don't *require* going beyond the set texts because that rewards those who don't have jobs or caring responsibilities.

@tomstoneham I'm not an academic myself, but there are several in my life, & every single one of them has had to deal with AI cheating this year. I was talking with one guy last night who said that his department is making plans to shift its assessment back toward in person exams, because AI is such a big problem.

@gibbondemon
That would be really sad. In-person exams are known to be biased towards socially privileged and neurotypical students.

Making that regressive shift would mean the white, male, neurotypical bias of AI had won even when it wasn't being used!

@tomstoneham Absolutely. As in many other areas, I think that AI is exaggerating flaws in existing systems - in the case of educational assessment, the fact that our assessment tools don't actually measure what we want to measure, just an approximation of it. But because people are in a panic, and dealing with the real problems would take a lot of time and work, we'll see knee jerk reactions instead.

@tomstoneham I’m really curious about what are acceptable and unacceptable uses of generative AI for students.

My most productive professional use is to improve the readability of analysis I’ve produced. I also use it to help spot patterns in data (and trivial stuff like reformatting data structures).

Would I get failed for these as a student? Not looking for an argument, just to be clear: I’m really interested in how academia’s facing the challenges introduced by this technology.

@tom Plenty of acceptable uses, and we discussed this at a department meeting before any marking.

Uses of AI which also constitute good old fashioned plagiarism are the problem here.