My students are often surprised to learn that LLMs aren’t answering their questions. Rather, an LLM answers the question “what would a reply to this look like?” It’s one of the first things I explain in the “Should I use LLMs?” portion of my syllabus.
Welp, posted this late last night then logged in and found it’d been around the world and back.
For everyone who asked, here’s the full section from the syllabus.
And here’s the study linked in the pdf, though there are others.
https://arxiv.org/abs/2506.08872
Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task

This study explores the neural and behavioral consequences of LLM-assisted essay writing. Participants were divided into three groups: LLM, Search Engine, and Brain-only (no tools). Each completed three sessions under the same condition. In a fourth session, LLM users were reassigned to Brain-only group (LLM-to-Brain), and Brain-only users were reassigned to LLM condition (Brain-to-LLM). A total of 54 participants took part in Sessions 1-3, with 18 completing session 4. We used electroencephalography (EEG) to assess cognitive load during essay writing, and analyzed essays using NLP, as well as scoring essays with the help from human teachers and an AI judge. Across groups, NERs, n-gram patterns, and topic ontology showed within-group homogeneity. EEG revealed significant differences in brain connectivity: Brain-only participants exhibited the strongest, most distributed networks; Search Engine users showed moderate engagement; and LLM users displayed the weakest connectivity. Cognitive activity scaled down in relation to external tool use. In session 4, LLM-to-Brain participants showed reduced alpha and beta connectivity, indicating under-engagement. Brain-to-LLM users exhibited higher memory recall and activation of occipito-parietal and prefrontal areas, similar to Search Engine users. Self-reported ownership of essays was the lowest in the LLM group and the highest in the Brain-only group. LLM users also struggled to accurately quote their own work. While LLMs offer immediate convenience, our findings highlight potential cognitive costs. Over four months, LLM users consistently underperformed at neural, linguistic, and behavioral levels. These results raise concerns about the long-term educational implications of LLM reliance and underscore the need for deeper inquiry into AI's role in learning.

arXiv.org
@mcnees indeed, the crucial question is: Why are you here?
@mcnees in the last year I saw so many student submissions done using LLMs. Could you provide the link to the study cited in the last paragraph?
@k0nze @mcnees Well, there's this, though it's still only on arXiv: https://arxiv.org/abs/2506.08872
@Robert McNees Everybody should try asking an LLM about topics they know really well, then go through the answers and check them. My personal experience (repeatedly) is that the answers are always wrong: wrong by omission of facts, or wrong by hallucination.

You cannot learn from these answers. LLMs are not Wikipedia, not even remotely. You have to check the LLM's answers, which means you would need to know everything about the relevant topics beforehand. And at that point you wouldn't need an LLM as an information source.
@jabgoe2089 @mcnees
And the LLM's answers are not only not Wikipedia, etc.; they are also stripped of context, since the sources are opaque.
There is no way to critically evaluate the information source (the way one used to be able to evaluate internet sources from a Google search).
^^ this point is from "The AI Con" by @emilymbender & Alex Hanna

@KarenCampe @jabgoe2089 @mcnees @emilymbender

These observations are why I always add, "include citations" so I can go check them.

@mcnees In the good old tradition of training LLM, are you letting us steal this? It is very well stated after all.
@majeriisli Please feel free to use it yourself, as long as you allow others to do the same with something you produce using it.
@majeriisli @mcnees The good old tradition of LLMs is not so gracious as to ask for consent 🥲 The tradition is to violate consent, copyright, and to fling by the wayside any concern for the author, the server, or the environment.
@mcnees I find them useful for jogging my memory on stuff I should have remembered. The emphasis here is on "I already learned it well," the old-fashioned way.
@mcnees Using MT (machine translation) also negatively impacts work. I'm actually faster, and make fewer errors, translating from scratch than reviewing a "pretranslated" piece. I really hate these so-called self-thinking machines.

@mcnees Thanks for writing this. I typed out the rest of it to fix the truncated alt text.

PS: As a consequence of typing it out by hand, I also noticed a little error 🙂 “One recent study find” should probably be “One recent study found” or “One recent study finds”.

Should I Use ChatGPT Or Another LLM To Study?

I wouldn’t recommend it. I try to keep up with the capabilities of the major LLMs. They can do some things really well, if you use them the right way. However, they frequently make mistakes when generating responses to questions about physics. Sometimes these mistakes are obvious, sometimes they are subtle and hard to spot. The fact that you cannot trust the output of LLMs should be reason enough not to rely on these systems when you are trying to learn a new subject.

But that’s not the only problem. Interactions with LLMs feel like a dialog, so it’s natural to think the usual rules of conversation apply. You ask a question and expect the response will be an answer to that question. It’s important to understand that this is not what’s happening. An LLM is designed to generate statistically likely responses to the question “What would an answer to this query sound like?” This is not the same thing as answering the question. It might produce what you are looking for, or it might not. This is one reason why output from an LLM will sound authoritative even when it’s wrong, and apologetic when mistakes are pointed out. It isn’t authoritative or apologetic, and it isn’t “thinking” about your question. These are just the sorts of responses that best fit a very complicated set of likelihood criteria.
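The idea above, that a model produces a statistically likely continuation rather than a verified answer, can be sketched with a toy bigram model. Everything here (the `bigram_counts` table, the words, the counts) is invented for illustration; real LLMs learn vastly richer statistics over tokens, but the sampling principle is the same, and note that nothing in the code checks whether the output is true.

```python
import random

# Toy "language model": for each word, counts of the words that followed
# it in some hypothetical training text. The model knows only these
# co-occurrence statistics -- it has no notion of facts or correctness.
bigram_counts = {
    "the": {"answer": 5, "question": 3},
    "answer": {"is": 8},
    "is": {"42": 4, "unknown": 2},
}

def next_word(word, rng):
    """Sample a statistically likely continuation, weighted by frequency."""
    options = bigram_counts[word]
    words = list(options)
    weights = [options[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

def generate(start, max_len, rng):
    """Extend a prompt word by word until no continuation is known."""
    out = [start]
    for _ in range(max_len):
        if out[-1] not in bigram_counts:
            break
        out.append(next_word(out[-1], rng))
    return " ".join(out)

print(generate("the", 3, random.Random(0)))
```

Every word the sketch emits is "plausible" in the narrow sense that it followed its predecessor in the training counts; whether the resulting sentence is a correct answer to anything is simply outside the model's vocabulary of concerns.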

A bigger problem is that using an LLM short-circuits the process of thinking through questions and developing strategies to answer them. It’s not that an LLM never gets things right; they often produce correct output. But correct outputs are limited to materials in the model’s training data — questions we already know how to answer. Is that why you’re here? To answer questions we already know how to answer? Whether you are studying Physics or English or Business, all your instructors are trying to help you learn how to answer questions for yourself. Part of that training involves questions we already understand, because that’s an effective way of learning processes that can be applied to questions we don’t understand. This is one of the most important aspects of your college education and it takes practice. Asking an LLM may or may not generate a correct answer, but either way it prevents you from practicing and learning those processes.

To make matters worse, there is now research claiming that frequent use of LLMs has neurological and behavioral consequences. One recent study [1] finds significant cognitive debt and consistent underperformance compared to peers who do not rely on these systems. That is a steep price for a momentary convenience. So I can’t stop you from using an LLM, but I would urge you to consider the long-term cost.

[1] Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task https://arxiv.org/abs/2506.08872


@mcnees Whenever I see people talking about using LLMs to do school/college assignments, all I can think about is this (admittedly cringe, but also somewhat true, much more so when recontextualised to education) copypasta:

You cheated not only the game, but yourself.

You didn't grow.
You didn't improve.
You took a shortcut and gained nothing.

You experienced a hollow victory.
Nothing was risked and nothing was gained.

It's sad that you don't know the difference.

@mcnees @[email protected]

German translation:

Sollte ich ChatGPT oder ein anderes LLM zum Lernen verwenden?

Ich würde es nicht empfehlen. Ich versuche, mich über die Möglichkeiten der gängigen LLM-Systeme auf dem Laufenden zu halten. Sie können einiges wirklich gut, wenn man sie richtig einsetzt. Allerdings machen sie häufig Fehler bei der Beantwortung physikalischer Fragen. Manchmal sind diese Fehler offensichtlich, manchmal subtil und schwer zu erkennen. Die Tatsache, dass man den Ergebnissen von LLM-Systemen nicht trauen kann, sollte Grund genug sein, sich beim Erlernen eines neuen Themas nicht auf diese Systeme zu verlassen.

Das ist aber nicht das einzige Problem. Die Interaktion mit LLMs fühlt sich wie ein Dialog an, daher liegt die Annahme nahe, dass die üblichen Gesprächsregeln gelten. Man stellt eine Frage und erwartet, dass die Erwiderung eine Antwort auf diese Frage ist. Es ist jedoch wichtig zu verstehen, dass dies nicht der Fall ist. Ein LLM ist darauf ausgelegt, statistisch wahrscheinliche Antworten auf die Frage „Wie würde eine Antwort auf diese Anfrage klingen?“ zu generieren. Dies ist nicht dasselbe wie die Beantwortung der Frage. Die Antwort kann das Gesuchte liefern, muss es aber nicht. Das ist ein Grund dafür, dass die Ausgabe eines LLM verbindlich klingt, selbst wenn sie falsch ist, und entschuldigend, wenn man auf Fehler hinweist. Sie ist weder verbindlich noch entschuldigend, und sie „denkt“ auch nicht über Ihre Frage nach. Es handelt sich lediglich um Antworten, die am besten zu einem sehr komplexen Satz von Wahrscheinlichkeitskriterien passen.

Ein größeres Problem ist, dass die Verwendung eines LLM den Prozess des Durchdenkens von Fragen und der Entwicklung von Strategien zu deren Beantwortung abkürzt. Es ist nicht so, dass ein LLM nie richtig liegt; oft liefern sie korrekte Ergebnisse. Korrekte Ergebnisse beschränken sich jedoch auf das Material in den Trainingsdaten des Modells – Fragen, deren Beantwortung wir bereits kennen. Sind Sie deshalb hier? Um Fragen zu beantworten, deren Antworten wir bereits kennen? Egal ob Sie Physik, Englisch oder Wirtschaft studieren, alle Ihre Dozenten versuchen Ihnen dabei zu helfen, zu lernen, Fragen selbst zu beantworten. Ein Teil dieser Ausbildung besteht aus Fragen, die wir bereits verstehen, denn das ist eine effektive Methode, um Lernprozesse zu üben, die sich auch auf unbekannte Fragen anwenden lassen. Dies ist einer der wichtigsten Aspekte Ihrer Hochschulausbildung und erfordert Übung. Die Befragung eines LLM mag zwar eine richtige Antwort liefern, muss es aber nicht – in jedem Fall hindert sie Sie daran, diese Prozesse zu üben und zu erlernen.

Zu allem Übel gibt es nun Forschungsergebnisse, die darauf hindeuten, dass die häufige Nutzung von LLMs neurologische und verhaltensbezogene Folgen hat. Eine aktuelle Studie fand eine erhebliche kognitive Verschuldung („cognitive debt“) und dauerhaft geringere Leistungen im Vergleich zu Gleichaltrigen, die sich nicht auf diese Systeme verlassen. Das ist ein hoher Preis für einen kurzfristigen Komfort.

Also, ich kann Sie nicht davon abhalten, einen LLM zu verwenden, aber ich möchte Sie dringend bitten, die langfristigen Folgen zu bedenken.