The era of ChatGPT is kind of horrifying for me as an instructor of mathematics... Not because I am worried students will use it to cheat (I don't care! All the worse for them!), but rather because many students may try to use it to *learn*.

For example, imagine that I give a proof in lecture and it is just a bit too breezy for a student (or, similarly, that they find such a proof in a textbook). They don't understand it, so they ask ChatGPT to reproduce it for them, posing follow-up questions to the LLM as they go.

I experimented with this today, on a basic result in elementary number theory, and the results were disastrous... ChatGPT sent me on five different wild goose chases, offering subtle and plausible-sounding intermediate claims that were just false. Every time I responded with "Hmm, but I don't think it is true that [XXX]", the LLM responded with something like "You are right to point out this error, thank you. It is indeed not true that [XXX], but nonetheless the overall proof strategy remains valid, because we can [...further Gish gallop containing subtle and plausible-sounding claims that happen to be false]."

I know enough to pinpoint these false claims relatively quickly, but most of my students will not. They'll instead see them as valid steps that they can perform in their own proofs.

I see so many adults and professionals talking about how they are using LLMs to deepen their understanding of things, but I think this runs headlong into the “Gell-Mann amnesia” effect: these people think they are learning, but it only feels that way because they are ignorant enough about the topic they're interested in not to detect that they are being fed utter bullshit.

How shall we answer this? I think it falls most urgently to people who actually know things, those with "intellectual power", to democratise our knowledge, throw aside the totems that make our fields inaccessible and obscure, and open the gates to the multitudes who wish to learn.

At first it seems like it would be easy to compete with LLMs (after all, they emit only bullshit), but doing so requires producing educational materials that actually explain things properly. Any 'proof by intimidation' will immediately send our students to the LLM; the moment you rely on something you haven't explained, same deal. So it may be that this era has a silver lining: we must finally teach mathematics properly.

@jonmsterling There has been some good research done and papers written about this topic in the last year: https://arxiv.org/abs/2404.03502
> **AI and the Problem of Knowledge Collapse**
>
> While artificial intelligence has the potential to process vast amounts of data, generate new insights, and unlock greater productivity, its widespread adoption may entail unforeseen consequences. We identify conditions under which AI, by reducing the cost of access to certain modes of knowledge, can paradoxically harm public understanding. While large language models are trained on vast amounts of diverse data, they naturally generate output towards the 'center' of the distribution. This is generally useful, but widespread reliance on recursive AI systems could lead to a process we define as "knowledge collapse", and argue this could harm innovation and the richness of human understanding and culture. However, unlike AI models that cannot choose what data they are trained on, humans may strategically seek out diverse forms of knowledge if they perceive them to be worthwhile. To investigate this, we provide a simple model in which a community of learners or innovators choose to use traditional methods or to rely on a discounted AI-assisted process and identify conditions under which knowledge collapse occurs. In our default model, a 20% discount on AI-generated content generates public beliefs 2.3 times further from the truth than when there is no discount. An empirical approach to measuring the distribution of LLM outputs is provided in theoretical terms and illustrated through a specific example comparing the diversity of outputs across different models and prompting styles. Finally, based on the results, we consider further research directions to counteract such outcomes.
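
To make the mechanism concrete, here is a deliberately crude toy simulation of the central idea. This is my own caricature, not the paper's model: I assume the "truth" is a standard normal distribution, that the AI source returns only samples from its centre, and that a price discount translates linearly into adoption. Even under these made-up assumptions, the qualitative effect shows up: the cheaper the central source, the narrower the public's picture of the distribution.

```python
# Toy sketch of "knowledge collapse" (NOT the model from arXiv:2404.03502;
# assumptions are mine): the "truth" is a standard normal distribution, the
# AI source returns only its central samples, and discount drives adoption.

import random
import statistics

def true_sample():
    """Traditional methods: a draw from the full distribution of knowledge."""
    return random.gauss(0.0, 1.0)

def ai_sample():
    """AI-mediated knowledge: the same distribution, truncated to its centre
    (rejection-sample until a draw lands within one standard deviation)."""
    while True:
        x = random.gauss(0.0, 1.0)
        if abs(x) < 1.0:
            return x

def community_beliefs(n_learners, ai_discount, seed=0):
    """Each learner takes one sample; a cheaper AI source attracts more of
    them. `ai_discount` in [0, 1): 0 means equal cost (assumed 50/50 split);
    the mapping from discount to adoption is a crude linear assumption."""
    random.seed(seed)
    p_ai = 0.5 + 0.5 * ai_discount
    return [ai_sample() if random.random() < p_ai else true_sample()
            for _ in range(n_learners)]

def distance_from_truth(samples):
    """Crude measure of collapse: error in the perceived standard deviation
    (the true value is 1.0; heavy AI use shrinks the perceived spread)."""
    return abs(statistics.stdev(samples) - 1.0)

if __name__ == "__main__":
    for discount in (0.0, 0.2, 0.5):
        beliefs = community_beliefs(100_000, discount)
        print(f"discount={discount:.1f}  "
              f"distance from truth: {distance_from_truth(beliefs):.3f}")
```

Running this prints a distance that grows monotonically with the discount: the community increasingly mistakes the centre of the distribution for the whole of it, which is the collapse dynamic the paper formalises properly.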
