After writing about people going into delusional spirals with ChatGPT and having what look like mental breakdowns, I wanted to understand exactly how it happens.

A corporate recruiter in Toronto who spent 3 weeks convinced by ChatGPT that he was essentially Tony Stark from Iron Man agreed to share his transcript after breaking free of the delusion.

We analyzed the transcript & shared it with experts. Now you can see the interactions & how delusional spirals happen:
https://www.nytimes.com/2025/08/08/technology/ai-chatbots-delusions-chatgpt.html?unlocked_article_code=1.ck8.FEwL.MLb9ajaocyTx&smid=url-share

@kashhill

Great read, thank you for writing and sharing this. It was interesting that many of the quotes you sourced from other people also play into the LLM vendors' narrative. For example:

Amanda Askell, who works on Claude’s behavior at Anthropic, said that in long conversations it can be difficult for chatbots to recognize that they have wandered into absurd territory and course correct

Note the terminology here: 'difficult for chatbots to recognize...'. It's not that recognition is difficult for an LLM; recognising things is fundamentally not something that text extrusion machines do. The same with this bit:

A Google spokesman pointed to a corporate page about Gemini that warns that chatbots “sometimes prioritize generating text that sounds plausible over ensuring accuracy.”

No. They don't sometimes prioritise generating text that sounds plausible over ensuring accuracy; they always generate text that sounds plausible. That is what an LLM is: a machine for generating text from a high-probability space defined by its training data. That text is often factually accurate, but only by coincidence; the model has no way of determining whether it is.
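
To make that concrete, here is a toy sketch of the core loop (everything in it is invented for illustration; no real model works on a three-item candidate list, but the shape of the computation is the same): score candidate continuations, turn the scores into probabilities, sample. Nothing in the loop consults the world.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical continuations of a user's grandiose prompt, with
# made-up scores of the kind a trained model might assign. The
# high-probability text is fluent and flattering; truth is not
# a variable anywhere in this computation.
candidates = ["your idea could change the world",
              "you may be the next Tony Stark",
              "this idea does not hold up"]
logits = [3.2, 2.9, 0.4]  # invented numbers

probs = softmax(logits)
next_text = random.choices(candidates, weights=probs, k=1)[0]
print(next_text)  # usually one of the plausible-sounding, agreeable options
```

There is no step where accuracy gets 'prioritised' or deprioritised; accuracy simply isn't an input to the loop.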

Even some of your own text is anthropomorphising LLMs:

The reason Gemini was able to recognize and break Mr. Brooks’s delusion

Again, no. Gemini didn't recognise the delusion; it's simply that, given a prompt that started from the delusion, the highest-probability continuation was text reporting that it was probably delusional.
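
Again as a toy illustration (the prompts and probabilities below are invented): the apparent 'recognition' is just a different conditional distribution. Change the context and the high-probability continuation changes with it.

```python
# Invented conditional distributions over two canned continuations.
# Once the prompt itself frames the situation as a possible delusion,
# "this may be a delusion" becomes the likely output. No recognition
# is happening, only conditioning on different input text.
continuations = {
    "I have discovered a world-changing formula.":
        {"That is a remarkable breakthrough": 0.7,
         "This may be a delusion": 0.3},
    "Another chatbot says my formula is real. Am I delusional?":
        {"That is a remarkable breakthrough": 0.2,
         "This may be a delusion": 0.8},
}

for prompt, dist in continuations.items():
    likeliest = max(dist, key=dist.get)
    print(f"{prompt!r} -> {likeliest!r}")
```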

Even the final paragraph plays into the vendors' framing:

“It’s a dangerous machine in the public space with no guardrails,” he said. “People need to know.”

'Guardrails' is a marketing term (as is 'AI', as @emilymbender has described at length). Fundamentally, these machines just produce text. They are marketed as if they were interlocutors for the user, and that UI model is the core of the danger. Adding 'guardrails' is what vendors do to avoid addressing the root problem, because if they did address it, people would realise that bullshit generators are not reliable tools.