@kashhill
Great read, thank you for writing and sharing this. It was interesting that a lot of the quotes that you sourced from other people were also playing into the narrative of the LLM vendors. For example:
Amanda Askell, who works on Claude's behavior at Anthropic, said that in long conversations it can be difficult for chatbots to recognize that they have wandered into absurd territory and course correct
Note the terminology here: 'difficult for chatbots to recognize...'. It's not that this is a difficult thing for an LLM to recognise; it's that recognising things is fundamentally not something text-extrusion machines do. The same with this bit:
A Google spokesman pointed to a corporate page about Gemini that warns that chatbots 'sometimes prioritize generating text that sounds plausible over ensuring accuracy.'
No. They don't sometimes prioritise generating text that sounds plausible over ensuring accuracy; they always generate text that sounds plausible. That is what an LLM is: a machine for generating text from a high-probability space defined by its training data. By coincidence, that text is often factually accurate, but the model has no way of determining whether it is.
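To make that concrete, here is a toy sketch of what 'generating text from a high-probability space' means. This is nothing like a real transformer (the model and its made-up probabilities are invented for illustration), but the sampling step is the point: nothing in it checks truth, only learned frequency.

```python
import random

# Toy "language model": next-token probabilities derived purely from
# how often tokens followed each prompt in some imagined training text.
# All values here are invented for illustration.
toy_model = {
    "the capital of France is": {"Paris": 0.9, "Lyon": 0.1},
    # A model trained on enough confident-sounding errors reproduces
    # them just as fluently as it reproduces facts:
    "the capital of Australia is": {"Sydney": 0.7, "Canberra": 0.3},
}

def next_token(prompt: str, rng: random.Random) -> str:
    """Sample the next token from the learned distribution.

    Note what is absent: there is no step anywhere that asks
    whether the chosen token makes the sentence *true*.
    """
    probs = toy_model[prompt]
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random()
print(next_token("the capital of Australia is", rng))
```

Scaled up by many orders of magnitude, with neural networks instead of a lookup table, that sampling loop is still the whole game: plausible continuations, with accuracy only as a side effect of the training data.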
Even some of your own text is anthropomorphising LLMs:
The reason Gemini was able to recognize and break Mr. Brooks's delusion
Again, no. Gemini didn't recognise the delusion; it was simply that, with a delusion as the starting input, the highest-probability next text was a statement that it was probably delusional.
Including the final paragraph:
'It's a dangerous machine in the public space with no guardrails,' he said. 'People need to know.'
'Guardrails' is a marketing term (as is 'AI', as @emilymbender has described at length). Fundamentally, these machines just produce text. The way they are marketed is to pretend that they are interlocutors for the user, and that UI model is the core of this danger. Adding 'guardrails' is what vendors do to avoid addressing the root problem, because if they did address it, people would realise that bullshit generators are not reliable tools.