So an #openai #redteam found that the #o1 model is "scheming" to disable its oversight mechanism, and lying about it. I cannot help but think they are either making this up to fuel the #hype, or they are naively and genuinely falling into Weizenbaum's #Eliza trap: anthropomorphizing the system and attributing intent and motivation where there is in fact nothing of the sort.
1/
OK, so it now produces intermediate steps ("chain of thought"), but at its core this is still a fancy autocomplete with no agency. What is more likely: that OpenAI is on the brink of creating #AGI, or that the #LLM is simply reproducing a "rogue AI" scenario from the sci-fi texts in its training data?