It's so cool that Anthropic is setting up a double-sided protection racket where it will profit from the massive token burn of attackers and defenders with a tool specifically designed to generate exploits, and their only observable mitigation is a client-side system prompt that sternly warns the LLM to be good and not do malware
https://red.anthropic.com/2026/mythos-preview/
Claude Mythos Preview \ red.anthropic.com

@jonny It's good that you're taking on board the state of Anthropic's server backend code has nothing to say about the value of their LLM offering.
@hopeless having difficulty parsing the sentence, but yeah given the nature of LLMs it does indeed matter what deterministic code wraps them and enacts their ability to do stuff like "execute commands" and "write code" and whatnot, to say nothing of being material safeguards both as literal filters of behavior but also orchestrators of the multi-agent chains that seem to be necessary to keep these things in the bounds of plausible behavior
@jonny Yeah... my point is... in the era of OpenClaw, it seems none of that affected Anthropic's ability to produce an effective hacking machine, affected their LLM's usefulness, affected their profits... or generally mattered.
@hopeless
If it didn't matter, there would be no system prompts, there would never need to be a new feature in Claude Code, they would never need to develop and release a new product as they are doing now, because it would all be driven by the quality of the LLM, right? You seem to think my argument is that "their non-LLM tech is bad, so therefore everything is bad" when what I am actually saying is "their non-LLM wrapping tech and how they are marketing it is a signal for how they intend to use it as a profit-taking tool, and their displayed competence and techniques do not inspire confidence that they can adequately safeguard it from being a weapon"
@hopeless
There are more thoughts in the world than whether something is good or bad