It's so cool that anthropic is setting up a double-sided protection racket where it profits from the massive token burn of both attackers and defenders, with a tool specifically designed to generate exploits, while its only observable mitigation is a client-side system prompt that sternly warns the LLM to be good and not do malware
https://red.anthropic.com/2026/mythos-preview/
Claude Mythos Preview \ red.anthropic.com

sure they are doing """alignment""" to the models, and maybe they have some more sophisticated server-side mitigations. but the fact that the system prompt text ships in the package at all, rather than living entirely server-side, does the opposite of inspiring confidence. Even the system prompt is fine with hacking as long as you go "it's ok, I am good"
https://neuromatch.social/@jonny/116325221458366596
so this simultaneously raises the floor for doing open source at all to "can you afford days of brute-force exploit generation against your repos," while generating so many false positives that bug bounties are crumbling, and the info giants will pull labor from open source projects by just generating them badly in-house. "don't roll your own crypto" becomes "now you have to roll your own crypto because nobody else is, and then pay an AI company to secure it for you."
The end of the curl bug-bounty

tldr: an attempt to reduce the terror reporting. There is no longer a curl bug-bounty program. It officially stops on January 31, 2026. After having had a few half-baked previous takes, in April 2019 we kicked off the first real curl bug-bounty with the help of Hackerone, and while it stumbled a bit at first … Continue reading The end of the curl bug-bounty →

daniel.haxx.se
you know that problem where it's actually in Google's best interest to sabotage their traditional search results to force everyone onto the AI results, because then you never leave the site and direct prompt advertising becomes extremely valuable? yeah, it's like that for code: it's actually in anthropic's best interest for all code to be entirely unmaintainable and unsecurable except with LLMs
@jonny the cyberpunk wiki is starting to look like the necronomicon now
@Viss i can't wait for the phase of the grift where "they can't control it" and release a series of whitepapers on how the only mitigation is to constantly refactor your code with a background churn of 10 exploit generation agents to not present a stable attack surface
@Viss like their entire corporate voice is laying the groundwork for one day claiming "hey everyone, now that we are too big to fail and integrated everywhere, we are unhappy to announce that we have lost control of the models but can't shut them off because they're so important, and everyone needs to subscribe to our active countermeasures protection suite or a rogue AI that we are no longer responsible for will hack you."
@jonny @Viss Meanwhile, I'm in every vibe-coded web app going "../../../../../" popping 0-days lol.
(for the literal reader out there, i am not claiming this is actually their secret plan or whatever. i am saying that whenever anthropic goes "we didn't fully understand the model..." or invokes emergence or otherwise writes as if the model is some unknowable god, that's always in service of product)