I wrote a post about something that's a little hidden in Anthropic's focus on cybersecurity for its next model. Then I did something that made the whole thing a little darker.

https://www.ianbetteridge.com/the-worst-of-us/

The worst of us

Credit where it's due. Anthropic's system card for Claude Mythos Preview is a genuinely interesting and thoughtful document. Most AI companies publish safety evaluations the way governments publish freedom of information responses. They're technically compliant but strategically uninformative. And that's usually deliberate, because the last thing that's good for business

Ian Betteridge
@ianbetteridge
Flippin’ hell, Tucker!

@ianbetteridge
"Did Claude just blame its parents for its failings? I think it might have."

It's a large language model. It learned it from all the copyrighted material its company stole and the internet.

@ianbetteridge This article is interesting and points out some problematic actions initiated by LLM products, but it gets the "motivation" completely wrong. It's not that the model is "trained" to have "behaviors" based on human activities; rather, the Prompt And Hope approach to extracting a pattern from the model is an extremely inaccurate and blunt tool that is not (and mathematically cannot be) made declarative enough to prevent undesirable actions or answers from the model.
@ianbetteridge That was a great read with a fine twist in its tail.
@ianbetteridge Boris Johnson as a Service.
@otfrom That's all the world needs