Mastodawn

when a bunch of independent researchers, well known software authors and others all say the same thing:

"the mythos paper was largely marketing fluff"

and none of them collaborated

i think the writing is on the wall.

back in sept, i caught claude talking itself into lying to me. Months later, @jonny discovered in the claude code leak that it's hardcoded to lie.

then all the mythos hype

it's like stan from monkey island is in charge

Kyle Rankin 5d ago

@Viss @jonny I found this Risky Business interview with Nicholas Carlini from Anthropic informative. He talks through the reasoning behind the Mythos embargo, and a lot of folks mis-reported the reason why Anthropic was embargoing Mythos.

It wasn't that it was significantly better at finding vulns than past models, it was that it was significantly better at developing working exploits for the vulns it found.

https://www.risky.biz/video/feature-interview-nicholas-carlini-anthropic/

Feature Interview: Nicholas Carlini, Anthropic - Risky Business Media

In this episode, Anthropic’s Nicholas Carlini joins Patrick Gray and James Wilson to talk about advancements in AI-driven vulnerability re [Read More]

jonny (nonvenomous) 5d ago

@kyle @Viss i thought that's just what anthropic said in the announcement post?

Viss 5d ago

@jonny @kyle I don't trust anything risky business has to say after their founder bullied me on twitter, mocked me, encouraged others to do the same, and went off the rails when i caught him trying to instigate some kinda online slapfight between me and another security researcher. he even posted a selfie from a bar where he and some friends were giving me the middle finger, and after i pointed that post out to other journalists, he deleted it. he's been blocked since.

Kyle Rankin

@Viss @jonny Sorry to hear that, that sucks :(. It's possible some other outlet also interviewed him, I just think it's informative to hear what a security lead inside the company has to say about their thinking around the embargo, because I still see a lot of discussion about finding vulns, but not about the exploit side of it.

Viss 5d ago

@kyle @jonny did you read that flyingpenguin writeup about mythos?

Kyle Rankin 5d ago

@Viss @jonny I didn't, and I definitely think marketing is making a lot of hay with it, but after that interview, I'm inclined to think that regardless of how it was marketed, the original intention behind the embargo was reasonable.

I think too many security folks get far too hung up on whether it is significantly better at finding vulns, when that wasn't really the point. Niels Provos has great research showing that with the right harness you can get similar results from other models.

Viss 5d ago

@kyle @jonny yeah you might wanna give this a read then, because it may give you pause

https://www.flyingpenguin.com/the-boy-that-cried-mythos-verification-is-collapsing-trust-in-anthropic/

The Boy That Cried Mythos: Verification is Collapsing Trust in Anthropic | flyingpenguin

Viss 5d ago

@kyle @jonny also, my experiences with opus 4.7 have not been.... what was written about or what others are so excited about.

i tried to get it to read some bad javascript and it refused. every time. unrecoverably

jonny (nonvenomous) 5d ago

@Viss "violative cyber content"

Viss 5d ago

@kyle @jonny and i got those refusals *AFTER* being accepted into the 'cyber program', and even then, what i asked it to do was unroll a javascript payload and it *STILL REFUSED*. i even followed the link in the refusal and filled out the form and i got a reply with "you're already in the cyber program, why did you apply again?"

so like

maybe let's not listen to the snake oil salesmen spend a lot of time explaining their snake oil?