Mastodawn

dominicq 19h ago

Small models also found the vulnerabilities that Mythos found

https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier

AI Cybersecurity After Mythos: The Jagged Frontier

Why the moat is the system, not the model

AISLE

Show thread

johnfn 18h ago

The Anthropic writeup addresses this explicitly:

> This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview after a thousand runs through our scaffold. Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings. While the specific run that found the bug above cost under $50, that number only makes sense with full hindsight. Like any search process, we can't know in advance which run will succeed.

Mythos scoured the entire continent for gold and found some. For these small models, the authors pointed at a particular acre of land and said "any gold there? eh? eh?" while waggling their eyebrows suggestively.

For a true apples-to-apples comparison, let's see it sweep the entire FreeBSD codebase. I hypothesize it will find the exploit, but it will also turn up so much irrelevant nonsense that it won't matter.

Show thread

rakel_rakel

Spending $20000 (and whatever other resources this thing consumes) on a denial of service vulnerability in OpenBSD seems very off balance to me.

Given the tone with which the project communicates discussing other operating systems approaches to security, I understand that it can be seen as some kind of trophy for Mythos.
But really, searching the number of erratas on the releases page that include "could crash the kernel" makes me think that investing in the OpenBSD project by donating to the foundation would be better than using your closed source model for peacocking around people who might think it's harder than it is to find such a bug.

Show thread

paulddraper 14h ago

You don’t see the value of vulnerabilities as on the order of 20k USD?

When it’s a security researcher, HN says that’s a squalid amount. But when its a model, it’s exorbitant.

Show thread

rakel_rakel 14h ago

If I understand you correctly, you're asking me if I would class this as a 20k USD (plus environmental and societal impact) bug? nope, I don't.

I've not said anything else than that I think this specific bug isn't worth the attention it's getting, and that 20k USD would benefit the OpenBSD project (much) more through the foundation.

> When it’s a security researcher, HN says that’s a squalid amount. But when its a model, it’s exorbitant.

Not sure why you're projecting this onto me, for the project in question $20k is _a_lot_. The target fundraising goal for 2025 was $400k, 5% of that goes a very long way (and yes, this includes OpenSSH).

Show thread

telotortium 12h ago

[delayed]