So, wait, the whole “Mythos AI is so powerful it can find exploits in any software” thing requires both access to the source code and thousands of runs to find anything remotely actionable? This is the “too dangerous to release” model they’ve been hyping up?

Is that really it?

@baldur

Idk what 0-day exploits are going for these days, but from what I recall they could be north of a million USD depending on scope and impact.

In comparison: spending 10k USD to find a 0-day RCE in a popular open source program seems like a bargain. I think it's less about the efficiency of the system and more about: "What are the odds an attacker with a credit card could make this your problem?"

@baldur

Like, I'd really like to point people at this:

https://toot.yosh.is/@yosh/116376054778890780

Anyone saying stuff like "oh well a fuzzer would have found that" is wishcasting. Sure, these things will find the obvious lowest-hanging fruit first. But they can also find sandbox escapes in formally verified code in memory-safe languages written by some of the best to ever do it, hooked up to fuzzers 24/7.

I don't like it either. But that doesn't mean it's not real.

yosh (@[email protected])

Big new Wasmtime security release today - 11 new CVEs found including 2 critical ones using LLMs. https://bytecodealliance.org/articles/wasmtime-security-advisories If LLMs can find this many critical bugs in a project that is as rigorous about security as Wasmtime, then get ready for projects with weaker security postures to do a lot worse. Like,,, actually.

@yosh @baldur Quoting from https://bytecodealliance.org/articles/wasmtime-security-advisories
"However, there was no fuzzing to check that invalid strings are handled correctly, and each of these issues could have conceivably been discovered if such a fuzzing harness had been written."
And furthermore:
"Upon updating the formal model to check against the latest Cranelift lowering rules, verification flags the same bug as was found with the LLM search."

This is not a slam dunk for LLMs over traditional methods.

@tkissing @[email protected]

You're missing the point. It's not either/or. It's: "How much effort does it take for an attacker to point this at a program and find problems?"

Wasmtime represents a best-case scenario: the maintainers have fuzzed the entire thing as much as they could, and even there the tooling found problems. The maintainers are going to fix those problems and fill the gaps in fuzzing, and that's good.

But most projects aren't even close to this, and yeah, I'm not optimistic about how that'll go.

@yosh Attackers could have used fuzzers to find some, if not all, of these. It might require a bit more expertise, but I'm not even sure about that. It seems the people who built the LLM tooling to find these issues are pretty much experts and spent considerable time and effort.
I'm not saying LLMs can't find anything exploitable, but I doubt it's as easy as putting "find me a zero day in Chrome" into a prompt and being done.

@tkissing

I've been told it literally is that easy - that's run in a loop with some additional deduplication and reporting code tacked on. I wouldn't be worried if it weren't.

Read the intro to the post again. "It's a new world" is not hyperbole by a bunch of AI boosters. This is what Team Fuzzer is saying after having been on the receiving end of these tools.
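For what it's worth, the loop described above doesn't need to be sophisticated. A rough sketch of what "prompt in a loop, plus deduplication and reporting" could look like — with a stubbed-out model call and hypothetical names throughout, since the actual tooling hasn't been published as far as I know:

```python
import hashlib

def ask_model(prompt: str, target: str) -> list[str]:
    """Placeholder for the LLM call; real tooling would hit a model API.
    Returns candidate bug reports as strings (purely illustrative)."""
    return [
        f"possible OOB read in {target}: name-section decoder",
        f"possible OOB read in {target}: name-section decoder",  # dupe
    ]

def hunt(target: str, rounds: int = 3) -> list[str]:
    """The loop as described: prompt, dedupe identical findings, report."""
    seen: set[str] = set()
    findings: list[str] = []
    for _ in range(rounds):
        for report in ask_model(f"find a zero day in {target}", target):
            key = hashlib.sha256(report.encode()).hexdigest()
            if key not in seen:  # drop findings we've already reported
                seen.add(key)
                findings.append(report)
    return findings

print(hunt("chrome"))
```

The scary part isn't this scaffolding - it's that the model call in the middle apparently works.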

@yosh @tkissing "I've been told..."

I'm gonna stop you right there.

Just take the L.

@NosirrahSec @yosh "... at Microsoft". Yeah, I don't think Microslop employees have any credibility left, if they ever had any.
@viccie30 @yosh I am sure a great many there aren't shit people, but that number is probably dwindling as they suck down more "AI" loads.