currently playing "where the fuck's the beef" with claude mythos amongst all the proclamations of THIS IS IT. the openbsd "zero day" does not AIUI in fact appear to be one, for example - just a non-exploitable bug. what about these much hyped claims checks out?

EDIT: so far finding *none* of this checks out at all. it's the loudest AI hype this week and it seems to be a nothing burger. still open to non-nothings, of course.

using a chatbot as an expensive fuzzer, fine i guess. i would actually like price numbers on what finding each of these bugs would have cost. i saw some uncited cost numbers in chats, but not the sources for those cost numbers.

@davidgerard the worst kind of burger, of the nothing variety

@davidgerard

Theo de Raadt currently tooling himself up and heading to the Anthropic office

@davidgerard they literally called it "mythos"?

@davidgerard some other free names they could use in the future

Claude Fairytale
Claude Ozymandias
Claude Godot

@davidgerard so much security snake oil
Vulnerability Research Is Cooked — Quarrelsome

@dubiousblur I'm not seeing the beef there, but I'm seeing the hype.

> I got to talk with Nicholas Carlini at Anthropic about this

uh huh

it's an expensive fuzzer. that's not nothing, but it's no manner of paradigm shift.

@davidgerard @dubiousblur "Agents are uncannily skilled at software development"
@davidgerard NGL I'd really like to see someone properly document the details because, after a cursory search, it looks like everyone got their knickers in a twist from one Anthropic guy blabbing at a conference with AI-generated slides.

@art_codesmith got a link on that one?

one thing i am trying to find is the cost of this

@davidgerard This is the article I found from my cursory search: https://mtlynch.io/claude-code-found-linux-vulnerability/
Claude Code Found a Linux Vulnerability Hidden for 23 Years

Claude Code has gotten extremely good at finding security vulnerabilities, and this is only the beginning.

@davidgerard

Got a reference on the openbsd nothingburger assessment? Would be very helpful to me.

@dashdsrdash well it's a bug, but if it was an exploit the front page would have changed
@davidgerard in addition to the financial costs I’d be interested in the false positive rate. I suspect it’s something akin to throwing a dart at a printout of the code.

@spzb oh anthropic admits that! they can't tell which bugs are real, so they send a pile of shit to humans to pick through

https://red.anthropic.com/2026/mythos-preview/

> We triage every bug that we find, then send the highest severity bugs to professional human triagers to validate before disclosing them to the maintainer.

@davidgerard @spzb Oh, so it's another ruse to sucker people into training their AI for them for no pay?
@abucci @spzb now you might think that
@davidgerard @spzb Meanwhile they begin to get so lazy they stop looking for the stuff the AI can’t find
@davidgerard “Ford, there’s an infinite number of monkeys outside that want to talk to us about some code vulnerabilities they’ve found out”
@davidgerard @spzb So, more Actually Indians in the AI pipeline.
@davidgerard I have indeed been wondering since I trust OpenBSD to actually detail a problem vs Anthropic’s “booga booga!” declarations
@arrjay the bug is a crash bug, and apparently it exists. openbsd doesn't worry about such obscure bugs except to fix them. wonder if there'll be an ack.
@avuko yeah that's hilarious. the Model Too Cool To Release doesn't make a difference lol
@davidgerard is it known yet whether they run the mythical fuzzer on the code or the executable (in other words is it white box or black box)
@mcc white box! it's an expensive static checker to be precise, not really a fuzzer. i've made that clear in the piece.

@davidgerard The whole zero-day claim reminds me of an old article about how Windows ME (or whatever) was released with 14,000 bugs (or however many) because that's how many issues were in the issue tracker. It didn't matter that almost all of them were minor issues that weren't necessarily "bugs".

"-117% of all statistics are made up."

@davidgerard Isn't this just the billion-dollar corporate version of "I ran your source code through ChatGPT and here's 730 pull requests for you to go through"?
@davidgerard the cost question is the right one to ask - and the real cost is probably a big secret.
@Frischling $20k per 1000 runs. that's the subsidised cost of course.
@davidgerard indeed, if true then it is very expensive in every respect (according to Anthropic themselves); it’s probably a whole lot cheaper to get a huge workforce of “A Guy Instead” (intelligent and well-educated people!) to do this job.