Opus 4.6 uncovers 500 zero-day flaws in open-source code
https://www.axios.com/2026/02/05/anthropic-claude-opus-46-software-hunting
The system card unfortunately refers only to this blog post [0] and doesn't go into any more detail. In the blog post, Anthropic researchers claim: "So far, we've found and validated more than 500 high-severity vulnerabilities".
The three examples given include two buffer overflows, which could very well be cherry-picked. It's hard to evaluate whether these vulns are actually "hard to find". I'd be interested to see the full list of CVEs and CVSS ratings to get an idea of how good these findings are.
Given the bogus claims [1] around GenAI and security, we should be very skeptical of this news.
[0] https://red.anthropic.com/2026/zero-days/
[1] https://doublepulsar.com/cyberslop-meet-the-new-threat-actor...
I'm interested in whether there's a well-known vulnerability researcher/exploit developer beating the drum that LLMs are overblown for this application. All I see is the opposite thing. A year or so ago I arrived at the conclusion that if I was going to stay in software security, I was going to have to bring myself up to speed with LLMs. At the time I thought that was a distinctive insight, but, no, if anything, I was 6-9 months behind everybody else in my field about it.
There's a lot of vuln researchers out there. Someone's gotta be making the case against. Where are they?
From what I can see, vulnerability research combines many of the attributes that make problems especially amenable to LLM loop solutions: huge corpus of operationalizable prior art, heavily pattern dependent, simple closed loops, forward progress with dumb stimulus/response tooling, lots of search problems.
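To make the "simple closed loops, dumb stimulus/response tooling" point concrete, here is a toy sketch of that loop shape: a mutate-run-observe cycle with a crude progress signal. Everything here is invented for illustration (the `target`, the bug pattern, the scoring); in the setups being discussed, a model would sit where the dumb random mutator is.

```python
import random

def target(data: bytes) -> bool:
    """Toy stand-in for the program under test: 'crashes' on one
    specific byte pattern, the way a real target might on a
    malformed input. Purely illustrative."""
    return data.startswith(b"BUG!")

def mutate(seed: bytes) -> bytes:
    """Dumb stimulus generator: flip one random byte. In an
    LLM-loop setup, a model would propose candidate inputs here."""
    out = bytearray(seed or b"\x00")
    i = random.randrange(len(out))
    out[i] = random.randrange(256)
    return bytes(out)

def fuzz(seed: bytes, budget: int = 200_000):
    """Closed loop: propose an input, observe the response, and keep
    any candidate that makes forward progress (here: more bytes
    matching the bad pattern). Returns a crashing input or None."""
    best, best_score = seed, -1
    for _ in range(budget):
        cand = mutate(best)
        if target(cand):
            return cand  # found a "crash"
        # crude progress signal: position-wise overlap with the pattern
        score = sum(a == b for a, b in zip(cand, b"BUG!"))
        if score > best_score:
            best, best_score = cand, score
    return None

crashing = fuzz(b"\x00\x00\x00\x00")
```

The point isn't that this finds real bugs; it's that the loop structure (generate, execute, score, iterate) is mechanically simple, which is exactly the shape of problem that LLM-driven search tends to do well on.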
Of course it works. Why would anybody think otherwise?
You can tell you're in trouble on this thread when everybody starts bringing up the curl bug bounty. I don't know if this is surprising news for people who don't keep up with vuln research, but Daniel Stenberg's curl bug bounty has never been where the action is in vuln research. What, a public bug bounty attracted an overwhelming amount of slop? Quelle surprise! Bug bounties attracted slop for so long before mainstream LLMs existed that they might well have been the inspiration for slop itself.
Also, a very useful component of a mental model about vulnerability research that a lot of people seem to lack (not just about AI, but in all sorts of other settings): money buys vulnerability research outcomes. Anthropic has eighteen squijillion dollars. Obviously, they have serious vuln researchers. Vuln research outcomes are in the model cards for OpenAI and Anthropic.
Yes, as we all know, unsourced, unsubstantiated statements are the best way to verify claims about engineering practices. Especially when the person making them has a financial stake in the outcome of said claims.
No conflict of interest here at all!
Take a look at https://news.ycombinator.com/leaders
The user you're suspicious of is pretty well-known in this community.
it is literally just "authority said so".
and it's ridiculous that someone's comment got flagged for not worshiping at the altar of tptacek. they weren't even particularly rude about it.
i guarantee if i said what tptacek said, and someone replied with exactly what malfist said, they would not have been flagged. i probably would have been downvoted.
why appeal to authority is totally cool as long as tptacek is the authority is way fucking beyond me. one of those HN quirks. HN people fucking love tptacek and take his word as gospel.
It doesn't mean we have to agree:
https://ludic.mataroa.blog/blog/contra-ptaceks-terrible-arti...
Daniel Stenberg has been vocal the last few months on Mastodon about being overwhelmed by false security issues submitted to the curl project.
So much so that he had to eventually close the bug bounty program.
https://daniel.haxx.se/blog/2026/01/26/the-end-of-the-curl-b...

tldr: an attempt to reduce the error reporting. There is no longer a curl bug-bounty program. It officially stops on January 31, 2026. After having had a few half-baked previous takes, in April 2019 we kicked off the first real curl bug-bounty with the help of HackerOne, and while it stumbled a bit at first …
The first three authors, who are asterisked for "equal contribution", appear to work for Anthropic. That would imply an interest in making Anthropic's LLM products valuable.
What is the confusion here?
You don't see how that's even directionally similar?
I guess I'll spell it out. One is a guy with an abundance of technology that he doesn't know how to use, but that he knows can make him money and fame, if only he can convince you that his lies are truth. The other is a Bangladeshi teenager.
To preemptively clarify, I'm not saying anything about these particular researchers.
Having established that, are you saying that you can no longer even conceptualize a conflict of interest potentially clouding someone's judgement once the amount of money and the person's perceived status and skill level all increase?
Disagreeing about the significance of the conflict of interest is one thing, but claiming not to understand how it could make sense is a drastically stronger claim.
> in Indonesia
That's uncalled for. There are actual security researchers in Indonesia and other countries you could use to exemplify this.
It's not really worth much when it doesn't work most of the time though:
https://github.com/anthropics/claude-code/issues/18866
https://updog.ai/status/anthropic

If you had a machine with a lever, and 7 times out of 10 when you pulled that lever nothing happened, and the other 3 times it spat a $5 bill at you, would your immediate next step be:
(1) throw the machine away
(2) put it aside and call a service rep to come find out what's wrong with it
(3) pull the lever incessantly
I only have one undergrad psych credit (it's one of my two college credits), but it had something to say about this particular thought experiment.