#Anthropic #MythosPreview #AI #Zerodays
@jlink @ewolff @chrisstoecker This isn't only hype. ChatGPT has teased AGI for 2 years. Both companies leak stupid things like how Claude is "anxious" to garner buzz. I am largely suspicious of Claude; they have astroturfed the entire internet about their capabilities.
However, these tools have been finding exploits from the start, they are getting even better, and the documented cases are piling up. Agreed - they are finding things others aren't, including things fuzzing misses. They are equally good at writing defects and buggy code - but I don't think this is only hype.
Anthropic says they are monitoring usage now that they have realized this is dangerous. Using their API to probe things like the Linux kernel will get their attention, they claim.
@jlink @ewolff @chrisstoecker Could be, or it could be that projects need to patch the issues before we tell the world how to exploit them. A simple search turns up stories with examples. There is a false choice here - it can be good at finding issues and still be imperfect - but I don't doubt the team is onto something, in that these tools are finding things people can't...
https://venturebeat.com/security/anthropic-claude-code-security-reasoning-vulnerability-hunting
@ewolff @JoeHenzi @jlink @chrisstoecker They claim that they'll share a "cryptographic hash" of the details now, and that the details themselves will be published later, after some vulnerabilities have been fixed.
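For anyone wondering what publishing a hash first buys you: it acts as a commitment, so nobody can quietly rewrite the details afterwards. A minimal sketch in Python, assuming SHA-256 plus a random nonce. Nothing in this thread says which hash or format Anthropic actually uses, and `commit` / `verify` are made-up names for illustration only.

```python
import hashlib
import secrets


def commit(details: bytes) -> tuple[str, bytes]:
    """Return (digest, nonce). The digest can be published right away;
    the nonce and the details stay private until disclosure."""
    # The random nonce keeps short or guessable write-ups from being brute-forced
    # out of the published digest (an assumption of this sketch, not a known detail).
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + details).hexdigest()
    return digest, nonce


def verify(digest: str, nonce: bytes, details: bytes) -> bool:
    """Check that the revealed nonce and details match the earlier digest."""
    return hashlib.sha256(nonce + details).hexdigest() == digest


if __name__ == "__main__":
    report = b"hypothetical vulnerability write-up"
    digest, nonce = commit(report)
    print("publish now:", digest)
    # Later, after the fix ships, reveal nonce + report so anyone can check:
    print("matches commitment:", verify(digest, nonce, report))
    print("edited report matches:", verify(digest, nonce, report + b" (edited)"))
```

The point is only that the digest pins down the text at the time of publishing; it says nothing about whether the finding itself is any good.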
Hey, not knowing what's going on doesn't mean someone is hiding something from you. A few months ago a startup with far fewer resources found 12 bugs in OpenSSL, code that has received more attention than most. Those bugs also went through disclosure and were patched.
Even that company says it's not a replacement for human review - but it did something humans hadn't done before. Ignoring it doesn't do anything.
@ewolff Probably related: https://mastodon.social/@bagder/116362046377975050

@JoeHenzi You missed my point, which is: Daniel is the last person to shill LLMs, *especially* in the context of CVE reporting.
@chrisstoecker
at least the Linux Foundation and the Apache Software Foundation will benefit from it
I am wondering what all the armies on this planet are thinking about it
and the telcos
and the banks
and ...
@chrisstoecker well, let me put it that way:
- you will find exploits in any software with enough resources and time
- sharing found zero days responsibly is a responsible thing to do
- the idea that they will only be used for defensive purposes sounds heroic, but I can hardly believe they won't share this technology with state actors (think about OpenAI, which will do anything for money)
There is nothing new in these three statements but the pace is accelerating.
Same shit, faster.
@chrisstoecker The concerning aspect of this situation is not that systems have weaknesses but that defensive measures are always underfunded and understaffed.
AI hype does shift budgets away from reasonable security measures like defense in depth (building systems that have more than one security guard in them) and slow but sustainable secure system designs, towards hyped and flaky "AI defense agents" – which will break sooner rather than later because they are as erratic as LLMs. Fairy dust, but not sustainable.
@schrotthaufen Apparently they do, naturally with projects that have some visibility, like Firefox. This is also not blind prompting but Opus orchestrated by people who have some intuition for where in the codebase it’s worthwhile to look and what will yield the desired results, but still. For instance: https://blog.mozilla.org/en/firefox/hardening-firefox-anthropic-red-team/
@chrisstoecker
Yes. That's what Veracode, Black Duck, SonarQube and all the other vulnerability scanning tools do day after day. Since before Anthropic. So what.
But anyway… Maybe you want to talk to @bagder for some context on the quality of "AI" bug reports. He's received thousands of them. And he's not impressed.
@chrisstoecker Let me guess: The number of "vulnerabilities" found by this tool correlates with the lack of static code analysis and the number of disabled/ignored compiler warnings.
And I don't see any reason yet to assume that this tool exceeds the performance of already existing static code analysis tools like SonarQube, PMD or FindBugs/SpotBugs.
My guess:
Mythos has also checked other code repositories for sloppiness.