a number of people have expressed a very strong distaste for building systems that automatically generate exploits. I think there's a bit of nuance here that isn't immediately obvious

if you are going to build any kind of LLM-driven (more generally, fuzzer-driven) system that searches for security issues, you really want it to produce exploits (automatically or otherwise), because if it doesn't, you'll end up swamping people doing triage with a massive wave of invalid bug reports. it would be worse than no such thing at all

in a way, the exploit-generation capability (which is not new nor was it invented for AI stuff, DARPA has been working on this capability for ages) reduces maintainer burnout. you may have seen the Curl maintainer talk about that. I think that's important to consider

@whitequark

Do you think that generating ways to cause a crash instead of causing an RCE --when possible-- would work, or would still be bad for prioritization (due to not giving a reliable way of identifying which issues do actually lead to RCE)?
@robryk @whitequark generating crashing inputs is what fuzzing has been doing for years, and unless the fuzzer user triages for actual issues it puts a heavy burden on the maintainer. generating exploits shows the issue is real and maybe worth investing time in for a maintainer
@annanannanse @robryk also most of the reports you're going to get from these tools are for crashes. the combination of W^X/ASLR/SSP/etc mitigations with the general distribution of bugs means that the majority of security issues are 'just' a DoS. however 'just' a DoS can still be paralyzing, and it's not clear to me that e.g. if a skiddy can generate hundreds of distinct crashes in someone's software and then exploit them one by one it will be all that much better than RCE
@robryk @whitequark I think
• I've seen too many cases of "out of bounds write but we don't think it could cause RCE" that end up being RCE.
• Hostile(/"friendly") governments are gonna do that anyway
• I'm pretty sure "how do we automatically turn a crash into RCE" is an active area of research, especially for aforementioned governments or their suppliers
• It's adjacent to "publishing exploits is bad, only companies and governments should be allowed to see them" and I am really not a fan of where that leads.
@whitequark Yeah for me a fuzzer generated bug report where feeding a recursive descent parser 30,000 left parens leads to a stack overflow is not that interesting and in fact I've closed such bug reports. On the other hand running ubsan in CI caught a genuine OOB write where a fixed size array was too small and that was causing some real havoc.
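
For illustration, the deep-nesting case above can be sketched with a toy recursive descent parser. This is a minimal sketch, not anyone's actual parser: `MAX_DEPTH`, `ParseError`, `parse_group` and `parse` are hypothetical names. It shows why such reports tend to be low priority: recursion depth simply tracks input nesting, and an explicit depth check turns the crash into an ordinary parse error.

```python
# Toy recursive descent parser for nested parentheses such as "(()())".
# MAX_DEPTH is a hypothetical limit picked for this sketch; it is kept well
# below Python's default recursion limit so the check fires first.
MAX_DEPTH = 200


class ParseError(Exception):
    """Raised for malformed or too deeply nested input."""


def parse_group(text: str, pos: int, depth: int) -> int:
    """Parse one "( ... )" group starting at pos; return the index after it."""
    if depth > MAX_DEPTH:
        # Without this guard, recursion depth grows with the number of
        # leading "(" characters until the call stack is exhausted.
        raise ParseError(f"nesting deeper than {MAX_DEPTH}")
    if pos >= len(text) or text[pos] != "(":
        raise ParseError(f"expected '(' at offset {pos}")
    pos += 1
    # Each nested group costs one more level of recursion.
    while pos < len(text) and text[pos] == "(":
        pos = parse_group(text, pos, depth + 1)
    if pos >= len(text) or text[pos] != ")":
        raise ParseError(f"expected ')' at offset {pos}")
    return pos + 1


def parse(text: str) -> bool:
    """Return True if text is exactly one balanced group."""
    return parse_group(text, 0, 1) == len(text)
```

With the guard in place, `parse("(" * 30000)` raises `ParseError` instead of exhausting the stack.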

@whitequark

I’ve said this elsewhere, but high false positives are a real problem for static analysers. I’d love to see a system that would run the clang analyser and, for each reported bug, create a version of the code with trace points on each branch leading to the bug. Then use a guided fuzzer to try to find a test case that reached that case.

I suspect it would be both cheaper and more reliable than LLM-based approaches.

We use static analysis in CI, but that’s possible only because we’re a smallish and young project and so can handle the relatively small number of reports. Doing the same on a project like, say, Linux results in so many might-be-a-bug reports that it’s too much effort to triage them all.
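
A rough sketch of the trace-point idea from this post, under heavy assumptions: `target` is a hypothetical stand-in for instrumented code, the branch names are invented, and the "guided fuzzer" is a simple mutate-and-keep-progress loop rather than a real tool. A report counts as confirmed only if the search finds an input that reaches the flagged site.

```python
import random


def target(data: bytes, trace: set) -> None:
    """Hypothetical instrumented code: the reported bug sits behind three branches."""
    if len(data) > 0 and data[0] == ord("F"):
        trace.add("branch-1")  # trace point on the first branch toward the bug
        if len(data) > 1 and data[1] == ord("U"):
            trace.add("branch-2")
            if len(data) > 2 and data[2] == ord("Z"):
                trace.add("bug-site")
                raise IndexError("simulated out-of-bounds write")


def run(data: bytes) -> set:
    """Execute the target once and return the trace points it hit."""
    trace = set()
    try:
        target(data, trace)
    except IndexError:
        pass  # the simulated bug fired; the trace already records that
    return trace


def guided_fuzz(seed: bytes, budget: int, rng: random.Random):
    """Mutate one byte at a time, keeping any input that hits more trace points."""
    best = bytearray(seed)
    best_score = len(run(bytes(best)))
    for _ in range(budget):
        candidate = bytearray(best)
        candidate[rng.randrange(len(candidate))] = rng.randrange(256)
        trace = run(bytes(candidate))
        if "bug-site" in trace:
            return bytes(candidate)  # concrete witness: the report is real
        if len(trace) > best_score:
            best, best_score = candidate, len(trace)
    return None  # no witness within budget: the report stays unconfirmed


witness = guided_fuzz(b"\x00\x00\x00\x00", budget=200_000, rng=random.Random(0))
```

The point of the design is that a static-analysis report is forwarded to a human only when `witness` is not `None`, so triage starts from a reproducing input rather than a might-be-a-bug warning.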

@whitequark

not really, generating a crash is a sufficient POC and can be miles and miles easier than a full exploit

the presence of a file you can just run and get RCE makes skids flock to it like fireflies, and I've seen the difference between a sentence describing a trivial exploit vs it being implemented myself
@m that is myopic, RCE is not the only type of exploit that matters. outside of native code you almost never get RCE but there are still plenty of incredibly consequential issues like "authorization bypass"

@whitequark I think @m is correct when it comes to memory corruption. As you pointed out, memory corruption is not at all the only case, but it is special for two reasons:

  • It is a very large fraction of vulnerabilities.
  • There are tools (sanitizers, Valgrind) that detect it and have zero false positives (modulo bugs in the tools).

The second point is extremely helpful for automated tooling and triage.

    @whitequark sure, for different systems where the easy-to-verify to hard-to-exploit tradeoff is different, the answer might be different, but the answer is definitely not finetuning LLMs to create exploits across the board (or at least, maintainer burden is not a good argument for doing so)
    @whitequark "you'll end up swamping people doing triage with a massive wave of invalid bug reports. it would be worse than no such thing at all" <-- SpicyAutoCarrot TechBros are here