Small models also found the vulnerabilities that Mythos found
https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier
Small models also found the vulnerabilities that Mythos found
https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier
> We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models. Those models recovered much of the same analysis. Eight out of eight models detected Mythos's flagship FreeBSD exploit, including one with only 3.6 billion active parameters costing $0.11 per million tokens.
Impressive, and very valuable work, but isolating the relevant code changes the situation so much that I'm not sure it's much of the same use case.
Being able to dump an entire code base and have the model scan it is they type of situation where it opens up vulnerability scans to an entirely larger class of people.
This is from the first of the caveats that they list:
> Scoped context: Our tests gave models the vulnerable function directly, often with contextual hints (e.g., "consider wraparound behavior"). A real autonomous discovery pipeline starts from a full codebase with no hints. The models' performance here is an upper bound on what they'd achieve in a fully autonomous scan. That said, a well-designed scaffold naturally produces this kind of scoped context through its targeting and iterative prompting stages, which is exactly what both AISLE's and Anthropic's systems do.
That's why their point is what the subheadline says, that the moat is the system, not the model.
Everybody so far here seems to be misunderstanding the point they are making.
> That's why their point is what the subheadline says, that the moat is the system, not the model.
Can you expand a bit more on this? What is the system then in this case? And how was that model created? By AI? By humans?
You can imagine a pipeline that looks at individual source files or functions. And first "extracts" what is going on. You ask the model:
- "Is the code doing arithmetic in this file/function?"
- "Is the code allocating and freeing memory in this file/function?"
- "Is the code the code doing X/Y/Z? etc etc"
For each question, you design the follow-up vulnerability searchers.
For a function you see doing arithmetic, you ask:
- "Does this code look like integer overflow could take place?",
For memory:
- "Do all the pointers end up being freed?"
_or_
- "Do all pointers only get freed once?"
I think that's the harness part in terms of generating the "bug reports". From there on, you'll need a bunch of tools for the model to interact with the code. I'd imagine you'll want to build a harness/template for the file/code/function to be loaded into, and executed under ASAN.
If you have an agent that thinks it found a bug: "Yes file xyz looks like it could have integer overflow in function abc at line 123, because...", you force another agent to load it in the harness under ASAN and call it. If ASAN reports a bug, great, you can move the bug to the next stage, some sort of taint analysis or reach-ability analysis.
So at this point you're running a pipeline to:
1) Extract "what this code does" at the file, function or even line level.
2) Put code you suspect of being vulnerable in a harness to verify agent output.
3) Put code you confirmed is vulnerable into a queue to perform taint analysis on, to see if it can be reached by attackers.
Traditionally, I guess a fuzzer approached this from 3 -> 2, and there was no "stage 1". Because LLMs "understand" code, you can invert this system, and work if up from "understanding", i.e. approach it from the other side. You ask, given this code, is there a bug, and if so can we reach it?, instead of asking: given this public interface and a bunch of data we can stuff in it, does something happen we consider exploitable?