A few thoughts and observations about AI & security vulns.
My standard line about AI is "there's a lot I'm uncertain about". But let's be clear: there's a lot I don't like, and I'm probably biased towards the "here's how spectacularly AI failed once again" news (of which there is plenty), or at least towards the "it's not as impressive as it may look" takes.
Yet, I don't want to close my eyes if I see things that clearly don't fit my biases. And I know a thing or two about security vulnerabilities.🧵
The most visible way AI impacted security vulnerabilities early on was slop reports. Famously, @bagder shared plenty of experiences with AI-written garbage reports.
But there's another, more recent development: real, valuable security reports are showing up. I started hearing about those early this year. They were isolated instances, but they clearly showed that there are companies out there developing tools that spit out real vulnerabilities, with proofs of concept and sometimes even patches.
Something else happened, and that was *very* recently. Those reports grew in numbers.
If I see one or two valid reports in a major open source library from an AI tool, I'm not impressed. If I had enough funding, I could find valid vulns in a variety of ways.
When the Mozilla/Anthropic thing came out, that was what I was thinking: "Yeah, these are real bugs, but you know, if I had infinite funding like Anthropic, and a team of top security people, you know how many bugs I could find in Firefox?"

But that wasn't an isolated development either. It's clearly showing up everywhere. I'm running out of reasons not to think that AI tools got really good at finding security vulnerabilities.

Obvious caveat: None of that changes that there are plenty of good reasons to be very worried about the whole AI thing.

FWIW: I don't have a big conclusion here, I'm just sharing random thoughts and observations. /end thread
@hanno It is real, see my interview in the Register today about this very problem: https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_kernel/
AI bug reports went from junk to legit overnight, says Linux kernel czar

Interview: Greg Kroah-Hartman can't explain the inflection point, but it's not slowing down or going away

The Register
@gregkh @hanno I concur. We see lots of accurate finds reported with AI tools in curl as well.

@bagder @gregkh @hanno

Probably add @sjvn to this

Finding vulnerabilities in code is something humans are bad at. There are a few who are good at it (like Hanno), but I would say it's generally not a common skill.

So the bar for LLMs to find vulnerabilities is very very low

@joshbressers @bagder @gregkh @hanno I know only a handful of people who are good at spotting vulnerabilities, anything that can help the rest of us is a win in my book.
@bagder @gregkh @hanno same in systemd, the quality of these reports skyrocketed late last year. The current problem is that these LLMs of course cannot understand non-trivial security models, so they often report issues that are not really security issues, even though they are real bugs
@bluca @bagder @gregkh to that I'll just say that you have the same problem with humans. It's often easier to spot a bug than to decide whether it's a security bug. (And the latter is also, to some degree, a value judgement. Is something that isn't in itself exploitable but could become an attack vector in combination with other bugs a vulnerability? People keep fighting about such corner cases and whether they deserve CVEs, because nobody can clearly define what "a vulnerability" is.)
@gregkh @hanno Would these also be surfaced by other vulnerability discovery tools, or are they all novel types that would slip past all those pre-existing tools and approaches?
@richlv @hanno Given that they have not been "found" yet by other tools that we know of, I would guess no. It's just "fuzzy pattern matching" which to be fair, is what LLMs are actually good at doing.
@gregkh @hanno Yeah, I was wondering whether it's "nobody has bothered/financed before, but now that we have heavily subsidised models, why not, yolo!".
I guess, it boils down to "would this still happen if users had to bear real costs".
@hanno I am still very much grappling with a conclusion. But it's real. Linux appears to be seeing the same https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_kernel/
@hanno Yes, you're spot on. And it's only going to accelerate.

@hanno it reminds me of the times when people were laughing at fuzzing. "What a waste of time and energy to throw random data at an app and expect it to find something useful!"

Then AFL came with a new insight: it could automatically measure progress (code coverage) and tie that progress back to the input that produced it.

This seems to be a major ingredient for success, not only for vuln research: instrument an LLM to automatically determine progress and throw computing power at it.
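To make that concrete, here's a minimal Python sketch of the coverage-guided loop AFL popularized: mutate inputs, keep only those that reach new coverage, and mutate the keepers further. The function names and the toy `target` interface are my own illustration, not AFL's actual API; real fuzzers obtain coverage from compile-time instrumentation rather than a callback.

```python
import random

def mutate(data: bytes) -> bytes:
    """Replace one random byte (real fuzzers use many more mutation strategies)."""
    if not data:
        return bytes([random.randrange(256)])
    i = random.randrange(len(data))
    return data[:i] + bytes([random.randrange(256)]) + data[i + 1:]

def coverage_guided_fuzz(target, seeds, iterations=1000):
    """Core AFL-style loop: run inputs, keep any input that reaches
    coverage nobody has seen before, and mutate the keepers further.
    `target` returns the set of code locations an input exercised."""
    corpus = list(seeds)
    seen = set()
    for inp in corpus:
        seen |= target(inp)
    for _ in range(iterations):
        child = mutate(random.choice(corpus))
        cov = target(child)
        if cov - seen:            # this input made measurable progress...
            seen |= cov
            corpus.append(child)  # ...so it earns a place in the corpus
    return corpus
```

The key design point is the feedback signal: without the `cov - seen` check, this degenerates into blind random testing; with it, each kept input becomes a stepping stone deeper into the program. Replacing "coverage" with any automatic progress metric an LLM agent can compute gives you the same loop.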

See also:
https://sean.heelan.io/2026/01/18/on-the-coming-industrialisation-of-exploit-generation-with-llms/

On the Coming Industrialisation of Exploit Generation with LLMs

Recently I ran an experiment where I built agents on top of Opus 4.5 and GPT-5.2 and then challenged them to write exploits for a zeroday vulnerability in the QuickJS Javascript interpreter. I adde…

Sean Heelan's Blog

@hanno Thanks for sharing! I was thinking the recent systemd AI controversy[1] illustrates how it's very difficult for folks to compare the real issues of AI with an isolated success story:

Pros: Finds security issues in software
Cons: Climate impact, water, raw materials, privacy, job loss, military applications etc

[1] https://github.com/systemd/systemd/issues/41085

Disallow usage of generative AI to write code · Issue #41085 · systemd/systemd


GitHub

@hanno And as you say, it's not like we can't find security issues. We can! Unfortunately, now it's also an arms race. If you're not proactively finding security holes with AI, the perception is that someone else is probably finding them with the same AI, but maliciously?

So that makes it even harder to refuse?

@benjaoming @hanno always has been an arms race. the incentive has always been to throw as much computation at the problem as you can afford. but there are gaps and limits to what automation can accomplish. when an attacker writes tooling they have different goals and scope than a project maintainer. if you know some project has a massive testing suite they're running on oss-fuzz, or they have their own custom llm harness cranking away on a cluster of h100s, you can't really compete with that. you will just focus on finding the gaps. so it's kind of an asymmetric battle, i think?
@hanno I said to a friend recently- LLMs seem to be good *at* many things. That doesn't mean they are good *for* those things.