A few thoughts and observations about AI & security vulns.
My standard line about AI is "there's a lot I'm uncertain about". But let's be clear: there's a lot I don't like, and I'm probably biased towards the "here's how spectacularly AI failed once again" news (of which there is plenty), or at least the "it's not as impressive as it may look" angle.
Yet I don't want to close my eyes to things that clearly don't fit my biases. And I know a thing or two about security vulnerabilities.🧵
The most visible way AI impacted security vulnerabilities early on was slop reports. Famously, @bagder shared plenty of experiences with AI-written garbage reports.
But there's another, more recent development: real, valuable security reports are showing up. I started hearing about those early this year. They were single instances, but they clearly showed that there are companies out there developing tools that spit out real vulnerabilities, with proofs of concept, and sometimes even patches.
Something else happened, and that was *very* recently: those reports grew in number.
If I see 1-2 valid reports in a major open source lib from an AI tool, I'm not impressed. If I had enough funding, I could find valid vulns in a variety of ways.
When the Mozilla/Anthropic thing came out, that was what I was thinking: "Yeah, these are real bugs, but you know, if I had infinite funding like Anthropic, and a team of top security people, you know how many bugs I could find in Firefox?"

But that wasn't an isolated development either. It's clearly showing up everywhere. I'm running out of reasons not to think that AI tools got really good at finding security vulnerabilities.

Obvious caveat: None of that changes that there are plenty of good reasons to be very worried about the whole AI thing.

FWIW: I don't have a big conclusion here, I'm just sharing random thoughts and observations. /end thread
@hanno It is real, see my interview in the Register today about this very problem: https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_kernel/
AI bug reports went from junk to legit overnight, says Linux kernel czar

Interview: Greg Kroah-Hartman can't explain the inflection point, but it's not slowing down or going away

@gregkh @hanno I concur. We see lots of accurate finds reported with AI tools in curl as well.

@bagder @gregkh @hanno

Probably add @sjvn to this

Finding vulnerabilities in code is something humans are bad at. There are a few who are good at it (like Hanno), but I would say in general it's not a common skill.

So the bar for LLMs to find vulnerabilities is very very low

@joshbressers @bagder @gregkh @hanno I know only a handful of people who are good at spotting vulnerabilities, anything that can help the rest of us is a win in my book.
@bagder @gregkh @hanno Same in systemd; the quality of these reports has skyrocketed since late last year. The current problem is that these LLMs of course cannot understand non-trivial security models, so they often report issues that are not really security issues, even though they are real bugs.
@bluca @bagder @gregkh To that I'll just say that you have the same problem with humans. It's often easier to spot a bug than to decide whether it's a security bug. (And the latter is also, to some degree, a value judgement. Is something that isn't in itself exploitable, but could become an attack vector in combination with other bugs, a vulnerability? People keep fighting about such corner cases and whether they deserve CVEs, because nobody can clearly define what "a vulnerability" is.)
@gregkh @hanno Would these also be surfaced by other vulnerability discovery tools, or are they all novel types that would slip past all those pre-existing tools and approaches?
@richlv @hanno Given that they have not been "found" yet by other tools that we know of, I would guess not. It's just "fuzzy pattern matching", which, to be fair, is what LLMs are actually good at doing.
@gregkh @hanno Yeah, I was wondering whether it's "nobody has bothered to do/finance this before, but now that we have heavily subsidised models, why not, yolo!".
I guess it boils down to "would this still happen if users had to bear the real costs".