If your Open Source project sees a steep increase in number of high quality security reports (mostly done with AI) right now (#curl, Linux kernel, glibc confirmed) please tell me the name of this project.

(I'd like to make a little list for my coming talk on this.)

Apache httpd, curl, Django, Firefox, glibc, GnuTLS, Haproxy, libssh, Linux kernel, python, Temporal, Wireshark, wolfSSL

More?

Updated:

Apache httpd, curl, Django, Elasticsearch Python client, Firefox, git, glibc, GnuTLS, Haproxy, Immich, libssh, Linux kernel, OpenLDAP, PowerDNS, python, Sequoia PGP, Temporal, urllib3, Wireshark, wolfSSL

We can say with certainty that this is widespread.

@bagder I'd be curious to see how many projects see a positive change, vs projects still suffering from slop reports. It would be interesting to have a larger sample over time, and see if there are some turning points that can be attributed to specific models or tools being released.
@bagder This makes me curious! Where will you be giving this presentation?
Heap buffer overflow in TIFFClientOpenExt via TOCTOU race between strlen and strcpy on caller-supplied filename (#814) · Issues · libtiff / libtiff · GitLab

Summary A time-of-check-to-time-of-use (TOCTOU) race condition in TIFFClientOpenExt() (libtiff/tif_open.c) causes a heap buffer overflow when the name argument points to a shared mutable buffer that is concurrently...

GitLab
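(For context, the strlen/strcpy time-of-check/time-of-use pattern the report describes can be sketched as below. This is a hypothetical illustration, not actual libtiff code; `dup_name_racy` and `dup_name_bounded` are invented helper names.)

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration of the reported pattern, not libtiff code.
 * The length is measured, a buffer is sized from it, then the string
 * is copied. If another thread grows the string in between, strcpy
 * writes past the end of the allocation. */
char *dup_name_racy(const char *name)
{
    size_t n = strlen(name);      /* time of check */
    char *copy = malloc(n + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, name);           /* time of use: overflows if name grew */
    return copy;
}

/* The usual remedy: copy at most the measured length, so a concurrent
 * writer can at worst truncate the contents, not corrupt the heap. */
char *dup_name_bounded(const char *name)
{
    size_t n = strlen(name);
    char *copy = malloc(n + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, name, n);        /* never copies more than n bytes */
    copy[n] = '\0';
    return copy;
}
```

Whether this is exploitable in practice depends on a caller actually sharing a mutable filename buffer across threads, which is the crux of the "minor to practically non-existent" objection below.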
@EvenRouault @bagder Is this what's meant by high-quality? A long, inflated description of a minor to practically non-existent vulnerability?

@bagder random anecdote tangentially related but I needed to debug a binary on Windows with no source. Claude used nothing but deno as a disassembler and found the exact issue (an async flag where it shouldn’t be and misuse of win32) which saved me hours waiting for the client to “maybe” give me the source.

Claude can be used very well for security work in the right hands.

@bagder

The next months I will call the-open-source-security-apocalypse-dark-times (of death).

Because I wanted a cheerful name that makes it not seem as bad as it is. /s

@bagder

Should all responsible software projects be running security agents against their own code now, as we do fuzzers/static analysis/tests/etc.?

Or instead of being proactive do nothing? And hope it’s not just the nice people reporting the bugs that are finding the issues?

@renedudfield if you don't run AI-powered code analyzers against your own code, you'll miss out on a lot of bugs...
@bagder are you asking for negative reports as well?
@bagder Pretty sure if you ask the OpenSSL people directly they can also attest. @mold maybe?
@bagder every browser, every library that does media parsing, compression, …

@bagder OpenLDAP is seeing more AI-assisted bug reports that claim to be security issues, but aren't.

E.g., calling a crash in a commandline tool a DoS (no, it's not a service).

@hyc yeps, the tools still have a hard time distinguishing between plain bugs and security issues, but at least they nowadays often accurately identify real flaws, even if not vulnerabilities

@bagder the other one we see is calling assert failures crashes. It's not a SEGV, there's no possibility of data exfiltration or RCE. There's no security exposure, it's just a bug. One that was anticipated hypothetically by the original developer, but whose final disposition wasn't decided upon way back when.

E.g. /* can this even happen? */

They toss in an assert, and it lives quietly in the code for decades before someone definitively shows yes, it can happen...
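(A minimal sketch of the pattern being described; the function and the condition are invented for illustration. The assert documents a belief, and if untrusted input can falsify it the process aborts cleanly: no memory corruption, but a termination an attacker can trigger.)

```c
#include <assert.h>

/* Hypothetical sketch of a decades-old defensive assert: the
 * developer believed a zero-length payload could never reach this
 * point, recorded the doubt in a comment, and moved on. Crafted
 * input later shows the condition can be false, and the process
 * aborts -- a clean termination, not a SEGV. */
static int record_payload_len(int header_byte)
{
    int len = header_byte & 0x7f;
    /* can this even happen? */
    assert(len > 0);
    return len;
}
```

The disputed question in the replies below is whether such an abort, when reachable from another privilege domain, counts as a DoS vulnerability or just an ordinary bug.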

@hyc sure, but to me that goes into the gray area category where we always argue with reporters: what's a security problem and what is not. Debates done since the dawn of time. AI tools or not.
@hyc @bagder An assert failure controlled by data from a different privilege domain is a DoS/data loss vuln. The meaning of assert is documenting that you believe something can't happen under the intended usage.
@dalias it's a DoS but not the same as an actual crash, which is unanticipated. There is zero security exposure from an assert failure: no data leak, no unauthorized access, no possibility of code injection. The trigger conditions are clearly spelled out in the assert itself, so it's trivially remedied. Calling it a security issue dilutes the word "security" to meaninglessness.
@hyc Not all security issues are code execution. Generally this kind of issue is much lower-severity, but it can cause loss of unsaved data or corruption of existing data by leaving it in an inconsistent state at termination. CVEs are still assigned for DoS vulns.
@dalias in general, I suppose so. For OpenLDAP, we only consider something a security issue if it results in someone getting unauthorized access to data. Anything else is just an ordinary bug, and since our LMDB database guarantees ACID transactions, crashes can't leave data in an inconsistent state so that's a non-issue.
@bagder High-quality reports? Like, the information is real and helpful? I'd be interested in seeing this talk.
@bagder still in the spike of low quality ones…
@bagder it has not been my understanding that they are seeing a steep increase in number of **high quality** security reports...
@pettter we do

@bagder
I remember quite a few toots from you complaining about tons of very low quality "AI" generated vulnerability reports.

So... this situation has changed now?
Or did I misunderstand the irony in your toot from above?
@pettter

@bagder reverse question: do you/anyone know which tooling they are using to generate high quality reports and findings?
@fightbackman no, but my impression is that a lot of it is made with Claude Code and various adaptations on top of that

@bagder @fightbackman So Carlini, the Anthropic guy, is not just a salesman? Serious question.

For me it's still open how much of a flood of reports should be expected.

@bagder @fightbackman Is there any reason to use open-source AI models instead? I don't think dependencies on closed-source software are good. Would those projects accept reports from these kinds of models? I imagine that with other development software there wouldn't be an argument about this.
@thaodan @fightbackman we're talking about reports created by tools. I'm sure most people would be happy if good reports were made with open source tools, sure, but a report is a report, a bug is a bug. When someone reports a bug against my project, I care about fixing it. I don't complain about the tool used to find it.
@bagder @thaodan I think we can be happy to get reports with some actual value that don't burn out or waste the time of the maintainers.
This is what we can be happy about nowadays. Whether the way these reports are generated is in the end a good or a bad thing is beyond my knowledge or wisdom to have a clear opinion on. Time will tell.

@bagder

Just so I understand this correctly...
We don't want machine-generated vulnerability reports...

...so we can leave our #foss projects vulnerable to hackers who are not constrained by ideology in their sploits using #Ai ?

Yeah, that tracks with the current majority of #infosec "professionals" letting Rome burn while they roast marshmallows, feeling super pure and superior.

@n_dimension sorry, I don't understand what you're talking about.

@bagder

Your talk is going to be super fun.
Send me a link please!

@n_dimension @bagder The projects typically want security/bug reports, not computer generated words that *look* like security/bug reports.

Same reason you don’t want a parrot operating your air traffic control tower radio. Do you want an air traffic controller or a parrot that sounds like an air traffic controller? Do you trust the parrot to safely direct planes according to aviation regulations?

@ClickyMcTicker @bagder

Even a broken clock is right twice a day.

Lucky, then, that the project maintainers don't have to be bothered by the minutiae of securing their projects with automation...

...because #blackhats certainly don't have the same reservations.

#Ai is a new attack surface and acting irrationally and emotionally towards it is incomprehensible

#infosec

@n_dimension @ClickyMcTicker @bagder a broken clock is not “right”, because its value as a timepiece is nonexistent, because there is no way of telling *when* it is right.

@RoganDawes @ClickyMcTicker @bagder

Time flows independently of the observer's perception, therefore the timepiece is correct twice, as time in the 24-hour period is linear and constant.

@n_dimension @ClickyMcTicker I’d argue that a clock that cannot be relied upon to provide a reasonably accurate time 99% of the time (modulo replacing a battery or similar) is useless 100% of the time.

@n_dimension @bagder let the attackers spend their time sifting through AI slop trying to find legitimate vulnerabilities. The defenders have a difficult enough time dealing with real, validated vulnerabilities.

If you want to spend your time proving us wrong, feel free to run your favorite LLM against a FOSS tool, then manually validate that what it spits out is legitimate by writing a proof-of-concept that is exploitable in the real world. If you do that, I’m sure the FOSS project of your choosing will fix it.

The issue is that the flood of AI-generated false positives is more than the all-volunteer team of folks supporting a FOSS project can handle. AI is great at writing convincing slop. It has not demonstrated an ability to consistently find legitimate vulnerabilities.

If you disagree, prove us wrong… but you do the validation yourself. Don’t just spit out AI slop and make someone else do that work.

@mathaetaes @bagder

I'll just leave this here
https://news.ycombinator.com/item?id=47633855

...there is no "sifting" here

Claude Code Found a Linux Vulnerability Hidden for 23 Years | Hacker News

@n_dimension @bagder Tell that to the FOSS maintainers who receive hundreds of fully AI-generated "vulnerability" reports that all turn out to be false positives.

If you want to use AI to find a bug, go for it. Validate the bug. Write a proof-of-concept (or have AI do it if you're not capable) and test it yourself. If your proof-of-concept achieves the desired results, then submit the bug and the POC.

There are people just haphazardly feeding FOSS codebases into local AI and asking for bugs, then submitting whatever their LLM tells them without validating that it's correct. This effectively floods the maintainers with false positives and makes it very difficult for legitimate bug reports to get through.

Also, just because Claude found a bug doesn't mean it didn't also report 100 false positives before it found a real one. Given the effort it takes to triage a bug report, allowing any random yahoo with a keyboard to blindly submit AI-generated slop equates to enabling a DDoS on your bug triage staff. It's not sustainable.

@n_dimension @bagder Just so I understand this correctly...

you don't

@goedelchen @bagder

Then assplain it to me kid

@n_dimension @bagder first you change your tone.

Then please explain which part of "If your Open Source project sees a steep increase in number of high quality security reports (mostly done with AI) right now (#curl, Linux kernel, glibc confirmed) please tell me the name of this project." you don't understand, or rather, where you see something indicating not wanting machine-generated reports.

@goedelchen @bagder

Are these legit sploits or noise?

@n_dimension @bagder I asked "where do you see something indicating not wanting machine generated reports"

Can you please answer that question.

Hardening Firefox with Anthropic’s Red Team  | The Mozilla Blog

For more than two decades, Firefox has been one of the most scrutinized and security-hardened codebases on the web. Open source means our code is visible,