Mastodawn

I saw a wild take where someone said distributions are fascist for using systemd because systemd now uses Claude for code review.

okay. fine, I guess.

but if we are rejecting dependencies that use AI tooling, where do we go?

seriously. where do we go?

if the Linux kernel is using AI tools for codegen, then where do we go?

FreeBSD? I would put money on it that they use AI tools.

OpenBSD? NetBSD? HURD?

do we hard fork every dependency that is now tainted? do we even have the resources to do it?

FreeBSD and Illumos are the only ones reasonably close in the tech tree and I suspect both use AI tools too, as their development, like Linux, is driven by capital.

Show thread

Haelwenn /элвэн/

7h ago

@ariadne This one is all wack when like what 3~6 months ago there was a pro-systemd jerk being like "anti-systemd are all facists!"

Also yeah in terms of alternatives it's not great, so far I'm stuck with reducing as much as possible and planning to have more stuff like Plan9.
(Also pretty sure Hurd got LLM-tainted)

Show thread

sam 6h ago

@lanodan @ariadne re Hurd: I only saw one person doing some LLM review (not of submitted patches but they took it upon themselves to submit its findings), I don't consider that tainted and I don't think it's some sort of official effort or anything, even if I don't like it.

systemd embracing it with a CLAUDE.md, using it in all PRs, commits co-authored-by it etc is different.

Show thread

bluca 6h ago

@thesamesam @lanodan @ariadne

Hurd using LLMs for reviews: perfectly ok
systemd using LLMs for reviews: TAINTED

DId I get this right?

Show thread

sam

@bluca @lanodan @ariadne Someone deciding to send ML output a handful of times an ML is different from it being an established part of the project, sure.

(I also didn't say "perfectly ok", it's just that it's clearly different, even if one does or doesn't like it?)

Show thread

bluca 6h ago

@thesamesam @lanodan @ariadne gotcha, rules for thee but not for me

Show thread

sam 6h ago

@bluca @lanodan @ariadne If a contributor had copilot review their PR for systemd but systemd didn't have it as part of CI or as some regular part of contribution, I'd say the same thing.

But I'm not even making rules! I'm pointing out a distinction?

Show thread

Ariadne Conill 🐰

6h ago

@thesamesam @bluca @lanodan personally, i don't even think i *care* about LLM-based reviews.

what i care about is LLM-based code generation because every time i've interacted with people using those tools to produce changesets, it's been fucking miserable

Show thread

sam 6h ago

@ariadne @bluca @lanodan I've sort of come to this position as well, especially sympathising w/ what Lennart says about Bad Guys already using LLMs to find vulnerabilities, so may as well try to leverage them to do some good.

Don't love it still but I definitely feel warmer to it than the rest.

Show thread

Ariadne Conill 🐰

6h ago

@thesamesam @bluca @lanodan i guess to me, it feels unnatural and jarring to argue with a chatbot in a code review.

but that is far less harmful than dealing with changesets where the author does not even fucking know what he is submitting and cannot defend his work.

*that* is true misery as a maintainer.

Show thread

Ariadne Conill 🐰

6h ago

@thesamesam @bluca @lanodan basically the problem is AI as force multiplier for charlatanism.

claude making it miserable for charlatans to get their PRs merged actually seems like a positive use of the technology...

Show thread

bluca 6h ago

@ariadne @thesamesam @lanodan of course and stuff like that gets shot into the sun with a rocket without mercy.

But you don't argue with chatbots in reviews - these days claudebot is about 90% signal-to-noise ratio. The 10% noise you just dismiss, there's no arguing involved. But that 90% of signal has got really good in the past ~3 months, and there's no point denying it. This stuff was mostly crap until end of last year, but things change, and there's nothing wrong with changing views

Show thread

Ariadne Conill 🐰

6h ago

@bluca @thesamesam @lanodan oh yes, we have been experimenting with it at work for reviews.

it has indeed gotten pretty good.

but i hesitate becoming dependent on it as a FOSS maintainer because while the first hit is free, when the economic reality catches up... it will probably be quite expensive.

Show thread

bluca 6h ago

@ariadne @thesamesam @lanodan yeah that's obviously the end goal of all this wild and absurd speculation, but capitalism gotta capitalism. At some point the bubble will pop and then we'll see what's left standing

Show thread

aronowski 5h ago

@ariadne @thesamesam @bluca @lanodan The end-user should always be responsible for what they deliver, no matter the tools. Then any excuses like "AI wrote it" would not have any rights to defend the user.

Show thread

Ariadne Conill 🐰

5h ago

@aronowski @thesamesam @bluca @lanodan yes, that is basically the pkgconf contribution policy in a nutshell.

we have taken some steps to tell agentic tools to fuck off though, because i do not want to deal with it

Show thread

Haelwenn /элвэн/

6h ago

@thesamesam @ariadne @bluca Kind of still feels bad given how overblown a lot of security vulnerabilities are (I guess ICANN and registries will get more money from website-logo vulns), plus imagine getting a big wave of low-impact security vulnerabilities.

But well that's roughly the same issues as with fuzzers, except it's combined with codegen this time.

Show thread

sam 6h ago

@lanodan @bluca @ariadne Yes, exactly, it really is fuzzers all over again, just the problem is you now have this script-kiddy enabling tech on top.

Show thread

Ariadne Conill 🐰

6h ago

@thesamesam @lanodan @bluca yes, but script kiddies also figured out how to use the fuzzers and submit slop to us with "can you tell me about your bug bounty program?"

Show thread

Haelwenn /элвэн/

6h ago

@ariadne @thesamesam @bluca I think it's the kind of thing where I could end up replying "Here's my hourly rate for support requests"

Show thread

bluca 6h ago

@lanodan @ariadne @thesamesam our security bug bounty in systemd was 99.99% garbage until end of last year. Since then these tools have got way better, and I'd say there's a ~10% valid security bugs, ~70% valid bugs but not security relevant, and ~20% garbage. I'll happily take the 10% of real, valid issue found for the price of having to shoot down ~20% of garbage. The key is to have no mercy - there's no arguing or bargaining involved, a crap report gets binned, end of, no discussions

Show thread

bluca 6h ago

@lanodan @ariadne @thesamesam the 70% of valid-bugs-but-not-vulnerabilities is kinda 50-50 our fault and the bots fault. The bots fault because it's a dumb LLM in the end, it doesn't understand the big picture (well doesn't "understand", full stop). Our fault because a lot of the security models are pretty much implicit, and scarcely documented if at all, so the bot has nothing to keep it grounded to reality

Show thread

Ariadne Conill 🐰

6h ago

@bluca @lanodan @thesamesam yes, in our own experiments at work, we are having to write a lot into the system prompt in order to inform claude about the threat model.

otherwise it does silly things like "zones have device nodes in them that allow accessing hypervisor services"

well, yes.

i would hope so.

considering that it's running in a hypervisor, and you need those services to access secure enclaves, for example.

Show thread

sam 6h ago

@ariadne @lanodan @bluca yeah, and even before fuzzers with any sort of security tooling actually ("hello your CSP policy is missing on ur static website")

Show thread

bluca 6h ago

@thesamesam @lanodan @ariadne and I'm pointing out that the distinction is specious and a glaring case of double standards. Everyone uses who uses these tools does so in different ways, and you don't get to do moral grandstanding just because you arbitrarily drew a line in the sand where it's most convenient for you, and not a millimeter further. Doesn't work that way, sorry

Show thread

LisPi 53m ago

@thesamesam @bluca @lanodan @ariadne I'm fairly certain you're being baited.