The AI hype-cyclone is bad, but so is the anti-AI witch hunt. Commits co-authored by Claude do not mean that a project has "abandoned engineering as a serious endeavor"

Would we say that accepting contributions from new developers means we've "abandoned engineering as a serious endeavor"? No.

Claude can write wrong code. New contributors can write wrong code. What matters is what you do with that code after it's been written.

@nedbat Strictly speaking, that's true; and I think the fact that Claude is credited as the developer of some code speaks poorly of the effort of the developers involved.

Take the attrs library's AI policy, for instance:

Every contribution has to be backed by a human who unequivocally owns the copyright for all changes. No LLM bots in Co-authored-by:s.

That seems wise! If you're not so confident in the code that you'd type it with your own fingers, you're not confident enough to commit it.

I wish CPython would adopt that, too.

@clayote "No LLM bots in Co-authored-by:s." Does this mean if Claude wrote some of the code you don't want it noted in the commit?
@nedbat I want the author to take the fall, if it turns out to be uncopyrightable
@clayote There's a person who is the author of the commit. I don't understand the requirement to not mention an LLM in a "Co-author" line.
@nedbat It could be used to argue in court that the code isn't copyrightable, and therefore its license is unenforceable
@clayote Seems like then the rule is, "You must not disclose that you did something to put the license at risk"? I feel like I'm still not understanding. I would understand if the intent of the rule was, "You must not use an LLM to create contributions."

@nedbat The rule is that you have to own your code, legally speaking

I would hope that would result in people owning their code in a moral sense as well, but this isn't a code of ethics

@nedbat It doesn't seem that different from having a policy against "classical" plagiarism. Yeah, a contributor might contrive to get plagiarized code into your codebase anyway; it's probably not even that hard. But it's good to require your contributors not to do that! Enforcement is just hard, that's all.

@nedbat But the classical anti-plagiarism policy wouldn't forbid someone from reading code from somewhere else, nor even writing similar code

If Claude generates an "okayish" patch, and you rewrite it all by hand so it's actually good, no sweat

@nedbat (i am actually gravely disappointed in my profession that we did not roundly reject these tools for reasons unrelated to law or code quality. but that's neither here nor there)
@clayote @nedbat The CLA requires you have the appropriate rights to code you contribute, so if using AI made that untrue then the contributor is the one who's on the hook for signing the CLA.

@nedbat @clayote that's impossible to enforce and I honestly don't want to. to each their own. the only things I can and will enforce are a) is the contribution good? and b) does it _look_ – technically and legally – as if the contribution is coming 100% from a human? I'm sure there's gonna be a ton of litigation around this stuff so I reckon it's better to be careful. I will sit out the culture war part of this.

https://en.wikipedia.org/wiki/Artificial_intelligence_and_copyright

Artificial intelligence and copyright - Wikipedia

@hynek @nedbat @clayote That makes sense to me. To be honest, I don't understand why people include a co-authored-by line for an LLM. What are they implying by doing so? I assume the LLM adds it, and the author simply doesn't remove it. Which makes it feel like unwanted advertising on the part of the LLM company, to me...
@pfmoore @nedbat @clayote That’s exactly what it is and other problems aside I don’t see why my projects should be free billboards for them
@hynek @pfmoore @nedbat @clayote I think stripping out the co-authorship is a good starting point for having a policy.
@brettcannon @hynek @pfmoore @nedbat @clayote If folks allow bots access to their PR branches, the co-author metadata gets added by the squash merge based on the commit history. It *doesn't* get added when folks use LLMs locally without making commits attributed to the bot. "Coauthor" git metadata often doesn't mean much (e.g. GitHub will set it if trivial PR suggestions are accepted), but is sometimes significant (e.g. CPython backport PRs are mostly bot-attributed, with human co-authors)
@ancoghlan @brettcannon @pfmoore @nedbat @clayote which kinda makes it a great brown M&M, doesn't it
@ancoghlan @brettcannon @hynek @pfmoore @nedbat @clayote Claude Code does add that trailer when creating commits locally. (I dunno if that's what you meant)
@ancoghlan @hynek @pfmoore @nedbat @clayote You're right it gets added, but you can always edit the commit message in the squash merge to at least strip out the `Co-Authored-By` line.
@brettcannon @ancoghlan @hynek @nedbat @clayote Only if your project does squash merges...
@hynek @pfmoore @nedbat @clayote It is the new "Sent from my iPhone"
@hynek @pfmoore @nedbat @clayote saw this post this morning and decided to write a pre-commit ad-blocker
https://github.com/tmr232/precommit-ad-blocker
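The linked hook aside, here is a minimal sketch of the idea: a git `commit-msg` hook that drops LLM `Co-Authored-By` trailers while keeping human ones. The bot-name list is an assumption for illustration, and this is not the linked project's actual code.

```python
#!/usr/bin/env python3
"""Hypothetical commit-msg hook that strips LLM co-author trailers.

A sketch only; the bot names matched below are assumptions, not an
exhaustive or authoritative list.
"""
import re
import sys

# Matches trailer lines like "Co-Authored-By: Claude <noreply@anthropic.com>"
BOT_TRAILER = re.compile(
    r"^co-authored-by:.*\b(claude|copilot|chatgpt|gemini)\b",
    re.IGNORECASE,
)


def strip_bot_trailers(message: str) -> str:
    """Return the commit message with bot co-author trailers removed."""
    kept = [line for line in message.splitlines() if not BOT_TRAILER.match(line)]
    return "\n".join(kept).rstrip() + "\n"


if __name__ == "__main__" and len(sys.argv) > 1:
    # git invokes commit-msg hooks with the message file path as argv[1]
    path = sys.argv[1]
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(strip_bot_trailers(text))
```

Human `Co-authored-by` trailers (e.g. on CPython backport PRs) pass through untouched, since only the listed bot names match.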
@nedbat @clayote I can imagine that some people will look more closely at AI-contributed code when committing it under their own name.
@nedbat @clayote our projects make a distinction: humans go to co-authored-by, tools go to generated-by/assisted-by
@nedbat the key term here being "co-authored".

@nedbat One thing I'll be interested to see is how developers respond to the inevitable rug-pull, because access to the models is currently being offered at unsustainable prices (and I mean that even in the context of "pushing electrons around is very cheap"; training the models and evaluating hundreds of thousands of queries per day is not too-cheap-to-meter). So when that price hike comes, some companies and users will choose to pay and some will fall off.

Of those who fall off... I wonder if any will roll their own replacements? Executing the model is cheaper than training it, and I don't think software patterns change so quickly that an infrequently-trained model would fall out of use. So it might only take one weights exfiltration for a lot of users to be able to spin up their own "good enough Claude" in a server farm...

@nedbat I think the criticism is more so from an ethical POV

@nedbat Why accept the AI companies' framing of an LLM as a person? What's next, vim as a co-author? coverage as a co-author?

Also, why use the phrase "witch hunt" for something which is very much not a witch hunt? There are all kinds of reasons to hate on LLMs.

@neoluddite In particular, I was disappointed that someone decided that the entire CPython project had "abandoned engineering as a serious endeavor." Yes, AI has problems, but "Claude" in a commit message doesn't mean the whole project is out the window.

I don't consider AI a person, and I'm fine with leaving off the "Co-author" line.

@nedbat Given all of the other commits that I've seen where Claude and other LLMs (and the people running them) have no idea what they're doing, it's a pretty fair accusation.

Maybe Python has strong engineering in place, but given ALL of the issues with LLMs, the onus is very much on the people using it to demonstrate that it's a) done well, engineering-wise and b) has some major benefit that's unachievable in any other reasonable way.

@neoluddite Yes, same as with new contributors.

@neoluddite @nedbat OTOH, over the decades we've developed strong practices to detect and filter bad code, preventing it from being merged.

The big issue with LLMs is that those generating and submitting bad code can now do so with far less effort than it takes those who have to weed it out.

It's the user, not the tool.

@FunkyBob @neoluddite There definitely are dynamics to watch out for. I trust the CPython core team to guard their time and take care with contributions.

@nedbat consider the problem of a structural engineer testifying at inquest over a bridge failure. The choice to defer decisions to an unexaminable generative model is qualitatively different to decisions deferred to one that is fully explainable where all the inputs can be forensically examined.

I think there is a good argument that this is the line which marks the project as a "serious engineering endeavour" from a common understanding of what it means to be an engineer because it gets to the ethical component of the profession. If the methods are obscure to forensic audit then you may be an engineer but you're obviously not being serious - it's a repudiation risk.

The engineer must take authorship of code produced by tools, and so incurs the risk of misrepresenting the authorship of poor-quality code that could be little more than lossy-compressed copies of stolen copyrighted training data, with error correction informed by custom prompts.

Can you work through these issues to your own satisfaction for a project according to a known risk profile? Sure. Can you pass it off as best practice? I don't think so. It's a matter of engineering ethics which determines whether it gets to the point where a coroner decides the answer to that question.

@octarine_wiggle Every commit is owned by a person who has vouched for the quality of the commit, and every commit is reviewed by a person who has also vouched for it. Making a commit with Claude's help doesn't change that.
@nedbat is the reviewer also to be assisted by Claude?

@nedbat Do you know why these CPython core developers started using Claude Code? Setting aside whether it's a good or bad idea, I think it would be useful to know the why.

Is it to save time? Or maybe to get through work that they find boring? Or perhaps Anthropic just gave free subscriptions to the core team as part of their donation? What is it?

@nedbat Knowing the reasons might help me understand and feel less disappointed. The optics without any context are bad, because I would imagine there's lots of people who would be willing to contribute to CPython if given the chance, and if this trend continues they'll have fewer opportunities to do so.

@nedbat If this is in response to my toot about the CPython repo on GitHub, may I take the opportunity to respond?

If not, I apologize for interrupting.

@xgranade You may always respond. The quote was from a reply to you.

@nedbat Appreciated, thank you. And yes, I realize that was a reply; I read that as coming from a place of frustration (a place I share, for what it's worth) rather than a literal statement.

That said, and to the direct point that you made, I don't think that calling out the use of AI products is a witch-hunt. AI is an effort to undermine labor and enclose common infrastructure, and I believe it is a fair thing to believe that there's a proactive duty to resist the adoption of AI products.

@xgranade I took the statement as sincere, because how else would I know how to take it? There are many concerns about AI, many of which I share. Making hyperbolic statements that discard the work of a highly qualified and dedicated core team doesn't help address those concerns. They aren't vibecoding.

@nedbat Something can be sincere without being literal? Frustration is a very valid emotion, to be sure, and watching tools that have *zero* appropriate uses being included in open source projects is a deeply frustrating thing to see.

With respect to the core team, I understand, which is why I was clear to point out that this is a systemic issue. The engineering standards set by Python cannot be completely upheld no matter how competent the core team is so long as PSF *policy* allows AI.

@nedbat Regardless, that's all why I got into replies and asked people to not pick on individuals, or even Python. My toot was, as I mentioned several times, by way of using Python as an example of a broader problem.

@nedbat For clarity, where I agree with the quote that you included is that I do not think that the use of AI is consistent with good engineering practice — to the extent that a project adopts AI, that is necessarily a compromise of engineering principles.

In the case of the CPython interpreter, that seems to have been a fairly small number of well-isolated commits so far, but absent any mechanism to reject AI-generated code, I don't know how to uphold Python's engineering standards.

@xgranade The important thing is to reject bad code. There are mechanisms for that. I've assumed that the concern was the possibility that AI code is bad code. New contributors can also contribute bad code. Yet welcoming contribution policies are not considered incompatible with "serious engineering."

@nedbat I agree that's *an* important thing, but I don't agree that it's the only important thing. Ethics matter, for instance, and on that basis alone, we have a strong moral imperative to reject AI products.

Setting that aside, though, even from the limited perspective of rejecting bad code, we similarly have a strong imperative to reject AI products.

I think the comparison to new contributors is somewhat misleading in trying to get at why.

@nedbat New contributors still have some understanding of the code they write, even if imperfect. They can be tutored and mentored into offering more valuable and useful contributions. They can be taught how a codebase works and grow to become maintainers.

AI products do none of that. LLMs do not "understand" anything, and cannot by construction do so. There is no process by which bad AI-extruded code can become good AI-extruded code, and so it's on us to reject it.

@nedbat I set aside ethics earlier, but I do think the ethical dimension is important here — to that end, and at the risk of oversimplifying, there are four main reasons to reject AI products on an ethical basis:

• They are founded on eugenicist philosophies. This is not hyperbole, but is a well-established fact.
• Financially, they largely benefit fascist movements.
• The environmental cost is untenable.
• AI products work by devaluing and exploiting labor.

@nedbat I think "serious engineering" demands a few things of us. We must use best available theory to understand our designs, we must act responsibly with respect to our professional communities, and we must act ethically with respect to broader society.

All of those duties, all of what taking software engineering as a literal engineering discipline demands, they are all completely incompatible with adoption of AI products.

@nedbat I recognize this is getting long, and I apologize for that — you raised something that has a lot of moving pieces, and it would be dishonest of me to respond in incomplete detail.

The last point I'll make, then, is with respect to "welcoming." We already have strong evidence that we cannot *and should not* be welcoming to all potential contributors, as per the Paradox of Tolerance. That's why we have codes of conduct.

AI products should be rejected following the same logic.

@nedbat I do not believe you can both be welcoming of laborers and AI products, given that AI products stand directly opposed to labor interests. I do not believe that you can both be welcoming of trans people and AI products, given how AI vendors act in global politics. I do not believe that you can be welcoming of younger people who disproportionately bear the costs of climate change and AI products, given their environmental impact.

@nedbat,

Would we say that accepting contributions from new developers using slave work means we've abandoned morality?

#NoLLM #NoAI