So Anthropic employees are using Claude Code to contribute AI-generated code to open source repositories and hiding the fact using their own internal “undercover mode”.

Totally trustworthy people.

(Any open source project that at the very least requires disclosure of AI-authored contributions should immediately ban Anthropic employees on principle.)

#AI #Anthropic #ClaudeCode #subterfuge

@aral Honestly I don't actually hate this.

It's a tool. The _user_ is responsible for what they're submitting. It's putting code generated by them in their name. I think this is actually good.

@aredridel @aral I really can’t agree with this, because it’s a question of accurate labeling not of “responsibility” or “authorship”. co-authored-by is perhaps the wrong method for labeling such things, but consider raw milk. ultimately, it is indeed the producer’s responsibility to ensure their product is free of contamination. but disclosure of its method of production is explicitly the kind of requirement that allows consumers of said product to make safe choices

@glyph Yeah, I disagree. Code isn't ingredients and it's not “contamination” any more than you should label “I used search and replace on this”.

What you want to know is whether it was well engineered or not.

And in fact, this is almost entirely orthogonal to “safety”. This is an engineering product. The safety comes from processes and whether or not _anyone checked the work done was right_, not the inputs.

@aredridel @glyph It is ingredients. It's not search-and-replace. It's literally incorporating parts of an unknown set of almost-surely-copyrighted works, without license or attribution, into the submission the person is misrepresenting as their own.

@aredridel @glyph What "AI coding tools" *should* be putting in commit messages is:

Co-Authored-By: An unknown and unknowable set of people who did not consent to their work being used this way and to which there is no license for inclusion.

@dalias Morally arguable but not actually true under the copyright regime that exists.

At what point does learning from others constitute their authorship?

@aredridel LLM slop is nothing like "learning from others".

But if you recall, we even took precautions against that. FOSS projects reimplementing proprietary things were careful to exclude anyone who might have read the proprietary source, disassembled proprietary code, worked at the companies who wrote or had access to that code, etc.

@dalias Yes. Do you know why?

@aredridel So that it would be abundantly clear, in any plausibly relevant jurisdiction, that the work was not derivative and not infringing.

@dalias @aredridel A test which LLMs fail by the very virtue of their functioning mechanisms.

It's all fundamentally derivative of the training dataset and it has been exposed both to AGPL and to proprietary datasets.

@lispi314 Has any legal authority weighed in on that claim yet?

@aredridel @lispi314 The facts of the matter are completely and utterly obvious.

Now, we live in a world where legal authorities are under complete capture by billionaires pushing this drug, so I am not going to make any predictions about how courts will rule. Even if they do rule in favor of these companies, those rulings will not be treated as precedents that benefit us.

And they will not be accepted by our communities.

What defines FOSS is not whether a court says it's non-infringing, but whether our communities agree that it was made respecting the intent and consent of the authors who licensed it.

@dalias Have you checked with the Free Software Foundation about that?

(Seriously, if it's a moral argument you're making, it's way stronger if you actually make it!)

Now "respect the intent of the author" is a fascinating concept and one worth examining!

@aredridel The FSF is a fan club for a sex pest, so no, I have not checked with them. I am speaking for the communities I would want to be a part of.

@dalias Right. You're appealing to a definition of "FOSS" without it being entirely clear what that definition is. And the people who usually do have (some) claim to that authority, and the common uses of the term, are not the ones you're invoking.

I'm sympathetic to that, but I can't tell what it is when the appeal is to an unstated norm of a community that I can't quite identify.

@aredridel @dalias it's an existing community that's pretty well-defined as:

Everyone who believes that the *intent* of open-source licenses should be respected regardless of whether legal machinations actually enforce that.

It's super interesting to observe right now how that community is smaller than "people who say they're committed to FOSS", but the community is clearly at least a substantial subset of open-source contributors and maintainers. Regardless of what happens with the whole current "AI" debacle, we're mostly going to continue building human-authored code, giving it away for free (with an attribution requirement or more), hoping that others will respect that simple requirement, and shaming/shunning those who flout it and brag about doing so (or, in this case, try as hard as they can to maliciously break the good citation practices and attributions that are part of the lifeblood of the community).

Some think this community will shrink and atrophy over time; others imagine it will be around cleaning up the mess after the AI bubble bursts. Whatever your expectation, saying "I think it's fine for Anthropic employees to actively undermine open-source attribution principles" tells everyone clearly that you're not interested in being part of the community that cares about those.

@tiotasram @dalias Yeah, that's never been at all unified. As long as I've participated in free software and free culture movements, there's always been a legalist side, an ideological side, and a community-oriented side at least, plus the schism of 'permissive' vs 'copyleft'.

Never mind the corporate vs hacker aspects.

@aredridel @dalias

There's certainly a ton of different ideological approaches that contradict each other coming into play; IMO that makes for a healthy community from the anarchist perspective, and events like this where incompatible parts of it shuffle off are normal and acceptable. I'm going to vigorously oppose LLM-generated code and those who defend/promote it. The people in that camp who once believed themselves to be part of the "open source community" are going to have to reckon with the fact sooner or later that the tools they promote are inimical to that community. One side or the other will win the ideological battle over the term "open source", but the two camps won't be collaborating as freely any more given that one of them is actively preying upon the work of the other.

So far to me this schism seems much deeper than the permissive vs. copyleft debate (although to some extent it cuts along some of the same fault lines).

@tiotasram @aredridel @dalias @timnitGebru @emilymbender The revelations about Claude and its ecosystem this week are increasingly weighting me against Generative AI in FLOSS projects generally. Bruce Perens clearly saw the writing on the wall when he left the Open Source Initiative and founded PostOpen.org.