Apparently chardet got Claude to rewrite the entire codebase from LGPL to MIT?

https://github.com/chardet/chardet/releases/tag/7.0.0

That is one way to launder GPL code I guess?

Release 7.0.0 · chardet/chardet

Ground-up, MIT-licensed rewrite of chardet. Same package name, same public API — drop-in replacement for chardet 5.x/6.x. Just way faster and more accurate! Highlights: MIT license (previous versi...

@Foxboron lol right, because Claude certainly wasn't trained on GPL code

@scy
US courts are leaning towards the view that LLM-generated code is fundamentally not copyrightable.

This is a different problem to the moral issues I have with this.

@Foxboron But does "is not copyrightable" also mean "is not a license violation of its input data"? I highly doubt it.
@scy
A license violation usually implies that there is a copyright violation to begin with.

@Foxboron Yeah but that's what I mean: Just because the end result is not copyrightable, does that automatically mean that it can't be a copyright violation?

Like, changing the format or medium of something is not a copyrightable work.

So, by that logic, if I take a copyrighted MP3 and convert it to AAC and publish that, my AAC is not copyrightable, but it's not a copyright violation to take it and publish it?

That's what I mean.

@scy
I'm not a lawyer so I'm not going to try and debate what is and isn't a copyright violation.

@Foxboron @scy

This will probably have to go through a court case to be settled

But if I look at your source code, then I reproduce some of your source exactly, that's a problem

@joshbressers @scy

The Supreme Court has already declined to hear such a case.

https://www.cnbc.com/2026/03/02/us-supreme-court-declines-to-hear-dispute-over-copyrights-for-ai-generated-material.html

So we are getting a precedent in US law. Yet to be settled in any high court in the EU though.

@Foxboron @scy

I suspect this is different. That case was someone trying to copyright something the AI spit out, not asking whether AI can violate a copyright by copying something almost verbatim

Of course I haven't looked to see if the chardet code is mostly a copy, if it's not, then 🤷

@joshbressers @scy

Sure, but in this case we are not really looking at, nor discussing, an LLM spitting out something verbatim from another project.

@Foxboron @joshbressers @scy Open-source projects that have sought to be compatible with proprietary software, e.g. Samba trying to be compatible with Windows SMB, etc., have (if I'm not misremembering) taken a "clean room" approach and outright stated they do not want any code from any developer who had even looked at the MSFT code for fear of being accused of infringement.

The copyrightability of LLM output is not relevant here - the only question is whether a court would consider the original license infringed upon in the creation of the output.

As I understand it, though, this is a reimplementation of a codebase by the same contributors -- Dan Blanchard seems to be the primary maintainer before and after the rewrite, so ISTM he'd be able to relicense the project regardless of whether it was passed through an LLM first.

It will be interesting when this happens because a company or person decides "I don't like copyleft, so I'll just run this through the LLM wash until I get a functional copy". But this doesn't seem to be that.

@jzb @Foxboron @joshbressers Maintainers can't just change the license without asking each and every contributor for their approval. In open source projects, contributors usually keep their individual copyright, except when the project has them sign additional terms, or assign copyright to the project or something.

@scy @Foxboron @joshbressers I mean, they _can_ if they rewrite the code in question.

So here - *if* one of the LGPL code contributors is offended by the license change they could look at the new codebase and see if the new code resembles their contribution. Then they'd have to challenge it.
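A rough sketch of what such a first-pass check could look like, assuming both the old contribution and the new code are plain Python files. The function names and the sample snippets are hypothetical, not anything from chardet, and a textual similarity score is only a hint about where to look, not a legal test of infringement.

```python
import difflib
import io
import tokenize

# Token types that carry no code content: comments and layout only.
_SKIPPED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def code_tokens(source: str) -> list[str]:
    """Return the token strings of Python source, ignoring comments and layout."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in _SKIPPED]


def similarity(old_source: str, new_source: str) -> float:
    """Return a 0.0-1.0 ratio of how much of the two token streams matches."""
    matcher = difflib.SequenceMatcher(
        None, code_tokens(old_source), code_tokens(new_source), autojunk=False
    )
    return matcher.ratio()


# Hypothetical snippets standing in for an old contribution and new code.
OLD = "def detect(data):\n    # original comment\n    return guess(data)\n"
REFORMATTED = "def detect( data ):\n    return guess( data )  # moved comment\n"
UNRELATED = "class Cache:\n    size = 0\n"

if __name__ == "__main__":
    print("reformatted copy:", similarity(OLD, REFORMATTED))
    print("unrelated code:  ", similarity(OLD, UNRELATED))
```

Comparing token streams rather than raw text means that reformatting or rewording comments does not hide a copy, but renamed identifiers or restructured logic would still lower the score, so a low ratio proves nothing either way.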

But projects have been relicensed without seeking permission from every contributor and/or by removing contributions if they cannot get approval. I'm not aware of any cases where a contributor has successfully challenged such a relicensing - but there's always a first time.

@scy @jzb @joshbressers

Depends.

If you have a permissively licensed project, you can change the source to GPL by just using a poison pill approach.

This is what Forgejo did as an example.

https://forgejo.org/2024-08-gpl/

This works as the MIT license terms are met.

The other way would not work.

Forgejo is now copyleft, just like Git

@Foxboron @jzb @joshbressers You're right, I should've worded that differently.

They can change the license, if the current license allows it.

Still, everyone keeps their individual copyright.

@Foxboron @joshbressers @scy verbatim isn’t the question here, the question is infringement. is the output here substantially derivative of previous versions of chardet to the point that it could be considered infringing? US copyright precedent is a muddled mess and I think this could implicate at least one unresolved circuit split. I don’t know what the answer will be but I know I wouldn’t want to be standing in the blast radius of that decision
@Foxboron @joshbressers @scy The Supreme Court turned away a case about claiming copyright on generated material. Nobody has thrown out a case about infringement by such generated material.

You can't pursue somebody for reusing your AI material (because such material can't be copyrighted), but you can pursue somebody for having generated AI material from your copyrighted (and so not AI) material.
@Foxboron @scy you could still have an opinion. Legal matters are not a subject to be discussed exclusively by legal professionals, as they affect non-professionals, too.

@muelli @scy
Sure.

But I'm not going to spend time on a strawman disguised as a logic puzzle. That isn't how laws work nor how they are formed.

@Foxboron @scy chances are high that LLM bros suspect it is, that's why they are cutting deals with Big Music. Unfortunately, there's no global-encompassing multi-billion dollar corporation protecting open-source...
@scy @Foxboron Using code to create a (highly) derivative thing off it without honoring the original license is pretty much the definition of a license violation.

@scy @Foxboron It is absolutely a violation for the company which built the model to build a model which emits license-restricted code without following the terms of the license. The model doesn’t commit the violation any more than a photocopier does, of course.

The emitted code cannot be copyrighted at all, but if it emitted the code in a way which meets the terms of the license, the code would be covered by the original license.

@scy @Foxboron It's a bit complicated, actually. IANAL, but this is what I understand:

- The music notation is copyrightable, individual notes are not. A sequence of notes is debatable, and it depends highly on recognizability AFAIK.

- A music recording is copyrightable. Playing that music in a distinctly different arrangement, less of an issue.

- Arguably, a change in digital format is either still the same recording, or sufficiently indistinguishable from it.

- Copyright has an ancient...

@scy @Foxboron ... naming and goes back to a time where making copies and distributing them was the hard part.

This is a non-problem in the digital age, which is why it's fine to create backup copies of copyrighted works, so long as the people accessing them are always the people having purchased/licensed an original copy.

So LLMs training on GPL is not itself a copyright violation, and them reproducing similar code isn't either, but then publishing such sufficiently similar code is.

@scy @Foxboron TL;DR what others already wrote: if the result is similar enough to inputs, the copyright holder of the inputs could challenge it, yes.
@scy @Foxboron If courts decide to throw this out, I would personally *love* for someone to use the exact same argument to produce a minimally altered copy of Avatar, and have Hollywood throw a fit.
@scy @Foxboron Basically, let's not fight this, let the industry giants fight each other. Throw a few near-copies of Metallica songs in for good measure, so we get a v2 of that "Napster baaaad" animation with greedy gnome Lars Ulrich.
@scy @Foxboron Either LLMs will die on the spot, or Copyright does.
@Foxboron @scy No. You can violate the copyright on existing material while creating material that is itself not copyrightable.
@Foxboron @scy hol' up... the *output* isn't copyrightable? That would be awesome if they decided that.
@thomasjwebb @Foxboron @scy They did decide that. AI material is not human generated, so it is not copyrightable.
But that doesn't mean this material can't be copyright infringement. The only dropped case concerned AI people trying to sue other AI people based on copyright, not real humans pursuing AI material.
Currently the NYT is going down this route, and its case looks rock solid at this time:
https://www.nytimes.com/2025/12/05/technology/new-york-times-perplexity-ai-lawsuit.html
New York Times Sues A.I. Start-Up Perplexity Over Use of Copyrighted Work

Filed in federal court on Friday, the suit joins more than 40 other court disputes between copyright holders and A.I. companies.


@thomasjwebb Right now, that is how SCOTUS is leaning regarding AI generated output. They refused to interfere with a patent application and an "artist" copyright, leaving it up to the copyright and patent offices to decide, and those offices said no. Some guy used AI to create a beverage holder and light beacon. When the patent was denied, he tried to copyright the AI created "artist" renditions to get around the patent.

@Foxboron @scy

https://www.reuters.com/legal/government/us-supreme-court-declines-hear-dispute-over-copyrights-ai-generated-material-2026-03-02/


https://www.supremecourt.gov/docket/docketfiles/html/public/25-449.html

YUP

copyright is for humans, not automata ―hard or soft.

so, ironically, the prompts are copyrightable but not the output.

so anything you want to copyright should not be prompted into a corporate regurgitation machine, including so-called grammar checkers.

@thomasjwebb @Foxboron @scy

@thomasjwebb @Foxboron @scy In the US, at least, human authorship is required for copyright, and if you try to copyright something that's a mix of AI and human generated then generally only the human generated part is copyrightable.

https://www.congress.gov/crs-product/LSB10922#:~:text=Granting%20that%20human%20authors%20may,applying%20to%20register%20their%20copyright.

This is separate from the LLMs emitting text other people have written, so at *best* this code can't be licensed because it's not copyrightable, and at worst it's license laundering and there's precedent (IIRC) for stomping on that hard.

Generative Artificial Intelligence and Copyright Law

@Foxboron @scy This means that anything "new" (i.e. nothing) the "AI" brought to the work is not a creative work that you can hold copyright to just because you were the person prompting/using the "AI".

It does NOT mean that the copyright on whatever the AI plagiarized is void. But that's how the industry will try to spin these rulings. We need to point out this distinction and fight their attempts to mislead in order to seize and enclose our work.

@Foxboron @[email protected] That'd be the US system. Then there's the various Euro systems that differ substantially. I'm certainly curious how this will turn out.

On the other hand: it'd require that those who can enforce their rights here actually do so.
IP rights are normally enforced pretty harshly, even against consumers (anyone remember the days of the torrent c&d letters or the traditional find&ban the infringing exhibitor days at Computex et al.?), yet they're effectively completely ignored when it comes to FOSS.
There is virtually no education for biz, cs or law students on this topic, let alone mandatory ed.

Presenting the possibilities and rights to those who hold them is often met with dismissal, especially from developers on the younger side or those who are still in a "hobby" / "non commercial" stage. Only for them to complain about sustainability and demand funding shortly after.

Instead we see demands to throw substantial amounts of tax money at random FOSS projects, based on more or less random criteria and evaluators. Which will totally scale, right?

Virtually every company that was enforced against in terms of FOSS compliance ended up consciously allocating resources to FOSS in various ways. There are a lot of companies and they are a renewable resource in a functional economy.

But what do I know, rite? I just see the cases.
/rant

@Foxboron @scy So if it's not copyrightable, what are they using to apply the MIT license, if not their copyright‽ That makes no sense to me. (I'm not reacting to you but to what you shared, to be clear.)
@Foxboron
Not a copyright fan, but the US court is just plain lazy
@scy
@Foxboron "not copyrightable" doesn't sound like "public domain", but more like "you cannot claim your copyright on it, and cannot slap a license on it". Is that what you mean?
@scy
This. It's driving me crazy that the foundational models cult just doesn't even try to talk about "can we prove what information this has _not_ been exposed to" which.. is like the first question everyone else starts with 😩
@Foxboron