This case shows how Open Source will die. When anyone can just pipe existing code and tests through an LLM and claim the result is "clean room" (which is hogwash), no license can protect your work from being accumulated and monetized by anyone. The commons are actively being shredded in front of our eyes.

https://github.com/chardet/chardet/pull/322

chardet 7.0: ground-up MIT-licensed rewrite by dan-blanchard · Pull Request #322 · chardet/chardet

Summary This PR is for a ground-up, MIT-licensed rewrite of chardet. It maintains API compatibility with chardet 5.x and 6.x, but with 27x improvements to detection speed, and highly accurate suppo...

@tante I have heard people talking about this, but what I don't understand is what license the code was under before the "rewrite". The project's own self-description says "chardet 7.0 is a ground-up, MIT-licensed rewrite of chardet", which sounds like something an LLM would write but doesn't tell me much. (And the PR is big enough to break GitHub's PR display feature, so it's a little hard to figure out what the project looked like before the +14526 -546715 patch.)
@mcc Before the Claude "clean room" reimplementation chardet was licensed under LGPL.
So the dude used Claude (which was probably trained on chardet/LGPL) to generate a new version of chardet with the same API etc., but put it under the MIT license.

@tante Thank you. And do I understand correctly that administratively the MIT rewrite* is "the same library", e.g. the maintainer flattened their own repository and hosted the new* thing at the same github address, same pypi address, same readthedocs address? I hadn't used Chardet previously and search engines point only to this same project.

* "so to speak"

@mcc Yes. They jumped from 6.0.0 (LGPL) to the Claude-generated version with one huge merge request that changed everything and relicensed it. So it is in the same tree: same name, same canonical URL and everything. So the "clean room" argument is at least weakened by putting it in direct succession.

@tante Okay. It's clear now, thanks.

(Presumably, of course, the single maintainer did not write all of that code, and anyone who contributed under LGPL would have expected to be asked for consent before relicensing.)

@mcc Yes, there are contributors who contributed under LGPL, especially Mark Pilgrim, who started the whole project under that license.
@tante @mcc does the fact that AI content can't be copyrighted impact this at all?
@SomeVeganCheeseIsOk @mcc "AI content can't be copyrighted" is a bit of an oversimplification. TBH for the matter at hand it is relevant only in the sense that Claude surely was trained on chardet so it supports the "derivative" argument a bit.