Folks asked me about the #chardet #LGPL situation. @corbet (on @lwn) published https://lwn.net/Articles/1061534/ about it; everyone should read that.

Afterwards, take a 👀 at my comment on chardet's issue tracker:
https://github.com/chardet/chardet/issues/355#issuecomment-4145369025

TL;DR: I'm leading an effort at @conservancy to analyze this situation. The results will be published. It will take a long time — for good reason. Meanwhile, anyone using chardet commercially should call their lawyer.

#LGPLv2.1 #copyleft #SFC #SoftwareFreedom #LLM #AI #copyright

The relicensing of chardet

Chardet is a Python module that attempts to determine which character set was used to encode a [...]

LWN.net

@bkuhn @corbet @lwn @conservancy i'm very sorry (speaking to bradley) to interject in the following manner ; and i speak now to the lwn article here—which is one of the most snide and petty and unprofessional articles i've seen.

In 2026, though, that inability has, according to Blanchard, been overcome

it took me three separate times to parse out the meaning of this crucial detail which you left scattered across the ground, like a crime scene, on scraps of paper.

i suppose it's likely nonnegotiable lwn style, but blockquotes that fail to offset the name of the source from the flow of text are incredibly confusing to follow, particularly 3 in a row. it's particularly difficult for dyslexia.

Blanchard, unsurprisingly, disagreed.

the only part that makes this "unsurprising" is that you mentioned the same thing just above. if this was an attempt to mitigate the quote attribution issue, then it's hurting your productivity for your editor not to invest in 50 bytes of css.

Simon Willison has observed, though

that's a curious thing to observe, because you describe the practice right here in the plainest of terms:

A lot of people who are not lawyers have offered opinions

why on earth is willison qualified to speak on this matter? because he finally caught his breath after praising them as he does once an hour?

the rest of the paragraph with willison (whose authority is neither stated nor relevant) is an impressively bold free-for-all:

Beyond that, as others have pointed out, it is easy to ask an LLM to reimplement a body of code in a style different from the original,

that's a neat wikihow you've provided there, but you need a meme image

with the result that similarity checkers will see something entirely new.

oh? similarity checking? because that's not relevant for copyright; it can argue for infringement, but if you learn about this subject for more than half a moment, and give your readers a chance to think of their code like they own it, you'd be describing the labor of the authors as what each of them owns, and the LLM a direct form of wage theft.

there is a field that is known for this similarity-checking! crypto is one of my favorites! in fact you published about it last week https://lwn.net/Articles/1012946 (i'm surprised this survived the weekend purge of several articles mentioning david howells, who backdoored linux-crypto and module signing in the kernel last weekend) it seems quite a travesty to only mention its theoretical use regarding copyright, and not how it's used all week to obscure the backdoors in the 7.0 release candidate.

That does not necessarily break the derived-work link, though.

I suppose when you and I envisioned "labor" it became something different. These "links" are "derived"—that's still crypto jargon? Would you say this carceral imagery of yours is an "opinion"? Or was the opinion mine, for noticing it?

Had an LLM been employed to translate chardet to, say, Lisp, the level of similarity would be quite low, but most would agree that the new code was derived from the original.

wilfully misleading readers regarding the all-powerful capabilities of the chatbot is also not an opinion, yet neither is it fact. i would denigrate this as sponcon but i don't think it's that.

The fact that the training corpus for Claude surely included all previous versions of chardet also muddies the picture.

oh? you've been allowed into the inner sanctum? they've shown it to you? the training data center that exists in 4 dimensions? that twists and warps and has 12 barycenters?

no one has ever mentioned the existence of some code or not. this again is basic nihilism and it's boring as fuck.

so that was a ridiculously political paragraph which started off with willison. the next line is about lawyers and opinions, which only lawyers can experience. the line after that—how on earth can you seriously do this? you break across a line and then lunge for this new question?

A lot of people who are not lawyers have offered opinions on whether chardet 7.0 is derived from previous versions.

Is that the trick for the 7.0 rc now too? Stochastic generation from previous versions? Because that would explain why all the code was so sloppy, and should never have passed any reviewer—linux-crypto not least, but certainly Linus. But I can't believe you'd simply gloat about it in public and give it away like some sort of comic book villain, so I assume this was merely a trick to cover all your articles with an excuse to type 7.0, so it's harder to search for the ones that made misleading or incorrect statements about it.

But it is worth saying that, if instructing an LLM to rewrite an existing body of code is sufficient to strip copyleft requirements from that code, then the future of copyleft looks even dimmer than it did before.

Compilers may perform an optimization known as constant folding. We can demonstrate it with your fabricated first-order logic statement here:

But it is worth saying that,

No, I don't think it was.

The death of copyleft could, ironically, be part of its real goal: the end of copyright.

please put down the blunt it doesn't make you smarter

Meanwhile, of course, had Blanchard simply shown up with a new Python module, let's call it "detectchar",

ok i no longer blame myself for failing to understand this piece. you don't ever once link to the much more well-known anthropic LLM issue, yet you repeatedly invoke it—again and again! now i understand how you "derived" from 7.0.0—I am actually shocked (but not really) that you'd have worked so hard to divert attention from an actual, literal, and extremely controversial claim!

would you blame my attention? another affliction? obfuscation is clearly the basic purpose of your mission

Hash-based module integrity checking

On January 20, Thomas Weißschuh shared a new patch set implementing an alternate method for c [...]

LWN.net

@bkuhn @corbet @lwn @conservancy it is an incredible insult to bradley to write this absolute screed that spends the entire time advising readers on how to use an LLM to obscure, in the context of chardet 7.0.0 pending very heavily on relative contribution (is that not among the saddest of things? a shared joy together—a divorce?)

the author describes two separate times how to steal other people's labor with LLMs and get away with it (it doesn't work, but neither does the copyright advice). you mention ONCE that there is an actual legal determination (a specific decision that was made by blanchard) that your readers might care about. how can you as a journalist look at collaborators at odds and settle upon these numerical perspectives?

@bkuhn @corbet @lwn @conservancy the article also fails to mention the VERY SPECIFIC HISTORY regarding that one time when the government lab in virginia directed by bob kahn (which still owns gnu mailman) had a public spat with stallman?

That license is incompatible with the requirements for the Python standard library,

how on earth do you spend two paragraphs on names and acronyms assigning blame and avoid even mentioning the PSF license or cpython's LICENSE file

[a statement by stallman's lawyer isn't actionable as law, and in fact neither is it so for stallman himself, nor any man; we did actually have a whole revolution over that small detail]

cpython/LICENSE at a933e9ccee6d3c6753dbb23c38a9c576cc70c33c · python/cpython

The Python programming language. Contribute to python/cpython development by creating an account on GitHub.

GitHub

@bkuhn @corbet @lwn @conservancy and why did lwn delete their articles on david howells proposing to rewrite the kernel in c++ on sunday?

  • right as howells hard-reset force-pushed to remove any trace of his changes
    • (accepted by linus to the 7.0 rc last weekend)
  • which backdoor both module signing and linux-crypto separately
    • under the immensely unserious pretext of adding support for NIST's new post-quantum crypto?

@hipsterelectron I thank you for being a fan of my work but it's not @corbet's nor @lwn's job to write pro-copyleft propaganda. #LWN is one of the only tech publications that works hard to write from a neutral point of view. All journalism strives to dispassionately report; no human is perfect so it's an aspiration, not a certainty.

I disagree w/ a few things in Corbet's article, but if an article is good, everyone knowledgeable re: the situation should disagree w/ part of it. @conservancy

@hipsterelectron @bkuhn @lwn @conservancy Wow. I must confess that I have almost no clue of what you are trying to communicate here. Other than you didn't like the article, that is.

Which articles do you claim we have purged? You should certainly be able to point to them in the Wayback Machine?