• Claude code source "leaks" in a mapfile
  • people immediately use the code laundering machines to code launder the code laundering frontend
  • now many dubious open source-ish knockoffs in python and rust being derived directly from the source

What's anthropic going to do, sue them? Insist in court that LLM recreating copyrighted code is a violation of copyright???

@jonny do LLMs trained on gpl code have to be gpl? I don't know whether code-as-data is equivalent to code as executable, but I had honestly never considered that issue before.
@srvanderplas
They sure don't! Or at least if they did the entire industry would collapse overnight.
@jonny @srvanderplas
well, IANAL, but:
1) I have published GPLed code, and AFAIU, if the produced code is *linked* to the GPLed code or requires the GPLed code to run, then to redistribute the new code it MUST be GPLed.
2) last I checked, the US court system was of the opinion that work produced by AI was NOT COPYRIGHTABLE. AFAIK, that should include any produced code. Other jurisdictions may have differing laws.
@traecer @jonny @srvanderplas I don't think your interpretation of 1) holds up. I should be able to distribute with any license I want (even a proprietary one) some code that theoretically depends on your GPLed code to be compiled, so long as I don't actually distribute your code together with mine and I don't distribute the compiled program. It being 'required to run' does not trigger GPL by itself if it hasn't been run in the first place.

@traecer @jonny @srvanderplas I think this is unrelated to the question anyways. If AI-generated code is considered a derivative work of some GPLed code, then GPL does apply to it. No need to think about linking code or dependencies.

And as you say, courts seem to generally consider AI-generated code as public domain, so I would guess GPL is pretty much unenforceable in this context.

I am not a lawyer either though xD

@LaquinArt @traecer @jonny @srvanderplas You might find this interesting regarding copyright and AI generated code: (This isn't legal advice, watch and draw your own conclusions.) https://hachyderm.io/@ell1e/116313321022811490

@ell1e @traecer @jonny @srvanderplas The `isEven` example is really funny. xD

Yeah, I mean, if the AI-generated code is a blatant copy of some code in the training data, I don't think there will be much of a doubt that it's a copyright violation.

But when the generated code starts diverging from the source, I think it's a more legally gray area. I would also consider it a copyright violation, but it's not me who needs to be convinced about this. It's judges. And I haven't seen them agree yet.

@LaquinArt @traecer @jonny @srvanderplas The person in the video is a lawyer, just to let you know. Also there's this: https://www.twobirds.com/en/insights/2025/landmark-ruling-of-the-munich-regional-court-(gema-v-openai)-on-copyright-and-ai-training

My main intention was to give you resources that may inform you about how settled (or unsettled) what you previously said really is. Not that I know myself, since I'm not a lawyer. This isn't legal advice.

But there are plenty of sources saying that LLMs copying directly seems to be a regular event, not a rarity: https://dl.acm.org/doi/10.1145/3543507.3583199

@LaquinArt @jonny @srvanderplas
"I should be able to distribute with any license I want (even a proprietary one)"

Nope, that's the viral nature of the GPL. If you link to GPLed code and intend to distribute your new code, it MUST be GPL as well. That helps to ensure the FSF's idea of "software freedom" aka "copyleft". This is why the Linux kernel license has an explicit exception to GPLv2 to ensure Linux syscalls can be made by user space code without distributing the user space code under the GPL. (See: https://www.kernel.org/doc/html/latest/process/license-rules.html) It's also one of the reasons so many open/free source projects, like Perl or Firefox, use dual licenses, and why more permissive licenses like the Apache 2.0 and MIT licenses are so popular.

What you have described is the Lesser GPL (LGPL) license, and yes, code under the LGPL does NOT require your code to have any particular license, provided you distribute any changes to the original (assuming you intend to distribute the original code at all).

@traecer @jonny @srvanderplas Did you read the syscall exception yourself?

‘This exception is used together with one of the above SPDX-Licenses to mark user space API (uapi) header files so they can be included into non GPL compliant user space application code.’

The exception allows you to *include GPL code*, which I also said triggers GPL.

Merely referencing GPL code does not trigger GPL so long as you don't include a GPL work. Otherwise, reimplementing APIs would be illegal. And we know it's not.

@traecer @jonny @srvanderplas Oh, look! The following paragraph makes this even more clear:

‘NOTE! This copyright does *not* cover user programs that use kernel services by normal system calls - this is merely considered normal use of the kernel, and does *not* fall under the heading of "derived work".’

@traecer @jonny @srvanderplas Upon further investigation, I see that the FSF considers a work that dynamically links against a GPLed work to be covered by GPL. From what I know, this has never been tested in court, and I don't think it would hold up.

Sure, once you run the program, there exists a combined work that should be subject to GPL. But this combined work is generated by the user and never distributed, so GPL is never triggered.

@traecer @jonny @srvanderplas It's not really a question about GPL as much as about copyright law. I don't see how just linking a library dynamically constitutes a derivative work if no part of the library is distributed. And if it's not a derivative work, the license doesn't even come into play. As the licensor, you don't get to decide what constitutes a derivative work. That's for a court to decide.