• Claude Code source "leaks" in a mapfile
  • people immediately use the code laundering machines to code launder the code laundering frontend
  • now many dubious open source-ish knockoffs in python and rust being derived directly from the source

What's Anthropic going to do, sue them? Insist in court that an LLM recreating copyrighted code is a violation of copyright???

@jonny do LLMs trained on GPL code have to be GPL? I don't know whether code-as-data is equivalent to code-as-executable, but I had honestly never considered that issue before.
@srvanderplas
They sure don't! Or at least if they did the entire industry would collapse overnight.
@jonny @srvanderplas
well, IANAL, but:
1) I have published GPLed code, and AFAI Understand, if the produced code is *linked* to the GPLed code or requires the GPLed code to run, then the new code MUST be GPLed in order to be redistributed.
2) last I checked, the US court system was of the opinion that work produced by AI was NOT COPYRIGHTABLE. AFAIK, that should include any produced code. Other jurisdictions may have differing laws.
@traecer @jonny @srvanderplas I don't think your interpretation of 1) holds up. I should be able to distribute, under any license I want (even a proprietary one), some code that theoretically depends on your GPLed code to be compiled, so long as I don't actually distribute your code together with mine and I don't distribute the compiled program. It being 'required to run' does not trigger GPL by itself if it hasn't been run in the first place.

@traecer @jonny @srvanderplas I think this is unrelated to the question anyways. If AI-generated code is considered a derivative work of some GPLed code, then GPL does apply to it. No need to think about linking code or dependencies.

And as you say, courts seem to generally consider AI-generated code as public domain, so I would guess GPL is pretty much unenforceable in this context.

I am not a lawyer either though xD

@LaquinArt @traecer @jonny @srvanderplas You might find this interesting regarding copyright and AI generated code: (This isn't legal advice, watch and draw your own conclusions.) https://hachyderm.io/@ell1e/116313321022811490

@ell1e @traecer @jonny @srvanderplas The `isEven` example is really funny. xD

Yeah, I mean, if the AI-generated code is a blatant copy of some code in the training data, I don't think there will be much of a doubt that it's a copyright violation.

But when the generated code starts diverging from the source, I think it's a more legally gray area. I would also consider it a copyright violation, but it's not me who needs to be convinced about this. It's judges. And I haven't seen them agree yet.