Seen on LinkedIn “If an LLM wrote your code, you can't really claim that it's proprietary”
The floor is open for discussion.
@aalmiray There is literally already a case where a judge ruled exactly that.
The case is not about source code, but from a legal point of view that makes no difference.
https://ifrro.org/resources/documents/General/German_Court_OpenAI_Memory_Output_Infringe_Copyright_NOV25.pdf
In short: GEMA was able to extract the full lyrics of a few well-known German songs from ChatGPT. And the court ruled that if this can be proven (which it was), then the prompt was NOT the intellectual effort that passed the threshold of originality. 1/2
@aalmiray 2/2
Instead, OpenAI infringed the copyright of the original musicians, who still own the copyright, including the parts you extract from OpenAI. If you apply all this to software, then the authors who originally invented/wrote the source code in the training data are STILL the copyright holders. And the original licenses of those source code parts also still apply.
In other words: if you vibe code something, then YOU do NOT own the copyright nor can you define the license!
@struberg This is the thing I'm uncertain about: whether the license of the data ingested for training is transitive or not. Either answer brings a host of issues and opportunities, but we seem to be operating as if it didn't matter at all.
Once law and regulation are in place we’ll know how to properly handle this situation. Myself, for now, I block any AI contributions to my FLOSS projects.
@aalmiray
Right, it's a legal minefield. And an ethical one: 'stealing' ideas is not nothing!
Assume a piece of code is licensed as, say, GPLv2 and an LLM is trained on it. If said LLM later generates code in your project that is even just similar to that code, then the generated code is legally also GPLv2. And due to the virality of the GPLv2, all your other code might be too.
In the end it doesn't matter whether OpenAI's etc. LLM took it from that other project or a human did, does it?
@struberg I think it’s worse than just generating code that may closely resemble the inputs. Just ingesting code is enough.
IIRC the engineers working on J9 could only rely on the spec to create their own JVM implementation, and were not allowed to look at the OpenJDK implementation for ideas/inspiration, as that could taint the result.
If this is how humans behave with code licenses, why are LLMs treated differently?
@struberg @nikolausf it doesn’t have to be trained with lots of GPL code. Just a single entry ingested is enough. That’s the virality of said license.
Anyhow, this shows that we're navigating with uncertainty, and that's dangerous.
@struberg Even more so: yesterday I had a conversation with a developer who believed that because he paid for the GenAI tool he could do whatever he wanted, and that this licensing thing is not an issue 🤯
As if just because you paid money you could disregard the ToS of the tool/service.