From the article:

Also, in 2022, several unidentified developers sued OpenAI and GitHub based on claims that the organizations used publicly posted programming code to train generative models in violation of software licensing terms.

They can argue about it not being a copy all they want. If there is a single GPL-licensed line of code scraped, then anything they produce is a derivative work & must be licensed under the GPL.

nice.

I’ll play the uninformed devil’s advocate here:

  • Is the GPL license enforceable?
  • And if so, I assume “derivative” will still be subjective to some degree. Where do we draw the line between derivative and non-derivative?
  • I’m torn about my personal opinion about copyrights and software licensing in general. I think the main problem is the huge power imbalance between people and corporations, not so much the fact a company analyzed a bunch of available data to solve programming problems.

    They don’t copy the data and sell it verbatim to others, which would be a legal issue and, in my mind, also a moral issue, as they wouldn’t be adding any additional value.

    1: yes

    2: Normally derivative works are patched or modified versions of the original. I think the common English meaning would apply & ChatGPT et al. are fucked. I doubt there is a precedent for this yet.

    Have there been any lawsuits involving breach of open source licences?

    Here in Stack Exchange we have mentioned some cases of lawsuits involving open source projects that allegedly breached patents and were sued. I'm curious, however, in the opposite case: has anyone ...

    Open Source Stack Exchange