It’s unlikely that training AI constitutes copyright infringement where copyright law centers on the impact of a work’s reuse. #OpenAI’s #ChatGPT creates new works based on training data. If you read a murder mystery, then write a new murder mystery, you haven’t violated a copyright unless you have, to some extent, copied parts of the book you read. Similarly, book and movie reviews are legal. As for the web at large, scraping without attribution will be destructive.
https://www.pcmag.com/news/sarah-silverman-sues-meta-openai-for-copyright-infringement
Sarah Silverman Sues Meta, OpenAI for Copyright Infringement

The comedian joins two authors in a lawsuit that claims AI was trained on their copyrighted books.

PCMag
Yes, I get that. None of this feels good. If #AI has made current copyright law wrong or incomplete, we should change the law. The tradeoff is that some countries will restrict the training data available to an #LLM, and others won’t. Those with fewer restrictions, like Japan, will have smarter, better models. Given the transformative nature of AI, the legal restrictions a given country applies to AI training data will have significant economic (and maybe military) consequences.
@bretcarmichael I question whether authors have standing to make a copyright claim. They’ve assigned the copyright to their publishers. Expect publishers to take action eventually. I expect a replay of the Google Books lawsuit.
@kbiglione I didn’t consider that. While I’m no lawyer, your point makes sense to me.