Pretty much EVERY book I've ever published got stolen by Meta and is listed in this database. That's over 30 books, a 25-year career's output. (Need to find a UK class action lawsuit to join, or a US one that's open to non-US residents whose work was published in the USA.)
https://retro.pizza/@digitalraven/114199906574357235
Everyday Cyborg (@digitalraven@retro.pizza)

Seven of the #RPG books I worked on were pirated by #Meta to train their #AI #Bullshit. Regardless of my feelings on copyright, the IP owner did not consent to their inclusion in the dataset. Meta's use is fundamentally immoral to the point that my own works will have an exclusion to their existing permissive licences to say "Fuck you and your idiot autocorrect". See if they've pirated your work here: https://www.theatlantic.com/technology/archive/2025/03/search-libgen-data-set/682094/

@cstross that's a tough case, because they aren't copying, they're ingesting. The defense is that it's like borrowing a book and reading it, rather than taking it or making it available.
@quinn Well, apparently Sam Altman's AI-generated "short story" included a verbatim line from Nabokov, so I'm guessing if you asked for a book in the style of Charlie Stross the output would almost invariably include something I could sue them over as a breach of UK Fair Dealing copyright law. (Which TBF is a little more restrictive than US Fair Use law.)
@cstross that gets into a fascinating question: would getting a model to reproduce copyrighted works be the best way to construct their liability?
@quinn That's, in my opinion, easy enough: base it on PLR (Public Lending Right). Under PLR, the government pays into a pot that is distributed to creators on the basis of how many loans are made by libraries. For AI corps, it'd be the companies paying, and PLR would disburse funds to creators based on how frequently the works in question are used in LLM generative output. Lots of fine details to hammer out, but there's an actual existing framework to model the solution on.

@cstross That's a really interesting approach. I wonder how you would construct how much latitude you need for something to be considered sui generis.

Defining a threshold for sui generis seems like a big deal in this kind of law.