Pretty much EVERY book I've ever published got stolen by Meta and is listed in this database. That's over 30 books, a 25-year career output. (Need to find a UK class action lawsuit to join, or a US one that's open to non-US residents whose work was published in the USA.)
https://retro.pizza/@digitalraven/114199906574357235
Everyday Cyborg (@digitalraven@retro.pizza)

Seven of the #RPG books I worked on were pirated by #Meta to train their #AI #Bullshit. Regardless of my feelings on copyright, the IP owner did not consent to their inclusion in the dataset. Meta's use is fundamentally immoral to the point that my own works will have an exclusion to their existing permissive licences to say "Fuck you and your idiot autocorrect". See if they've pirated your work here: https://www.theatlantic.com/technology/archive/2025/03/search-libgen-data-set/682094/

@cstross that's a tough case, because they aren't copying, they're ingesting. The defense is that it's like borrowing a book and reading it, rather than taking it or making it available.
@quinn Well, apparently Sam Altman's AI-generated "short story" included a verbatim line from Nabokov, so I'm guessing if you asked for a book in the style of Charlie Stross the output would almost invariably include something I could sue them over as a breach of UK Fair Dealing copyright law. (Which TBF is a little more restrictive than US Fair Use law.)
@cstross that gets into a fascinating question: would getting a model to reproduce copyrighted works be the best way to establish their liability?
@quinn @cstross The claims in one of the OpenAI lawsuits (the NY Times one) include that some of their models, when suitably prompted, will regurgitate large chunks of NY Times articles verbatim. Examples start on page 30 of the complaint: https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf

@rst @cstross oh yeah, the verbatim thing is a tough hurdle, but I suspect they will point at the effect being rare. Also, there's a defense that if you come up with something copyrighted completely independently, the liability isn't the same, because it isn't a copy.

And I will say this again, I'm playing devil's advocate a bit to feel out how I think the law will work. Don't kill me 😂