I've just discovered that Meta illegally used one of my books ("I Can't Believe It's Not Buddha") to train its AI.

They are paraphrasing (uncredited) the contents of my book to make money.

I am not okay with this.

Sarah Silverman and others have started lawsuits. I hope some of the less rich among us can team up and do likewise.

h/t @petergleick

https://www.theatlantic.com/technology/archive/2023/09/books3-database-generative-ai-training-copyright-infringement/675363/?gift=QdM39RGtR94-pmclr4oVwi9lpuV63yswBnoweowTTIM&utm_source=copy-link&utm_medium=social&utm_campaign=share&s=03

These 183,000 Books Are Fueling the Biggest Fight in Publishing and Tech

Use our new search tool to see which authors have been used to train the machines.

The Atlantic
@bodhipaksa @petergleick Is it actually illegal? I don’t know if copyright extends to AI learning models, at this point?

@dubiago @petergleick Is taking a pirated copy of a book, feeding it into a computer, and then paraphrasing the contents of the book in order to make a profit illegal? Yes. Yes, it is.

Copyright law existed before LLMs were developed. New technologies aren't magically exempt from those laws just because they're new.

@bodhipaksa @petergleick Was it pirated? Or was it legally purchased?
@dubiago @petergleick It was pirated, along with many others, hence all the lawsuits OpenAI is facing.
@dubiago @petergleick At a certain point one has to lose patience and say, "RTFA."
@bodhipaksa @petergleick Yes. I did. There is a *claim* of piracy. But we don’t know the source of these books. The article doesn’t specify; my guess is that the author doesn’t know. Ambiguity, and assumptions launched from that ambiguity, are always fraught with danger… not that people with pitchforks care…

@dubiago @petergleick As the article says: "The data set, known as “Books3,” was based on a collection of pirated ebooks."

One of my books is in that collection, without my permission. No author whose work is in it gave permission for their book to be used in this way. Which is why there are now multiple lawsuits against OpenAI.

@bodhipaksa @petergleick Pirated according to who? Can the owner of the data set furnish legitimate purchase receipts? Was that question even asked?
@dubiago @petergleick Even if at some point the books were legitimately purchased, by being passed on to a third party they become pirated by definition.
@bodhipaksa @petergleick That’s for the courts to determine.
@dubiago @petergleick Oh, boy. You are such a waste of time. Blocked.