Microsoft removes guide on how to train LLMs on pirated Harry Potter books
The now-deleted Harry Potter data set was "mistakenly" marked public domain.
https://arstechnica.com/tech-policy/2026/02/microsoft-removes-guide-on-how-to-train-llms-on-pirated-harry-potter-books/?utm_brand=arstechnica&utm_social-type=owned&utm_source=mastodon&utm_medium=social
@arstechnica Wake up Honey. "Harry Potter and the Magically Disappearing TERF Codex" just dropped. 🙄
@arstechnica mistakenly … mistakenly. I see.
@arstechnica ...soooo... Microsoft has tripped on its own d!ck again, and Just Kidding Rowling may not be able to continue printing money from her thinly veiled rewrite of the Arthurian legend, but probably not really?