Meta's latest legal wheeze is to insist that pirating books is fair use, actually. And it might be working.

To help train its AI models, Meta (and others) have been using pirated versions of copyrighted books, without the consent of authors or publishers. The company behind Facebook and Instagram faces an ongoing class-action lawsuit brought by authors including Richard Kadrey, Sarah Silverman, and Christopher Golden, one in which it has already scored a major (and surprising) victory: The Californian court concluded last year that using pirated books to train its Llama LLM did qualify as fair use.

You’d think this case would be as open-and-shut as it gets, but never underestimate an army of high-priced lawyers. Meta has now come up with the striking defense that uploading pirated books to strangers via BitTorrent qualifies as fair use. It further goes on to claim that this is double good, because it has helped establish the United States’ leading position in the AI field.

Meta further argues that every author involved in the class-action has admitted they are unaware of any Llama LLM output that directly reproduces content from their books. It says if the authors cannot provide evidence of such infringing output or damage to sales, then this lawsuit is not about protecting their books but arguing against the training process itself (which the court has ruled is fair use).

Judge Vince Chhabria now has to decide whether to allow this defense, a decision that will have consequences for not only this but many other AI lawsuits involving things like shadow libraries. The BitTorrent uploading and distribution claims are the last element of this particular lawsuit, which has been rumbling on for three years now, to be settled.

Meta's latest legal wheeze is to insist that pirating books is fair use, actually

Rather than, y'know, outright theft.

PC Gamer
Classic “the end justifies the means” (bad) defense. If ISPs can send letters for torrenting, and Facebook torrented a lot, Facebook deserves a fair punishment.
Not deserves, needs.

truck full of letters backs up to Meta’s headquarters

“there, that’s more appropriate.”

lol it would be hilarious if they could order Facebook disconnected from the Internet like a pleb hit with a copyright complaint
They didn’t say seeding isn’t fair use, just inherently part of torrenting. Good thing Sarah Silverman has pc gamer there to pander for her.
  • Shorter and more reasonable copyright lengths would make this a moot point because then there would be sufficient literature in the public domain to pull from.

  • These kinds of charges are what put the Pirate Bay admins in prison and caused Aaron Swartz to kill himself because of a threat of lifetime in prison. The claim was that they did this either with the goal of profit or with actual profit, and that this was a serious crime. Neither TPB nor Swartz at that point in time had ever moved as much data as Meta has for these claims, nor did they ever have the profit or possibility of profit Meta aims to make from their AI offerings.

  • Now Meta is claiming they’ve profited so hard you can’t possibly hold them accountable.

  • It will be the biggest “fuck you” in history to anyone ever hit with civil charges for piracy in the early 2000s, let alone the TPB admins and Swartz, if they let this go. Which means they probably will because in America, apparently if you crime hard enough and big enough they stop putting you in prison and start patting you on the back and calling it good business sense.

    It’s weird that your takeaway is “Meta needs to get it” and not “Clearly, these laws work for no one”. You don’t get better copyright laws by cheering for the copyright companies.

    Literally the first thing I said was in regards to more sensible copyright making this all a moot point but you do you.

    The only reason Meta needs to get it is because it’s entirely hypocritical to all the dirt poor people who couldn’t afford these kinds of lawyers. It doesn’t make the current legal status right or correct. It’s just a slap in the face to someone like Swartz who died over far less.

    I would rather copyright be amended but sadly that’s less likely to happen here.

    We don’t get better laws if everyone is cheering for the copyright industry. Everything after your first point goes against that. Goliath, the same one that beat up Aaron, finally has a match in his own weight category, and you are hoping he wins.

    What kind of “better law” do you think will come out of this? That regular people like us will be able to share freely?

    You think that the law being applied on poor people but not on the wealthy is a healthy way to get a better law?

    Get the fuck real and nobody is asking for the copyright cabal to win as much as we are saying “look, if this is how the law is going to be applied, apply it evenly, don’t just fuck over poor people but give the wealthy a pass.”

    And poor people who don’t have the weight and money of Meta aren’t going to be able to prove that they need the same amount of data to train an LLM so they probably will still have the law held against them. Get fucking real man.

    What country do you think you live in? One where laws are applied evenly or rationally? Or one where fascists have taken over the god damned government? Because guess what it’s the latter and the laws are effectively meaningless for the wealthy but still held against the poor. Sure, if that’s what you want, go for it, but it damn sure won’t suddenly get us better laws or let regular people torrent without worry. Congress has been deadlocked for decades and does nothing but hurt common people and give corporations a ticket to do whatever and you think better laws will come out of this? Seriously, once again, get fucking real.

    Encouraging laws you don’t like does nothing but cement them. We are currently, as a society, begging lawmakers for harder copyright laws.

    I get the Justice system sucks but making the wrong laws stronger does not make it better.

    Think about what you are saying is all, you tend to write long elaborate speeches on why copyright deserves to win. There is being critical of AI, and then there’s being a mouthpiece. I’m not trying to be mean here, sorry.

    Dude, I have been promoting copyright law being changed and being shortened for 25 fucking years.

    Do you even know who Rufus Pollock is or anything about his research into copyright lengths? Because I was around when that shit was published. I hosted DJ Danger Mouse’s Grey Album on Grey Tuesday as a fuck you to the Beatles copyright holders since the Grey Album should have been considered fair use as it was released for free with no profit at all. I was part of the Kopimi collective.

    Not wanting corporations to get a pass while we all get fucked is not the same thing. You’re not being mean, you’re being obtuse.


    None of that matters if, right now, you are cheering on copyright laws.
    You repeating that I am cheering them on does not make it true. Get some reading comprehension. I repeat, you’re being obtuse.
    Do you hope Meta or the copyright industry wins this case? Maybe I misunderstood.

    There isn’t a good winner in this, both outcomes suck, but one slightly less than the other. If Meta wins, it will not trickle down to regular people’s usage of bittorrent being considered fair use, I can guarantee you that. If the copyright holders win, the outcome still sucks, but at least large corporations will be held to the same standards as regular people instead of having another exception carved out for corporations to be able to do what is considered a crime for regular people.

    There isn’t a movement to change copyright like there used to be. There isn’t a viable North American Pirate Party. Those days are gone, and have been for a long time. I remember the movement and how big it was for a while. We never got mainstream acceptance or appeal and we all started getting old and young people stopped paying attention for the most part.

    Like I said, I’d rather copyright law be changed, but that’s not what will happen here. You don’t get new laws crafted out of court case wins and losses, that’s not how this works, laws are crafted in congress.

    Meta is running all this on the claim that they need this to train their AI, which is all fine and good, but them winning won’t make it so I can make the same claim if I get caught pirating. Why? Because the copyright lawyers will argue reasonably that I didn’t pirate enough data to build an AI and so I can’t be held to the same standard as Meta, who absolutely needed thousands of terabytes of data to train theirs. The scales are totally different and the scale of their operation is part of their argument, that because of the scale of their AI, that there’s no way they could conceivably train it without going broke paying copyright holders. If I am caught pirating a 1/10000th of the same data as they are, the copyright holders will claim, very easily, that I cannot possibly be building the same kind of AI that Meta is building because I would need way more data for that, and that I must be held to account because I must not be actually using it for AI. People like you and me can’t afford a team of high profile lawyers to argue our cases, and so we will lose, precedent simply won’t apply to us.

    Meta winning will just make it so there’s another avenue for corporations to do whatever the fuck they want while people like you and me still have to follow draconian absurd copyright laws. Laws are made in congress, and copyright length can only be changed by bills in congress becoming law. The outcome of this court case is bad either way but it is marginally less bad for people like us to at least have corporations held to the same standard we are.

    EDIT:

    Final note, even if copyright law does get changed in congress, it will be because groups like OpenAI and Meta will lobby the government to change it, and they will not lobby for regular people to get the same rights because they don’t want regular people building their own AIs. Like I said, both outcomes here suck ass, but these giant corporations are not and never will be fighting for people like you and me to have reasonable fair use laws. They will lobby for them to be able to do it, once again, based on their sheer scale, so nobody else can compete or make truly open products in their own home. They want ownership over the process, they won’t send lobbyists in to help regular people, they send lobbyists in to help themselves.

    This applies doubly so to Meta, if you know anything about Zuckerberg or the company, you’d know out of everyone he is ruthless and will do absolutely anything to crush nascent competition.

    This is why the left can’t get anywhere. You get one guy yelling “just follow the rules!” but he can’t be heard because you have another guy screaming “smash the state!”

    That’s my observation for the day.

    I read this as setting precedent that others couldn’t. Court cases like this are one way to make it possible for everyone to break an absurd law.
    Precedent only applies equally if we are able to prove the same in court as Meta did. Are you going to need petabytes of pirated data to train your AI? Can you afford a team of top quality lawyers to fight your case and prove you were training a small locally-hosted AI at home? Do you think Meta, of all companies, really is fighting for you to be able to do the same as them? You will still get taken to court, you will still have to fight your case, “precedent” isn’t an automatic get out of jail free card. Do you have the money to fight massive copyright holders with endless money? Of course you don’t, none of us do.

    And unlike Meta, you will be thrown in prison like Jeremiah Perkins.

    Even if found completely guilty, the worst that will happen is Meta has to pay a fine: which means nothing because any fine is rolled into the cost of doing business. Meta knows it is stupid to not break the law.

    That’s also precedent, and a template for using on institutions to break copyright. Still seems like good news to me.
    Precedent means we can cite it, so yes, this helps a bit. The rest you wrote is a fair bit of assumption or unnecessary: evidence to back your points would help. Otherwise, it just looks like inconclusive defeatism.

    Precedent is, in effect, new law and it absolutely does change who gets taken to court and the costs of defending your case. So, depending on which arguments the court accepts, I won’t need a fancy lawyer. And it won’t require nearly the risk, creativity, or time that it requires of Meta’s legal reps today. Look at civil rights or environmental protections case law; big profile early cases were horrifically costly, and now compliance by companies is largely by default.

    Horrible people and companies can set good precedent, often without intending to. For example, plenty of criminals set and clarified due process law. So we absolutely could all benefit from Meta’s bad intentions.

    We benefit from institutions that will be training their own AI, hosting data publicly, and have the resources to mirror a precedent. Care to cite sources that the arguments being accepted are going to carve out Mark Zuckerberg by name as the one person who can ignore copyright? I haven’t read the filings, but this should be easy.

    in America, apparently if you crime hard enough and big enough they stop putting you in prison and start patting you on the back and calling it good business sense.

    There’s a story about Alexander the Great capturing a pirate and scolding him for raiding villages along the coastline. Alexander asked if the pirate felt ashamed and wanted to beg for forgiveness. However, the pirate had something else to say. He said that Alexander was doing the same thing, but infinitely worse. The only difference was that Alexander called himself king and plundered entire lands while the pirate only raided small villages. The pirate reminded Alexander of the many lives he had destroyed in his conquest. So the pirate’s only crime was not to be the biggest baddie in the hood, so to speak.

    Alexander replied by stating that the title of king forces his hand and that he couldn’t just stop what he was doing. The pirate on the other hand was just an individual who could easily change course. And so Alexander set the pirate free, stating that he himself will start changing his own ways right there and then if the pirate makes a fresh start first.

    I don’t know if there is any truth to this but it’s a fable often used to explain how legitimacy changes the perception people have of wrong doing and heroism on a fundamental level. Alexander’s reply sounds like an excuse and I think that’s on purpose. The pirate outwitted him in the end by stating a basic truth.

    www.youtube.com/watch?v=UQBWGo7pef8

    This is where I first remember hearing this tale, in this old Schoolhouse Rock parody that was in protest of the War in Iraq.

    Pirates and Emperors - Schoolhouse Rock Parody

    YouTube

    in America, apparently if you crime hard enough and big enough they stop putting you in prison and start patting you on the back and calling it good business sense.

    If you owe the bank $100 you have a problem. If you owe the bank $100,000,000, the bank has a problem.

    If heaven and hell are real I hope God and Satan give Swartz a sabbatical so he can go torture Zuck for a while, periodically.
    I like the implication that both have to sign off on it

    If the ruling goes the wrong way, like with many cases like this (drug use is a good example), it won’t help those in the past. However, it will open a door for everyone in the future.

    My guess is every DMCA entity on the planet has already sent this judge a letter saying that allowing this defense is a terrible idea. I am honestly torn on this one since there are so many unknowns, and if Meta loses it will mostly be publishers that benefit vs authors.

    These kind of charges are what put the Pirate Bay admins in prison and caused Aaron Swartz to kill himself because of a threat of lifetime in prison.

    Um, he did not kill himself

    We’re going to end up in a situation where whatever is necessary to train AI is permitted, and the main question is whether that will be through (re)interpretation of existing law or the passage of a new law.
    Good thing I have a local model running that’s constantly learning, for precisely this reason
    I’m still collecting media before I can start the training process.
    If anything, this is proof you should be next in line for a large venture capital infusion!

    Arguing that training models isn’t fair use is going to be a massive uphill battle, it’s basically reading the book but with a computer. It’s not actually a big deal to people, unless you hold the copyright to a ton of works and want to get a percentage of all the AI income these companies have made.

    Torrenting the books is likely absolutely copyright infringement, but that has relatively low payout compared to the money these companies are getting for their models. The training being fair use means that rights holders can’t try to take any money from the model’s use. The statutory limits for infringement even at per work levels aren’t significant compared to the legal cost of proving it happened.

    Anthropic pirating books for their training corpus resulted in the biggest copyright settlement in history, well over a billion. That is still being quibbled over I believe, but they settled because they were likely to pay out more if the case went forward. So I’m not really sure where you’re coming from that infringement via torrenting does not result in monstrously large liability.
    The judge in that case ruled the training wasn’t fair use for pirated books, which left them on the hook for potentially all revenue (likely a court determined percentage) that the model generated for them in addition to statutory damages. That is well north of 1.5 billion.
    Which is kind of a pity. Anyone who’s ever written something on the net should be getting royalty checks from these fucks. I’m not exactly famous but I’ve written prolifically in my field of work and have gotten nearly word-for-word reproductions of my articles out of every big model I’ve tested since GPT-3.

    Just noticed your reply and want to correct this. Anthropic settled, the 1.5bil was not a judgment against them. Specifically, this covered the literal pirating of the training corpus. It had absolutely nothing to do with the way training handled the training data: they literally torrented an enormous portion of their training corpus.

    Anthropic DID try to argue that because they used the pirated material for training a model, it was fair use. The judge correctly decided that doesn’t make any fucking sense. Again, this is not about the models encoding data, it is literally just about the fact that these silly fucks torrented vast portions of their training corpus like college students building a porn library on college broadband.

    Is it fair use if I do it?
    How rich are you?
    I’m quite poor. I’m thankful every day that my mom and dad still let me live with them.
    I wouldn’t recommend it then 😞
    just claim that you are training an AI for a new startup you are working on, and will soon be looking for VC to fund the project further. be sure to use terms like “revolutionary” and “democratize” liberally.
    sure. thanks meta, anna’s archive will help me with my reading list, thanks.
    We can train our NI (Natural Intelligence) models.
    To demand shrubberies?

    anna’s archive

    I wish. As someone astutely put it in another conversation, now that the tech companies have pilfered Anna’s Archive, the big publishers are going to try to get it shut down.

    Major Publishers Sue Anna's Archive Over 'Staggering' Copyright Infringement, Seek Injunction * TorrentFreak

    Publishers, including Penguin Random House, Elsevier, and HarperCollins, filed a new lawsuit against the shadow library Anna’s Archive.

    we back it up and do it all over again, rinse repeat.
    Yup, that’s what I’m doing with all those audiobooks I torrented. Helping the US maintain the lead in AI 😂
    Unironically may become a legitimate defense. Although in that case, indiscriminately bombing gas stations in your town and extorting their owners should also be allowed but…
    So I can use pirated media to train my AI (Actual Intelligence), right?
    Should make all journal publications fair use.
    Unfortunately you do have to prove you’re intelligent

    As long as you’re rich enough to hire your own army of lawyers, probably.

    That said, it seems like when you’re rich enough to hire your own army of lawyers you can pretty much do whatever you want.

    Well, that doesn’t sound civil or lawful at all and more like kingdoms of the dark ages degree of “rules” where it doesn’t apply to a chosen few.

    If Meta and other bigcorps that support the US government get the special “avoid-judgment” card and you face punishment then there’s no law, only bigotry.

    That would encourage individuals and small groups to keep their activities a secret (go anonymous) and break the law whenever they can, because the “king and his followers” don’t follow their own “rules”.

    The US is not only getting dystopian, they’re committing primitive mistakes.

    If only US were going for a win in that AI
    Yes, in fact there’s no framework or legal precedent right now so everyone already is doing it. You can just scrape the web etc and disregard IP ownership because training AI is transformative work - as it should be.
    Looking forward to Jellyfin getting an LLM to train locally on movie preferences so everyone’s library is fair use. Wait, is this why LLMs are being shoehorned into everything?