The same is true for wire-service copy and widely syndicated texts: training data may contain dozens or even hundreds of copies of these works, which can lead the model to memorize long passages from them.
This *might* be infringing (we're getting into some gnarly, unprecedented territory here), but again, even if it is, it wouldn't be a big hardship for model makers to post-process their models, comparing their outputs against the training set and deleting any inadvertent memorizations.
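For the curious, here's roughly what that kind of audit could look like: prompt the model with the opening of each training document and check whether greedy decoding reproduces the real continuation near-verbatim. This is a minimal sketch using the Hugging Face transformers library; the model name, window sizes, and match threshold are illustrative assumptions, not anything specified in this thread.

```python
# Minimal memorization-audit sketch (illustrative assumptions throughout).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"          # hypothetical: any causal LM under audit
PREFIX_TOKENS = 50           # prompt length taken from the training doc
CONTINUATION_TOKENS = 50     # how much verbatim overlap we test for
MATCH_THRESHOLD = 0.9        # assumed fraction of exact token matches

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def is_memorized(document: str) -> bool:
    """Prompt the model with the document's opening and check whether
    greedy decoding spits the real continuation back near-verbatim."""
    ids = tokenizer(document, return_tensors="pt").input_ids[0]
    if len(ids) < PREFIX_TOKENS + CONTINUATION_TOKENS:
        return False  # document too short to test this way
    prefix = ids[:PREFIX_TOKENS].unsqueeze(0)
    truth = ids[PREFIX_TOKENS:PREFIX_TOKENS + CONTINUATION_TOKENS]
    with torch.no_grad():
        out = model.generate(
            prefix,
            max_new_tokens=CONTINUATION_TOKENS,
            do_sample=False,  # greedy: the model's most-likely continuation
        )
    generated = out[0][PREFIX_TOKENS:]  # strip the echoed prompt
    n = min(len(generated), len(truth))  # generation may stop early at EOS
    if n == 0:
        return False
    matches = (generated[:n] == truth[:n]).float().mean().item()
    return matches >= MATCH_THRESHOLD

# Flag training documents the model can regurgitate, for follow-up
# suppression (e.g., targeted fine-tuning or "unlearning").
corpus = ["...wire-service article text...", "...syndicated column..."]
flagged = [doc for doc in corpus if is_memorized(doc)]
```

A real audit over a web-scale corpus would need to be far more efficient (deduplication, sampling, suffix-array lookups rather than per-document generation), but the principle is the same: the training set itself tells you exactly where to look.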
13/