Mastodawn

Yesterday Cory Doctorow argued that refusal to use LLMs was mere "neoliberal purity culture". I think his argument is a strawman, doesn't align with his own actions and delegitimizes important political actions we need to make in order to build a better cyberphysical world.

EDIT: Discussions under this are fine, but I do not want this to turn into an ad hominem attack on Cory. Be fucking respectful

https://tante.cc/2026/02/20/acting-ethical-in-an-imperfect-world/

Acting ethically in an imperfect world

Life is complicated. Regardless of what your beliefs or politics or ethics are, the way that we set up our society and economy will often force you to act against them: You might not want to fly somewhere but your employer will not accept another mode of transportation, you want to eat vegan but are […]

Smashing Frames

I really like and admire @pluralistic and have utmost respect for him, and that's why I'm totally baffled about why he is claiming "fruit of the poisoned tree" arguments as cause of LLM scepticism.

The objections to LLMs aren't about origins but about what they are doing right now: destroying the planet, stealing labour, giving power over knowledge to LLM owners etc.

The objections are nothing to do with LLMs' origins, they're entirely about LLMs' effects in the here and now.

Ian Betteridge Feb 20

@FediThing @tante @pluralistic Some people - in fact quite a lot, if my reading is correct - do indeed argue that LLMs can *never* be ethically used because they are “trained on stolen work”.

Cory Doctorow Feb 20

@ianbetteridge @FediThing @tante

Performing mathematical analysis on large corpora of published work is not "stealing."

James Gleick Feb 20

@pluralistic @ianbetteridge @FediThing @tante “Mathematical analysis” is doing a lot of work here. It could mean gathering meaningless statistics. Or it could mean capturing the qualities (deviations from the average) that make a particular work of art (or author) special, creative, surprising—for use in simulacra.

I think that's harmful, to the culture as a whole, if not to the artworks and artists getting regurgitated.

@gleick @ianbetteridge @FediThing @tante

Let's stipulate to that (I don't agree, as it happens, but that's OK). It's still not a copyright infringement to enumerate and analyze the elements of a copyrighted work.

For the record, I think AI art is bad and neither consume nor make it.

James Gleick Feb 20

@pluralistic @ianbetteridge @FediThing @tante I'm not claiming that's copyright infringement. Even if one respects the general framework of copyright, which I know you don’t, it seems hopeless to apply it to this AI mess.

But there is a kind of theft here. Not that it's actionable or measurable. But it’s nontrivial. It's related to questions of impersonation. It's an assault on individuality. Whatever your reasons for thinking AI art is bad (I have some sense), it's related to that, too.

James Gleick Feb 20

@pluralistic @ianbetteridge @FediThing @tante Some authors have taken the view that they deserve some compensation for the use of their books in training the LLMs. Do the transaction and we're hunky-dory. That's not my view. I don't care about compensation; I just don't want my prose regurgitated in the LLMs, for reasons I'm not yet able to express properly. I feel I should have been asked, and I feel violated.

Dave Rahardja Feb 20

@gleick @pluralistic @ianbetteridge @FediThing @tante I think the sense of “theft” that creators feel is directly caused by the fact that the AI industry (as it stands today) is a Ponzi scheme which is fundamentally built on remixing creators’ works and devaluing human labor. I have a feeling that most creators will not feel the same kind of outrage if an educational institution created the same technology for academic use, e.g. to generate insights into online culture and psychology.

In short, the GRIFT (i.e. the particular application of the technology) is the source of the feeling of theft, not the technology itself. I think the tech itself has value when used ethically.

FWIW I agree with Cory here that copyright is the *wrong* framework to use for criticizing AI, because for every case where copyright helps the individual creator, there are hundreds of cases where it helps incumbent megacorporations more.

https://www.humancode.us/2024/05/15/copyright-ai.html

Copyright will not save us from AI

humancode.us

Martijn Vos Feb 20

@drahardja @pluralistic @tante @FediThing @gleick @ianbetteridge

I think there's a couple of aspects to the "theft":
* the theft of material: they're trained on copyrighted material
* the theft of jobs: AI is being used to replace artists/writers/coders; it's the same thing that upset the Luddites
* the theft of style: not only does AI "learn" from the works of others, it can emulate it. On demand. Some artists have very unique, personal styles that are suddenly not their own anymore.

Ian Betteridge Feb 20

@mcv @pluralistic @drahardja @tante @FediThing @gleick I mean, *I* was trained on copyrighted material. So were you. So is everyone. I even regurgitate phrases I’ve read, usually unknowingly.

Martijn Vos Feb 20

@ianbetteridge @pluralistic @drahardja @FediThing @gleick @tante

That is true, and probably the strongest argument to defend LLMs. But LLMs have more explicitly encoded the material they're trained on, and are better able to reproduce it than we are. Still, I think copyright is by far the weakest of the three types of theft. And I think the duplication of specific, personal styles is probably the most personal and invasive. In a way it's kind of the same thing, and yet it feels very different to me.

The theft of jobs causes the most damage, but is also kind of unavoidable with many technological advances.

Dave Rahardja Feb 20

@mcv @pluralistic @tante @FediThing @gleick @ianbetteridge Again, the concept of a technology that is “trained” by analyzing copyrighted material is not inherently bad. It’s the way that it is *developed and used* that could be morally questionable.

I bet most creators don’t mind people using AI technology to analyze and remix their works for academic or historical research, or even for search. What they mind is their use to power a Ponzi scheme that destroys human worth.

Ian Betteridge Feb 21

@mcv @pluralistic @drahardja @tante @FediThing @gleick The problem with the "theft of style" argument – and I understand it – is that if you apply the same rules and the same standards to humans then a lot of the so-called creative industries – and individual creators – would be sweating their way through court cases.

Having been threatened by Piet Mondrian's estate over a magazine cover which looked a bit Mondrian-esque, I know how style can be protected, too :) And again, whether someone creates the offending work by AI or Photoshop should make no difference.

(They were actually very nice about it, and it was more a "please don't do that again" than "your court date is next week", but still not the most fun letter I've read.)

Todd Knarr Feb 20

@gleick @pluralistic @ianbetteridge @FediThing @tante It's the regurgitation part. People read your work all the time, and are inspired by it to create their own works. I'm sure you're fine with that. But when they read your work and proceed to regurgitate chunks of it and claim it as their own? Because that's what the LLMs are doing all too often, and the reasons to object to LLMs doing it are the same as the reasons to object to people doing it.

Clayton Slaughter Feb 26

I agree with you, I feel the same way and I have not been published.

I’m really enjoying The Three Ages of Water. I slowed down my read to really enjoy your writing and the amazing breadth of knowledge in it.

James Gleick Feb 26

@schmubba Thank you, and I liked it, too, but I didn't write it. That was @petergleick.

Clayton Slaughter Mar 1

@gleick @petergleick
Boy do I feel like an idiot 🙃
Thanks for writing a great book Peter. I enjoy your posts also James.

Alaric Snell-Pym Feb 20

@pluralistic @gleick @ianbetteridge @FediThing @tante there have been documented cases of LLMs regurgitating stuff from their training set verbatim, which clearly IS copyright infringement; and that means some parts of the training set are encoded in the weights of the model, which looks like publishing a copyrighted work to me. If publishing a JPEG of an image without holding the copyright to it would be infringing, isn't publishing a model that can recreate something also infringing?

Alaric Snell-Pym Feb 20

@gleick @ianbetteridge @FediThing @tante

BUT I'm also still a fan of @pluralistic in general, although I disagree with him on some points (such as this); we have more in common than divides us, and I see too many people totally reject somebody over one thing. Sure, if that one thing is nazism, sexism, selfishness, etc - they can go straight in the bin. Something I hold a hope of arguing them around on, however, isn't cause for cancelling :-)