@tante

None of these are true if you run your own LLMs on your own hardware, using FLOSS models.

But the #MastodonHOA has deemed all AI to be abhorrent as a blanket decision.

And frankly, if you exist in a capitalist society, and you're not an owner, there is 100% chance you are exploited. The capitalist system requires it.

@crankylinuxuser FLOSS models (which are only freeware) tick most of those boxes. Trained on stolen data, massaged by people in global majority countries, trained in environmentally harmful data centers, outsourcing skills to the freeware product a company dumped on me, using a tool that is imbued with and trained on how big tech wants to see the world, and the effort could have gone to something meaningful. So yeah, nope.

@tante

"Trained on stolen data". It's at best a copyright violation. And I view things like Anna's Archive and Libgen as internationally renowned Public Libraries.

"Massaged by people in global majority countries" - yes, people work in capitalism. And guess what... You're exploited.

"Trained in environmentally harmful data centers". This assumes that training is always needed, and it's not. You can train once and run X times. Again, you're stretching to make local LLMs look horrible.

And really, the rest of these are poor excuses. I won't use poop smear (Anthropic), or OpenAI, or other SaaS token companies. I run locally, and my setup does not have those things you claim.

Except for the copyright issue. But again, I don't have that much respect for current US copyright.

@crankylinuxuser @tante

It's at best a copyright violation

This may be true for published and public data... but that's not the only data that goes into these things. Any data that comes from breaches, users' private cameras, and anything else stored with an expectation of privacy is much worse than a copyright violation.

@Epic_Null @crankylinuxuser @tante

Data wants to be free. This argument simply doesn't work for those of us who have always been open data, anti-copyright.

@komali_2 For some reason, nobody ever brings up the other part of the quote:

On the one hand you have—the point you’re making Woz—is that information sort of wants to be expensive because it is so valuable—the right information in the right place just changes your life. On the other hand, information almost wants to be free because the costs of getting it out is getting lower and lower all of the time. So you have these two things fighting against each other.

  • Stewart Brand, at the first Hackers Conference in 1984

@clayote "value" is an overinflated term. Instrumental value? Allocation value? Preference? Attachment?

On that note arguments around capitalistic value probably aren't interesting to anarchists. By all means, debate the number of dollarydoos that should be exchanged for, lol, bits on a disk

@komali_2 If you're not interested in the monetary value of data, then you're not interested in what Stewart Brand meant when he said it wants to be free, but, well, death of the author and all that

Do you value privacy at all? If so, then you might want to find some solidarity with people who've ended up sharing more than they intended to on the open web, such as a lot of the queers and sex workers on Tumblr, who objected to ArchiveTeam scraping their blogs. Inasmuch as inanimate data can be said to "want" things, the fact that their writing is now available to any interested fascist in power in the USA is what that writing wants, but it's not what the authors want.

If your anarchism has more loyalty to the rights of data than those of the people who produced it, it's shit.

@clayote

I sympathize with the pain of anyone facing violence at the hands of fascists, and my commitment to put my body in the way of that violence remains the same, and remains proven. I've done it before, and I'll keep doing it.

That said, this is why we've been telling people for the last 20 years that it's not a good idea to put PII next to the kinds of things christofascists target. We've been warning about *exactly this outcome*.

@clayote privacy delivered by law or contract is just privacy for corporations, not for people.

True privacy can only exist through encryption.

Encrypt things you want private, everything else should be free data available to anyone, including frontier AI companies who I am very much looking forward to the catastrophic collapse of.
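To make "encrypt things you want private" concrete, here is a minimal sketch (my own illustration, not anyone's recommended setup) of a one-time pad built from nothing but Python's standard library. A real tool would use an audited library such as libsodium or age; the point of the toy version is just that privacy by encryption rests on math, not on law or contract.

```python
import secrets

def encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """One-time pad: XOR the message with a random key of equal length.
    Information-theoretically private if the key is truly random,
    used exactly once, and never shared."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key

def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    # XOR with the same key undoes the encryption.
    return bytes(c ^ k for c, k in zip(ciphertext, key))

msg = b"things you want private"
ct, key = encrypt(msg)
assert decrypt(ct, key) == msg  # round-trips back to the plaintext
```

Without `key`, the ciphertext is indistinguishable from random bytes, so the "everything else is free data" line can be drawn by the person doing the encrypting, not by a regulator.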

@komali_2 Sure. Fine. It would be better if they hadn't put that info on the public internet in the first place. They did so, either out of naïvete, or because the blogging tools available to them didn't offer the grain of privacy control they wanted, and they made the pragmatic decision to risk exposure to the wrong people, in order to be read by the right people.

Now that their data is being abused by Anthropic, they're trying to do something to limit the harm, and are using the tools available to them, which are not necessarily the tools that they want. You, as an anarchist, should support them in that effort, and that means supporting them in getting their copyright enforced -- whether or not you think copyright should exist, in the abstract.

@komali_2 It's not that different from being against violence generally, but supporting Kurdish fighters in Rojava
@clayote there's a thread there worth exploring that I need to think about

@clayote I support them attacking corpos however they please, but I argue that using copyright to do so will be at best ineffective, at worst long term harmful.

Copyright is a tool belonging to the Capital class. Its "protections" of normal people are a part of the mythology of it so that we tolerate something absurd, the idea that only certain people (read: companies) are allowed to do things with information, stories, characters.

It won't work because corpos can just ignore it, worst case.

@clayote trying to stop a corporation from using your data to train an LLM using copyright law would be like a minor version (very, very minor) of a slave suing a slaveholder. It probably won't work, but it's also an absurdity, because the entire system exists to support the slaveholder, and will happily engage in hypocrisy and paradox to do so.

@komali_2 There are lots of ways to use the law other than suing your enemy under that same law. For instance, the Writer's Guild of America got it written into their standard contracts that the studios can't train AI on the scripts produced under those contracts.

Case law that results from litigating those contracts is still copyright law.

@clayote I'm happy for them, but

1. LLMs are still being trained on those scripts because there's no way to catch them doing so, and if they get caught they'll get away with it by arguing it was an oopsie accident and paying .00001% of their profit as a fine
2. Their rights will continue being degraded in a constant battle against exploitation
3. Corpos will still foolishly try to replace them with LLMs regardless

My point is that LLMs are a symptom of a far greater problem

@clayote none of that changes the fact that people should be allowed to create art around the characters and stories that comprise our cultural mythology, something copyright law prevents. Or should be allowed to do whatever we want with the software on our computers, something copyright law prevents.

@komali_2 That's an agreeable point

You replied to a post that says:

Any data that comes from breaches, users' private cameras, and anything else stored with an expectation of privacy is much worse than a copyright violation.

And your reply said:

Data wants to be free. This argument simply doesn't work for those of us who have always been open data, anti-copyright.

I think practically everyone who reads your reply, including people like me who turn out to agree with you, will get the impression that you're uninterested in mitigating present harms to actual people's privacy.

I'm unhappy that the State has captured all the bodies that regulate privacy. I'd like them to function independently, and any data that comes from breaches, users' private cameras, and anything else stored with an expectation of privacy is much worse than a copyright violation.

@clayote I understand, you're right, my response makes it seem that way, when my intention was to respond more generally to data trained in a copyright violating way (books, lectures, speeches). I should have been more clear about that earlier, thanks for pointing that out. No wonder people are responding strongly!