https://smallsheds.garden/blog/2026/on-the-acceptance-of-genai/
None of these are true if you run your own LLMs on your own hardware, using FLOSS models.
But the #MastodonHOA has deemed all AI to be abhorrent as a blanket decision.
And frankly, if you exist in a capitalist society and you're not an owner, there is a 100% chance you are exploited. The capitalist system requires it.
"Trained on stolen data". It's at best a copyright violation. And I view things like Anna's Archive and Libgen to be internationally renowned Public Libraries.
"Massaged by people in global majority countries" - yes, people work in capitalism. And guess what... You're exploited.
"Trained in environmentally harmful data centers". This assumes that training is always needed, and it's not. You can train once and run X times. Again, you're stretching to make local LLMs look horrible.
And really, the rest of these are poor excuses. I won't use poop smear (Anthropic), OpenAI, or the other SaaS token companies. I run local, and my setup does not have those things you claim.
Except for the copyright issue. But again, I don't have that much respect for current US copyright.
"It's at best a copyright violation"
This may be true for published and public data... but that's not the only data that goes into these things. Any data that comes from breaches, users' private cameras, and anything else stored with an expectation of privacy is much worse than a copyright violation.
And yes, that is a big issue with the SaaS token vendors. Claude, OpenAI, MS, and the rest do use whatever user data they can get. I am not arguing their horrific behavior.
I'm talking about locally running Qwen, or Deepseek, or other FLOSS models.
That local LLM running on my machine only sees and uses data I provide. And a control-c in the relevant console window kills the LLM.
What folks do not realize is this is #Leibniz's ultimate dream, of being able to do #calculus with words, sentences, and more. He tried it with single word-vectors, but even that had to wait for Word2Vec in 2013.
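The "calculus with words" idea can be sketched in a few lines. This is a toy illustration only: the vectors below are hand-picked, not learned embeddings, and real Word2Vec vectors are trained from corpora in hundreds of dimensions. But the arithmetic (king - man + woman lands near queen) works the same way.

```python
import math

# Toy 3-d "word vectors", hand-picked for illustration.
# Real Word2Vec embeddings are learned from text, not written by hand.
vecs = {
    "king":   [0.9, 0.8, 0.1],
    "queen":  [0.9, 0.1, 0.8],
    "man":    [0.1, 0.9, 0.1],
    "woman":  [0.1, 0.2, 0.8],
    "person": [0.1, 0.5, 0.5],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# "Calculus with words": compute king - man + woman component-wise...
target = [k - m + w for k, m, w in zip(vecs["king"], vecs["man"], vecs["woman"])]

# ...then find the nearest remaining word by cosine similarity.
candidates = [w for w in vecs if w not in ("king", "man", "woman")]
best = max(candidates, key=lambda w: cosine(target, vecs[w]))
print(best)  # -> queen (with these toy vectors)
```

With learned embeddings the same nearest-neighbor lookup is what produces the famous king/queen analogy results.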
@Epic_Null @crankylinuxuser @tante "local" models are just as reliant on illegal data acquisition, because they depend on the larger mainstream models to reach any level of tolerable performance. Whether it's through training, fine-tuning, distillation, or another method, that dependency means anything that goes into the development of the nonlocal model is also a requirement for the development of the local versions.
Deepseek and Qwen are no exception.