zooming out a little bit, it does feel alarming to me that a lot of people whose stated politics are progressive or socialist or both are willing to give huge tech companies an easy ride for fully seizing the means of production for everyone, no matter where you personally work

@jcoglan what's the alternative?

It turns out LLMs are pretty easy to build now that we know how to do it. 5TB of data (not difficult to obtain) and a few million dollars in compute electricity turns out to do the job.

@simon I'm hearing "what's the alternative" a lot recently and if I took that attitude to very many things I would have to stop believing in anything
@jcoglan my chosen alternative is to try and teach people how to use these things productively and responsibly in a way that adds more value than it takes away

@simon @jcoglan I’m already getting stopped at “5TB data is easy to obtain” (without consent). There is no “responsibly” for me after that.

But even if it were, there are so many more things wrong with all this that I have a hard time understanding how anyone uses them at all outside of their manager telling them to because investments were made.

But that’s me, I’ve also never ridden an Uber. I must be holding things wrong.

@janl @simon @jcoglan serious question: what consent is required to scan every digitized work of art that is in public domain or to read the data from CommonCrawl?
@raphael @simon @jcoglan you are carving out an exception that is not relevant to my argument. It is extremely well documented that most popular LLMs have been trained on otherwise copyrighted materials and reproduce those in ways that are likely not covered by fair use (but I don’t have much hope for a legal argument, so moral it remains).

@janl @simon @jcoglan

But then your argument is not against LLMs in general, just this bad crop given by Big Tech.

@raphael @simon @jcoglan I struggle hard to separate the tool from the maker here. I think doing so is disingenuous even in the best light.

@janl @simon

I don't think the problem is in the tool itself. The troubling part to me is what @jcoglan mentioned.

If all the "anti-AI" crowd focused their criticism and opposition on the corporations that are trying to monopolize and seek rent out of the whole world's information, it would be easier (I think) to gather more people on their side.

@raphael @janl @simon @jcoglan "your problem is not with the abstract concept of this tool, just every implementation of it that actually exists in practice" is not a particularly persuasive argument IMO.

@benjamineskola @janl @simon @jcoglan

If your problem is with the abstract concept of the tool, then no possible implementation will ever be morally acceptable, no matter the upside.

If your problem is "only" with the existing implementations, then it follows that there exists a theoretical implementation which can be accepted and bring the upside without the downsides.

@raphael @janl @simon @jcoglan not sure there’s much point in spending time thinking about hypothetical future implementations tbh. Until such a thing exists then we have to consider that all of the existing ones have these issues; even if it’s potentially possible to solve the issues nobody has actually done so and nobody seems about to do so (as far as I’m aware).

(And even if such an ‘ethical’ LLM did exist, I think it would be valid to remain concerned about the huge market share of unethical ones.)

@benjamineskola @janl @simon @jcoglan

1) No one is saying "we should not be concerned about the dominance of the unethical companies". We should, we are.

2) If we don't clearly establish the line between what is acceptable and what is not acceptable, good actors will never be involved and bad actors will never have competition from good actors, and will be validated in doing the bad things they do.

@benjamineskola @simon @jcoglan

3) If we treat all and any usage of LLMs as equally morally unacceptable, we end up with "abstinence-only is the only acceptable sex-ed policy".

@raphael @simon @jcoglan This is a silly analogy because right now abstinence really is the only acceptable option, because as I think we've established the only existing options are unethical ones.

(edit: or if not the *only*, the only ones available to the vast majority of users)

@benjamineskola @simon @jcoglan

- You don't see many parents encouraging their teenage kids to have sex, but they still should have "the talk" with them anyway.

- Shooting heroin is still damaging to people. Giving needles and heroin to addicts is still "unethical". Yet, we still have "damage reduction" policies where city clinics give clean needles and drugs to addicts.

- Even Richard Stallman argued that it was okay to use closed software to develop FOSS alternatives (e.g., C compilers).

@benjamineskola @simon @jcoglan

It used to be that one of the main arguments against all and any blockchain technology was "It is burning the planet".

It took 7 years for the Ethereum team to finally develop a system that was robust enough and could get them out of Proof-of-Work. Nowadays, the Ethereum network secures billions of dollars worth of assets while using less electricity than all the videogame consoles *at idle*, yet we still have people using the same tired (and wrong) argument.