Zooming out a little bit, it does feel alarming to me that a lot of people whose stated politics are progressive or socialist or both are willing to give huge tech companies an easy ride for fully seizing the means of production from everyone, no matter where you personally work

@jcoglan what's the alternative?

It turns out LLMs are pretty easy to build now that we know how to do it. 5TB of data (not difficult to obtain) and a few million dollars in compute and electricity turn out to do the job.
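The "5TB of data and a few million dollars" claim can be sanity-checked with a rough back-of-envelope using the common 6·N·D approximation for transformer training FLOPs. Every number below (model size, token count, GPU throughput, hourly price) is an illustrative assumption, not a figure from this thread:

```python
# Back-of-envelope LLM training cost sketch.
# All constants are assumed, illustrative figures.

params = 70e9                 # assumed model size: 70B parameters
tokens = 2e12                 # ~5 TB of raw text is on the order of 1-2 trillion tokens
flops = 6 * params * tokens   # standard 6*N*D estimate of training FLOPs

gpu_flops_per_sec = 1.5e14    # assumed effective throughput per GPU (peak * ~40% utilization)
gpu_hours = flops / gpu_flops_per_sec / 3600

price_per_gpu_hour = 2.0      # assumed cloud rate in USD
cost_usd = gpu_hours * price_per_gpu_hour

print(f"{gpu_hours:,.0f} GPU-hours, ~${cost_usd / 1e6:.1f}M")
```

Under these assumptions the sketch lands on roughly 1.5 million GPU-hours and a cost in the low single-digit millions of dollars, which is consistent with the claim above; halving the assumed model size or token count scales the cost linearly.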

@simon I'm hearing "what's the alternative" a lot recently and if I took that attitude to very many things I would have to stop believing in anything
@jcoglan my chosen alternative is to try and teach people how to use these things productively and responsibly in a way that adds more value than it takes away

@simon @jcoglan I’m already getting stopped at “5TB data is easy to obtain” (without consent). There is no “responsibly” for me after that.

But even if it were, there are so many more things wrong with all this that I have a hard time understanding how anyone uses them at all, other than because their manager tells them to because investments were made.

But that’s me, I’ve also never ridden an Uber. I must be holding things wrong.

@janl @simon @jcoglan serious question: what consent is required to scan every digitized work of art that is in the public domain, or to read the data from CommonCrawl?
@raphael @simon @jcoglan you are carving out an exception that is not relevant to my argument. It is extremely well documented that most popular LLMs have been trained on otherwise copyrighted materials and reproduce those in ways that are likely not covered by fair use (but I don’t have much hope for a legal argument, so moral it remains).

@janl @simon @jcoglan

But then your argument is not against LLMs in general, just against this bad crop produced by Big Tech.

@raphael @simon @jcoglan I struggle hard to separate the tool from the maker here. I think doing so is disingenuous even in the best light.

@janl @simon

I don't think the problem is in the tool itself. The troubling part to me is what @jcoglan mentioned.

If all the "anti-AI" crowd focused their criticism and opposition on the corporations that are trying to monopolize and seek rent out of the whole world's information, it would be easier (I think) to gather more people on their side.

@raphael @janl @simon @jcoglan "your problem is not with the abstract concept of this tool, just every implementation of it that actually exists in practice" is not a particularly persuasive argument IMO.

@benjamineskola @janl @simon @jcoglan

If your problem is with the abstract concept of the tool, then no possible implementation will ever be morally acceptable, no matter the upside.

If your problem is "only" with the existing implementations, then it follows that there exists a theoretical implementation which can be accepted and bring the upside without the downsides.

@raphael @janl @simon @jcoglan not sure there’s much point in spending time thinking about hypothetical future implementations tbh. Until such a thing exists then we have to consider that all of the existing ones have these issues; even if it’s potentially possible to solve the issues nobody has actually done so and nobody seems about to do so (as far as I’m aware).

(And even if such an ‘ethical’ LLM did exist, I think it would be valid to remain concerned about the huge market share of unethical ones.)

@benjamineskola @janl @simon @jcoglan

1) No one is saying "we should not be concerned about the dominance of the unethical companies". We should, we are.

2) If we don't clearly establish the line between what is acceptable and what is not, good actors will never be involved, bad actors will never have competition from good actors, and they will be validated in doing the bad things they do.

@benjamineskola @simon @jcoglan

3) If we treat all and any usage of LLMs as equally morally unacceptable, we end up with "abstinence-only is the only acceptable sex-ed policy".

@raphael @simon @jcoglan This is a silly analogy because right now abstinence really is the only acceptable option, because as I think we've established the only existing options are unethical ones.

(edit: or if not the *only*, the only ones available to the vast majority of users)

@benjamineskola @simon @jcoglan

- You don't see many parents encouraging their teenage kids to have sex, but they still should have "the talk" with them anyway.

- Shooting heroin is still damaging to people. Giving needles and heroin to addicts is still "unethical". Yet we still have harm-reduction policies where city clinics give clean needles and drugs to addicts.

- Even Richard Stallman argued it was okay to use closed software to develop FOSS alternatives (e.g., C compilers).

@raphael None of these analogies are remotely relevant.

@benjamineskola

Can you stipulate what would pass for an "ethical" LLM?

@raphael No. That was discussed already.

@benjamineskola

This is just obscurantism. It's opposition for the sake of opposition, which makes it super easy to be ignored by bad actors.

@raphael No it's not. If you got this far in the thread without bothering to figure out what the people you were arguing with actually meant, that's on you.

@benjamineskola

Help me here, then:

- Complaints about LLMs being trained on copyrighted data. Yes, a legitimate complaint! Doesn't this mean that an LLM trained on public data addresses this complaint? And if this is your single major issue with LLMs, would you be okay using a copyright-free LLM?

@benjamineskola

- Complaints about LLMs being used by "Big Tech" to seize the means of production. Perfectly valid complaint! Doesn't this mean that if we regulate big tech out of existence, this concern would be mitigated? Would it then be okay to use LLMs if they are *empowering people* instead of empowering capital?

@benjamineskola

You are implying that my argument is "people should be okay with the bad tool just because there is an idealized version of it", which is a nonsensical strawman.

My argument is "this version of the tool has obvious issues and we shouldn't accept it just because of its upsides. That does not exclude the possibility of continuing to develop a better version of this tool that does not have those issues".

@raphael The whole point is that in the real world every single way of using this tool has these issues, and fantasising about a hypothetical alternate universe where these issues don't exist does nothing to solve them.

@benjamineskola

Every technology has benefits and drawbacks. If merely acknowledging the issues meant abandoning attempts to improve it, we wouldn't do anything at all.

Mind you: this also does not mean that tech corporations get a free pass to keep doing what they are doing. Quite the contrary - like I said elsewhere, I think that any "AI-based" product should be either forced to be copyleft-licensed or 100% taxed.

@raphael The thread was talking right from the beginning about the actual real world and actually-existing implementations. You insist on bringing up imaginary implementations that would hypothetically be free of these problems; it's a complete distraction from a discussion of the real issues affecting the real implementations.

@benjamineskola

"Actual implementations" have issues. That much is clear. Then what?

"Just don't use it" is not a real solution. You are not going to morally lecture 300 million users out of ChatGPT. It's the "abstinence only" policy.

"The government should intervene". Ok. How?

@raphael I don't know that any of what you're saying relates to anything I said or anything that anyone else said in this thread. Have a nice day.

@benjamineskola

I don't know where the thread starts for you, but to me it was https://mastodon.social/@jcoglan/114621032986112711

Seems like you got one part of the conversation, took it out of context and then got set on arguing a point I never tried to make.

@raphael Yes, that's exactly what I'm referring to. None of what you've said seems to relate at all to any of the things that anyone else is saying. I'm not sure why you're so set on defending big tech (which is what you're doing in practice even if you'd like to believe otherwise) but it's gotten boring.

@benjamineskola

How is "any product that uses AI should be copyleft or 100% taxed" in favor of Big Tech?

How is "break all corporations so that no company is bigger than 150 employees" in favor of Big Tech?

@raphael As far as I can tell, this is the first time you've mentioned these things in this thread, and since I'm not capable of reading minds I didn't actually mean that.

Deflecting the conversation from actual real-world LLMs and LLM usage to a hypothetical 'good' LLM only serves the interests of big tech. When someone says "LLMs are bad because they're under the control of big tech", and you reply by claiming that hypothetically a future LLM might not be under the control of big tech, all that does is distract from the real issue by focusing on imaginary problems and imaginary solutions.