AI is a lot like the fossil fuel industry: it seizes and burns something (in this case the internet, and more broadly, written-down human knowledge) that was built up over a long time, far faster than it could ever be replenished.
I think a lot of people who are skeptical of the criticisms here really don't understand how it's *burning* anything, because they haven't thought about how value is derived from provenance.
Suppose you have a $100M batch of medicine, but an inside saboteur has put poison in a few bottles and you can't determine which ones. How much is your inventory worth? $0, or less, since you have to dispose of it too. This is because the value depended on the provenance: having a reasonable basis to believe it's what it appears to be.
For a lower-stakes example, why is the organic produce in the grocery store more expensive? If you took the labels off and mixed it all in with the rest, it wouldn't be. The value at sale time is derived from a meticulous record-keeping process that makes faking provenance comparable in cost to just doing it right.
Provenance of human-written knowledge comes from a lot of places. Just because something was written by a human doesn't make it accurate or non-garbage. But the labor cost of producing misinformation that's hard to distinguish from meaningful writing in the same domain, together with a lot of systems we have in place, makes evaluating provenance a tractable problem.
The ability to produce unlimited amounts of plausible-looking garbage at essentially no cost, and to crowdsource that kind of vandalism to millions of randos by disguising it as something fun, destroys that tractability. It's a DDoS attack on written knowledge.

I've seen folks arguing that good and accurate info can come out of "AI" too, so we can't dismiss it as garbage.

This misses the point entirely.

Even if "AI" "says" something 100% accurate, the provenance is still garbage. It's like a broken clock. It's like waiting for the nazi to say something non-offensive and saying "wow they're at least right about some things".

And quite coincidentally, all this comes right at the moment Elon burned down the greatest tool we had for real-time vetting of information across a vast range of knowledge domains through a shared domain-knowledge trust graph. We're still a long way from rebuilding that here.

So how do we move forward? We can't entirely put this shit back in the shitter. The models are large, but small enough for bad actors to keep and continue using even if we somehow banned them.

But there's a lot we can do...

We can recognize the people pushing this shit as charlatans, enemies of humanity, rather than the geniuses they want us to see them as. We can stop falling for their scam of the day. We can organize to tear down their power, devalue their wealth, deliver them consequences.
We can preserve and strengthen our standards for provenance of knowledge. In particular, open and community-based projects that deal with knowledge can clearly and unequivocally ban model-generated bullshit and the users who try to sneak it in.

@dalias this is a decent point about LLMs and AI, but it's going to be solved within the year by the research labs, then probably another 6 months before it's rolled into the FOSS/commercial AI tools

There's already been decent work on figuring out where LLMs got their info from; the next step is understanding why they used those sources, then training them to discern which sources to value

@Techronic9876 @dalias solved in a year? That's an excellent joke! Maaaybe some good detection techniques would be viable/scalable in that sort of time range, but this is a social problem, not just a technical one. Look at the whole culture around fact checking and misinformation labelling and suppression. It wasn't doing that well with human-scale misinformation and bullshit, never mind the scale possible with the current generation of LLMs, and that BS had no trouble finding an audience, so...

@alsothings @dalias there's already really good work on arXiv on identifying which documents an LLM's output comes from, other work on letting LLMs know the probability of their tokens explicitly, and further work on structuring the output as a system of agent LLMs

If you put all this together you have an AI that can explain itself and explain other things, down to the sources & other possibilities
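(An illustrative aside, not from the thread: the "probability of tokens" ingredient above is the one piece that's easy to show concretely today. Here's a minimal sketch using the Hugging Face transformers API, where the model choice and the 0.5 confidence threshold are arbitrary assumptions on my part; this is nothing like the arXiv attribution work being referenced, just the smallest runnable version of surfacing a model's own confidence next to its output.)

```python
# Minimal sketch: pair each generated token with the probability the
# model assigned it, so low-confidence spans can be flagged rather than
# presented as fact. Model name and threshold are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM would do here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=5,
        do_sample=False,              # greedy, for reproducibility
        return_dict_in_generate=True,
        output_scores=True,           # keep per-step logits
    )

# out.sequences includes the prompt; slice off only the new tokens.
gen_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]
for tok, step_scores in zip(gen_tokens, out.scores):
    prob = torch.softmax(step_scores[0], dim=-1)[tok].item()
    flag = "  <-- low confidence" if prob < 0.5 else ""
    print(f"{tokenizer.decode([tok.item()])!r}: p={prob:.3f}{flag}")
```

Of course, a token-level probability readout is a long way from provenance: it tells you how sure the model is, not where the claim came from or whether it's true.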

I’ll be surprised if someone doesn’t have a working demo of this by fall, & an OSS project by next spring

@Techronic9876 @dalias yeah, I don't doubt your timing on the narrow technical issues you're talking about. The thing that compelled me to reply in the first place was the bit of (probably unintentional) rhetorical fun where you declared the problematic deployment and use of modern LLMs would be *solved* in about 18 months, rather than just _providing_ a means to opt in to better systems. This is, as I said, hilariously far away from 'solving' the systemic problems enabled by these tools
@Techronic9876 @dalias In contrast, I think the systemic problems around information and knowledge sharing/discovery that we are all now faced with are _generational_ in timescale, but I would just love to be wrong