AI is a lot like the fossil fuel industry: it seizes and burns something (in this case the internet, and more broadly, written-down human knowledge) that was built up over a long time, much faster than it could ever be replenished.
I think a lot of people who are skeptical of the criticisms here don't understand how it's *burning* anything, because they haven't thought about how value is derived from provenance.
Suppose you have a $100M batch of medicine but an inside saboteur has put poison in a few bottles and you can't determine which ones. How much is your inventory worth? $0 - or less, since you have to dispose of it too. This is because the value depended on provenance - having a reasonable basis to believe it's what it appears to be.
For a lower stakes example, why is the organic produce in the grocery store more expensive? If you took the labels off and mixed it all together with the rest, it wouldn't be. The value at sale time is derived from a meticulous record keeping process that makes faking provenance comparable in cost to just doing it right.
Provenance of human written knowledge comes from a lot of places. Just because something was written by a human doesn't make it accurate or non-garbage. But the labor cost of producing misinformation that's hard to distinguish from meaningful writing in the same domain, together with a lot of systems we have in place, makes evaluating provenance a tractable problem.
The ability to produce unlimited amounts of plausible-looking garbage at essentially no cost, and to crowdsource that kind of vandalism to millions of randos by disguising it as something fun, destroys that capability. It's a DDoS attack on written knowledge.

I've seen folks arguing that good and accurate info can come out of "AI" too, so we can't dismiss it as garbage.

This misses the point entirely.

Even if "AI" "says" something 100% accurate, the provenance is still garbage. It's like a broken clock being right twice a day. It's like waiting for the Nazi to say something non-offensive and concluding "wow, they're at least right about some things".

And quite coincidentally, all this comes right at the moment Elon burned down the greatest tool we had for realtime vetting of information across a vast range of knowledge domains through a shared domain-knowledge trust graph. We're still a long way from rebuilding that here.

So how do we move forward? We can't entirely put this shit back in the shitter. The models are large but tractable for bad actors to keep and continue using even if we somehow banned them.

But there's a lot we can do...

We can recognize the people pushing this shit as charlatans, enemies of humanity, rather than the geniuses they want us to see them as. We can stop falling for their scam of the day. We can organize to tear down their power, devalue their wealth, deliver them consequences.
We can preserve and strengthen our standards for provenance of knowledge. In particular, open and community based projects that deal with knowledge can clearly and unequivocally ban model-generated bullshit and users who try to sneak it in.

@dalias this is a decent point about LLMs and AI, but it’s going to be solved within the year by the research labs, then it’ll probably take another 6 months to be rolled into the FOSS/commercial AI tools

There’s already been decent work on figuring out where LLMs got their info from; the next step is understanding why they used those sources, then training them to discern which sources to value

@Techronic9876 @dalias

Who's got the time for all that, though? And what about the fact that the well of information future AIs draw from is forever polluted by the previous generations?

More importantly, why wasn't the lack of sourcing seen as an issue before the fact, rather than afterward? Every authoritative source in history had footnotes, references, etc. In the digital realm, even Wikipedia has references. So why did the big brains developing AI not take provenance into account?

@darrelplant @dalias because researchers didn’t know LLMs would be able to chat. This was an emergent capability. They weren’t trying to build a chat bot, they were trying to build special-purpose sentiment-analysis/grammar/translation tools, and chatting took everyone by surprise. LLMs were essentially an accident

Now that they know LLMs can do zero-shot and one-shot learning, they’re working very hard on the provenance/explainability/alignment questions

@Techronic9876 @dalias

Pretty sure they knew they would be able to chat before they released products with names like "ChatGPT".

I've been watching attempts at chatbots develop since the late 70s. If the people building tools to write text based on language data had no inkling that their tools could fake holding a conversation, then they are very, very stupid people.

@darrelplant @dalias OpenAI releasing ChatGPT was hugely controversial, and still is, among the people who actually discovered LLMs. OpenAI didn’t build ChatGPT; they just commercialized it. But once something is published research, anyone can use it

Looking at how industries & governments reacted, I don’t think anything would have stopped someone from commercializing LLMs before they were ready. The best we can do now is harass/regulate new entrepreneurs so they don’t repeat that

@Techronic9876 @darrelplant LLMs were not "discovered". All this was known over 50 years ago. They just lacked access to the volume of text and GPUs to implement it, which capitalist asshats "solved" by scraping everyone's stuff without license or consent.
@Techronic9876 @darrelplant That LLMs do what they do is not surprising at all to anyone with a basic understanding of probability and statistics, and really shouldn't have been a century ago, either.

@dalias @Techronic9876

I don't know, after all the science-fiction I read and watched, I'm really kind of surprised at how bad they are. It's 2023! Where's my jetpack?

@darrelplant @Techronic9876 I mean if you know how they work it's not surprising.

It's also why sci-fi authors never envisioned "AI" as LLMs - because they're such a ridiculously dumb, obviously "fake" way to do AI, with no intelligence whatsoever.

@dalias

The programming and data storage details of the positronic brains of my youth were never really specified. I mean, we were still working with punch cards. It was assumed it was going to be something more sophisticated than punch cards and glossed over. "Big, dumb database search" wasn't a thing.

@darrelplant You could clearly see from how it was depicted that it was parsing language and applying logical rules, not behaving like a glorified Markov bot.
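(A hypothetical aside, not part of the thread: the "glorified Markov bot" jab refers to Markov-chain text generators, which pick each next word purely from counts of which words followed which in the training text, with no parsing or logic at all. A minimal sketch, with a toy corpus and function names of my own invention:)

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for cur, nxt in zip(words, words[1:]):
        chain[cur].append(nxt)
    return chain

def generate(chain, start, n, seed=0):
    """Walk the chain: sample each next word from what followed the current one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(n - 1):
        followers = chain.get(out[-1])
        if not followers:  # dead end: word never had a successor in the corpus
            break
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the rug"
chain = build_chain(corpus)
print(generate(chain, "the", 8))  # locally plausible, globally meaningless text
```

The output is statistically shaped like the corpus but carries no understanding, which is the analogy being drawn (LLMs condition on far more context than one preceding word, but the objection is that the mechanism is still next-token statistics).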
@dalias
Considering that LLMs can't even stick to the rules expressed in a 19-word, one-sentence instruction, I'm not sure I'd really want to trust one "governed" by Asimov's Three Laws.