What will LLM owners do when they've destroyed or hollowed out all human sources of knowledge and places of education/research?

What happens when they have no more human blood to suck on to prevent model collapse?

More importantly, what will society do if it builds its structures around stealing knowledge from people until the people give up producing knowledge?

(And this is ignoring LLMs' impact on our climate due to their vast energy use.)

#AI #LLMs

If by “#LLM owners” you mean the owners of the corporations that produce those models @FediThing, I think they won't notice nor much care about that threshold.

Their primary internal directive is evidently “jam as much data into the training funnel as possible, doesn't matter where it comes from”. Given this directive, their underlings are already deploying aggressive botnets to scrape the entire web, repeatedly, without heed for any resource limits.

https://lwn.net/Articles/1008897/

So I think they'll just keep pressing that accelerator, and not really notice nor much care when the web is dead.

@bignose

What do they do when they run out of stuff to scrape because they've destroyed it? Or if the quality of what is left is negligible?

The death of the web won't mean the *absence* of the web; it just means it'll predominantly be auto-generated slop. So, in that scenario, they won't run out of stuff to scrape.

As for the quality of what's on that web? They show no sign at all of caring about the quality of what's on *today's* web, they just scrape all of it continually. I don't think they'll notice nor care when it's mostly slop; they'll just endlessly scrape the dead web and feed their machine on it.

Will that make a difference? They don't seem to care that today's #LLM output is mostly crap, so it's hard to say why they'd care if it declined.

@FediThing

@bignose

"The death of the web won't mean the *absence* of the web; it just means it'll predominantly be auto-generated slop. So, in that scenario, they won't run out of stuff to scrape."

Then you get model collapse though, if there's no human input any more?

@FediThing, yes. And the people who can see that coming have (by intentional arrangement) no connection to the people with their foot on the accelerator. So the awareness that they're heading for model collapse will not be sufficient to stop them.