All it would take for AI to completely collapse is a ruling in the US saying these companies have to license the content they used to train these tools.

They simply would never reach a sustainable business model if they had to fairly compensate all the people who wrote, drew, edited, sang or just created the content they use.

Simply being forced to respect attribution and licenses would kill them. Will that ruling ever happen? Maybe not. Should it? I think so.

@thelinuxEXP
They would just move to other language corpuses, no?
@lepapierblanc They would either have to pay the people who make the content, or use completely copyright-free / license-free material, which would basically render them pretty useless.
@thelinuxEXP What if they train it on a Chinese-language corpus? With some Chinese state license they would be shielded from copyright claims.
@lepapierblanc @thelinuxEXP The PRC doesn't own the Chinese language. Plenty of people outside mainland China use it. If there's any chance the corpus contains anything any of them wrote, it'd be the same problem again for the LLM companies.

@lepapierblanc @thelinuxEXP

There are plenty of English corpora that are either publicly available or for which the license holders would likely be happy to partner for limited use cases. This is not some doomsday scenario for ML, it's just a doomsday scenario for Big AI.