"…a damning new study could put #AI companies on the defensive. In it, #Stanford and #Yale researchers found compelling evidence that #AImodels are actually copying all that data, not “learning” from it. Specifically, four prominent LLMs — OpenAI’s GPT-4.1, Google’s Gemini 2.5 Pro, xAI’s Grok 3, and Anthropic’s Claude 3.7 Sonnet — happily #reproduced lengthy excerpts from #popular — and #protected#works, with a stunning degree of #accuracy."

https://futurism.com/artificial-intelligence/ai-industry-recall-copyright-books

Researchers Just Found Something That Could Shake the AI Industry to Its Core

Researchers found compelling evidence that AI models are actually copying copyrighted data, not "learning" from it.

Futurism

@josemurilo
Of course they are. It's a total lie to use the phrase "learning". It's analogous to a distributed data flow database. It's why it needs so much RAM for reasonable performance.

They are plagiarism machines that only give useful results when regurgitating.

Better real search engines pointing to real source would be honest and more useful.

Simple fines or compensation isn't enough. They need to be opt-in only for content and all illicitly obtained content / models deleted.