Great article by the ABC's Jack Ryan talking about #generativeAI and its #copyright consequences - especially how the existing #law has failed to keep pace with #EmergingTech

My take is that we need to start thinking about #DataTrusts - where #data is kept in trust and released only for the purposes approved by data owners - and which could facilitate payment for that data ...

https://www.abc.net.au/news/science/2023-11-29/artificial-intelligence-ai-training-datasets-copyright-books3/103157980

ChatGPT and other AI models were trained on copyrighted books. Can they be 'untrained'?

Training AI models on copyrighted materials has led to an array of high-profile lawsuits against developers such as OpenAI and Meta. Can the models be "untrained" or is the genie out of the bottle?

ABC News
@KathyReid You mean, "IP rights" only matter when the right people are getting rich off of them, so "AI" has stolen everything it can get its cyber-hands on.

@KathyReid Side note, not on topic: If we could apply the “data trust” principle to PII as well, a few of the most offensive industries would disappear pretty damn quick.

I like the idea.

@drowsygeek yes! And it would allow us to build trust with companies over time as they prove themselves worthy of trust, or remove privileges if they prove themselves unworthy.

@KathyReid Surely "data" is not synonymous with created content? The article is clearly about the latter.
That said, I'd be the first to admit I haven't followed how copyright law has been applied to data. I'm probably assuming little has changed since the court case over the Australian White Pages some decades ago.
IIRC that established that collections of data could be copyrighted, but not the data within them.

@geraldew this is a really good point - the distinction between created content, data, and what counts as "tokens" for large language models and other generative AI.

At what granularity can works be copyrighted?

Thank you - more to ponder!
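For a concrete sense of the granularity question: language models don't ingest whole works, they ingest short subword fragments mapped to integers. A toy sketch below (this is not any real tokenizer - production models use learned byte-pair encodings - the 4-character chunking is purely illustrative):

```python
# Toy illustration only: real LLM tokenizers use learned subword
# vocabularies (e.g. byte-pair encoding), not fixed-length chunks.
def toy_tokenize(text):
    """Split text into word fragments of at most 4 characters each,
    mimicking the sub-word granularity at which models see text."""
    pieces = []
    for word in text.lower().split():
        while word:
            pieces.append(word[:4])
            word = word[4:]
    return pieces

print(toy_tokenize("Copyright protects expression"))
# → ['copy', 'righ', 't', 'prot', 'ects', 'expr', 'essi', 'on']
```

The point of the sketch: by the time a work reaches the model, it has been shredded into fragments far below the granularity at which copyright normally attaches, which is part of why "untraining" is so hard to reason about.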