AIs aren't sentient. They can't "steal."

Programmers and institutions select the data with which to train the model. They take art and writing from artists and authors without credit or payment. The software then remixes and mimics what it is given.

Displacing agency by attributing intent to the AI is exactly how people and institutions erase human action in the creation of technology. It also leads to further perceptions of technology as acultural, unbiased, and, in essence, magical.

@Manigarm All this controversy has me wondering: why not train the AIs with only art in the public domain? That, at least, would be less problematic, wouldn't it?

Also, they should be made to list and credit their sources, regardless.

@GraysonBell
Because correctly tagging the licence on content that you crawl from the internet does not automate well. Or, philosophically, very much at all.
@Manigarm

@yacc143 @Manigarm But they don't have to blindly crawl the internet. They could set up a database and fill it with public domain art for the AI to crawl instead.

Yes, it would be more work, but from an ethical standpoint it would be worth it.

@GraysonBell @yacc143 @Manigarm

You answered your own question - it would be more work. It would also have a smaller sample size, so AI art would not be as variable.

@sinboy @yacc143 @Manigarm

I did some digging, and the Smithsonian released 2.8 million pieces of art to the public domain in a database anyone can access.

The existence of such databases makes it much less work.

https://www.artdex.com/what-is-public-domain-art-2/

@sinboy @GraysonBell @Manigarm Dramatically more work for much more limited, one-sided training data.

Plus conceding the precedent without even losing in court.

@yacc143 @GraysonBell @Manigarm If it ends up in court, all it will do is encourage artists to not put anything up online anymore, and take down what's there.

@sinboy @GraysonBell @Manigarm Technically, the first suit that will set precedent is already in court (about GitHub's abusing all the cool code they host to train Copilot). OTOH, the specific corporate/legal setup they used to distance Microsoft from the possible fallout might make it less useful as precedent.

@GraysonBell

Honestly, because the approach wouldn't solve the underlying issue. With machine learning algorithms, data scientists can train a model and then actually generate "original works of art" using the same style/flavor/voice as the original.

Essentially, we can teach a robot to think/do the same things we believe Grayson Bell would think/do in the same scenario.

Unlike humans, computers can't make leaps of intuition. So you can know that fire is hot even though you've never touched it. AI/ML can't do that - it pretends by sheer brute force simulation. The computer asks itself twenty million times how something could be and records what answers it came up with and then references that sheet whenever you ask it a question.

The sheer amount of effort from humans to even train the smallest decisions is substantial. Entire companies exist just to make the concept of "I asked an AI to read all of XYZ" possible within a 'reasonable' timeframe.

Ethics always fall behind the technology curve. Business doesn't want to invest in morals - they want to invest in profits. Morals and ethical business practices are what happen AFTER someone gets the first.

@Manigarm

@mentallyalex @GraysonBell @Manigarm
When you get down to the bottom layer, AI is all just adding and subtracting ones and zeroes, really fast.
That's not exactly what goes on in nervous tissue.