AIs aren't sentient. They can't "steal."

Programmers and institutions select the data with which to train the model. They take art and writing from artists and authors without credit or payment. The software then remixes and mimics what it is given.

Displacing agency by attributing intent to the AI is exactly how people and institutions erase human action in the creation of technology. It also leads to further perceptions of technology as acultural, unbiased, and, in essence, magical.

@Manigarm All this controversy has me wondering: why not train the AIs only on art in the public domain? That, at least, would be less problematic, wouldn't it?

Also, they should be made to list and credit their sources, regardless.

@GraysonBell @Manigarm Because correctly tagging the licence on content that you crawl from the internet does not automate well. Or, philosophically, very much at all.

@yacc143 @Manigarm But they don't have to blindly crawl the internet. They could set up a database and fill it with public domain art for the AI to crawl instead.

Yes, it would be more work, but from an ethical standpoint it would be worth it.

@GraysonBell @yacc143 @Manigarm

You answered your own question: it would be more work. It would also mean a smaller sample size, so the AI art would not be as varied.

@sinboy @yacc143 @Manigarm

I did some digging, and the Smithsonian released 2.8 million pieces of art to the public domain in a database anyone can access.

The existence of such databases makes it much less work.

https://www.artdex.com/what-is-public-domain-art-2/

@sinboy @GraysonBell @Manigarm Dramatically more work for much more limited, one-sided training data.

Plus conceding the precedent without even losing in court.

@yacc143 @GraysonBell @Manigarm If it ends up in court, all it will do is encourage artists to not put anything up online anymore, and take down what's there.

@sinboy @GraysonBell @Manigarm Technically, the first suit that will be setting precedents is already in court. It's about GitHub abusing all the cool code they host to train Copilot. OTOH, the specific corporate/legal setup they used to distance Microsoft from the possible fallout might make it less useful as precedent.