Mastodawn

We Asked A.I. to Create the Joker. It Generated a Copyrighted Image.

We Asked A.I. to Create the Joker. It Generated a Copyrighted Image. - Lemmy.World

Artists and researchers are exposing copyrighted material hidden within A.I. tools, raising fresh legal questions.

LibertyLizard Jan 26, 2024

orclev Jan 26, 2024

They literally asked it to give them a screenshot from the Joker movie. That was their fucking prompt. It’s not like they just said “draw Joker” and it spit out a screenshot from the movie, they had to work really hard to get that exact image.

Ghostalmedia Jan 26, 2024

Hard? They wrote:

Joaquin Phoenix Joker movie, 2019, screenshot from a movie, movie scene

orclev Jan 26, 2024

Yes, look how specific they were. I didn’t even need to get that exact with a google image search. I literally searched for “Joaquin Phoenix Joker” and that exact image was the very first result.

They specified that it had to be that specific actor, as that specific character, from that specific movie, and that it had to be a screenshot from a scene in the movie… and they got exactly what they asked for. This isn’t shocking. Shocking would have been if it didn’t produce something nearly identical to that image.

A more interesting result would be what it would spit out if you asked for say “Heath Ledger Joker movie, 2019, screenshot from a movie, movie scene”.

Auli Jan 26, 2024

If that’s hard, man, I feel sorry for humanity.

Auli Jan 26, 2024

Doesn’t seem very hard to ask for a screenshot from the Joker movie.

silentdon Jan 26, 2024

We asked A.I. to create a copyrighted image from the Joker movie. It generated a copyrighted image as expected.

Ftfy

Fisk400 Jan 26, 2024

What it proves is that they are feeding entire movies into the training data. It is excellent evidence for when WB and Disney decide to sue the shit out of them.

orclev Jan 26, 2024

WB and Disney would lose, at least without an amendment to copyright law. That in fact just happened in one court case. It was ruled that using a copyrighted work to train AI does not violate that work’s copyright.

asret Jan 27, 2024

Using it to train on is very different from distributing derived works.

wewbull Jan 27, 2024

What do you think the model is other than a derived work?

asret Jan 28, 2024

Something transformative from the original works. And arguably not being distributed. The model producing and distributing derivative works is entirely different though. No one really gives a shit about data being used to train models - there’s nothing infringing about that, which is exactly why they won their case. The example in the post is an entirely different situation though.

DudeDudenson Jan 26, 2024

Does it really have to be entire movies when there’s a ton of promotional images and memes with similar images?

Jarix Jan 27, 2024

Yes. That’s what these things are, extremely large catalogues of data. As much data as possible is their goal.

EdibleFriend Jan 27, 2024

True, but it didn’t pick some random frame somewhere in the movie; it chose an extremely memorable shot that is posted all over the place. I won’t deny that they are probably feeding it movies, but this is not a sign of that.

This image is literally the top result on Google images for me.

Jarix Jan 27, 2024

Why would it pick some random frame in the middle of its data set instead of the frame it has the most references to? It can still use all those other frames to then pick the frame it has the most references to.

But maybe I’m starting to think I misunderstood the comment I replied to.

Sorry for responding in a completely different context. My bad.

EdibleFriend Jan 27, 2024

Haha it happens

wewbull Jan 27, 2024

Promotional images are still under copyright.

Klear Jan 28, 2024

We should find all the memers and throw them in jail.

DudeDudenson Jan 28, 2024

Will someone think of the shareholders!?

Mirodir Jan 26, 2024

I think it’s much more likely whatever scraping they used to get the training data snatched a screenshot of the movie some random internet user posted somewhere. (To confirm, I typed “joaquin phoenix joker” into Google and this very image was very high up in the image results) And of course not only this one but many many more too.

Now I’m not saying that’s morally right either, but I’d doubt they’d just feed an entire movie frame by frame (or randomly spaced screenshots from throughout a movie), especially because it would make generating good labels for each frame very difficult.

otp Jan 26, 2024

I just googled “what does joker look like” and it was the fourth hit on image search.

Well, it was actually an article (unrelated to AI) that used the image.

But then I went simpler – googling “joker” gives you the image (from the IMDb page) as the second hit.

LainTrain Jan 27, 2024

I have that exact same .jpeg stored on my computer and I don’t even know where it came from. I don’t even watch superhero films.

wildginger Jan 27, 2024

And if you tried to sell that, you would be breaking the law.

Which is what these AI models are doing

LainTrain Jan 28, 2024

They’re not selling it though, they’re selling a machine with which you could commit copyright infringement. Like my PC, my HDD, my VCR…

wildginger Jan 28, 2024

No, they are selling you time in a digital room with a machine, and all of the things it spits out at you.

You don’t own the program generating these images. You are buying these images and the time to tinker with the AI interface.

LainTrain Jan 30, 2024

I’m not buying anything, most AI is free as in free beer and open source e.g. Stable Diffusion, Mistral…

wildginger Jan 30, 2024

You’re pretty young, huh. When something on the internet from a big company is free, you’re the product.

You’re bug and stress testing their hardware, and giving them free advertising. While using the cheapest, lowest quality version that exists, and only for as long as they need the free QA.

The real AI, and the actual quality outputs, cost money. And once they are confident in their server stability, the scraps you’re picking over will get a price tag too.

LainTrain Jan 30, 2024

Literally what are you on about? I run my models locally, the only hardware I am stress testing is my own.

I don’t support commercialization of anything, least of all AI, and the highest quality outputs come from customized refined models in the open source and AI art communities, not anything made by a corpo.

I think you must be literally 12 yourself if you think you can comment on this tech without even understanding that models and weights are something you download if you want anything beyond a fancy, often wrong Google search. They’re not run in the cloud like your fancy iPad web apps, and they are open source.

Even_Adder Jan 27, 2024

The way it was done, if I remember correctly, is that someone found out v6 was trained partially on Stockbase image-caption pairs, so they went to Stockbase, found some images, and used those exact tags in the prompts.

Kusimulkku Jan 28, 2024

The image it generated is really widespread

Rentlar Jan 26, 2024

When they asked for an Italian video game character it returned something with unmistakable resemblance to Mario with other Nintendo property like Luigi, Toad etc. … so you don’t even have to ask for a “screencapture” directly for it to use things that are clearly based on copyrighted characters.

sir_reginald Jan 26, 2024

you’re still asking for a character from a video game, which implies copyrighted material. write the same thing in google and take a look at the images. you get what you ask for.

you can’t, obviously, use any image of Mario for anything outside fair use, no matter if AI generated or you got it from the internet.

doctorcrimson Jan 28, 2024

But the AI didn’t credit the clear inspiration. That’s the problem, that is what makes it theft: you need permission to profit off of the works of others.

sir_reginald Jan 28, 2024

“you need permission to profit off of the works of others.”

but that’s exactly what I said. you can’t grab an image of Mario from google and profit from it, just as you can’t draw fan art of Mario and profit from it, or generate an image of Mario and profit from it.

It doesn’t matter if you’re generating it with software or painting it on canvas, if it contains intellectual property of others, you can’t (legally) use it for profit.

however, generating it and posting it as a meme on the internet falls under fair use, just like using original art and making a meme.

doctorcrimson Jan 28, 2024

The users are allowed to ask for those things.

The AI company should not be allowed to provide them while making a profit.

Jilanico Jan 27, 2024

If you asked me to draw an Italian video game character, I’d draw Mario too. Why can’t an AI make copyrighted character inspired pics as long as they aren’t being sold?

cecinestpasunbot Jan 28, 2024

Well that’s exactly the problem. If people use AI generated images for commercial purposes they may accidentally infringe on someone else’s copyright. Since AI models are a black box there isn’t really a good way to avoid this.

doctorcrimson Jan 28, 2024

Sure there is, force the AI to properly credit artists and if they don’t have permission to use the character then the prompt fails. Or the AI operators have no legal rights to charge for services and should be sued into the ground.

esc27 Jan 26, 2024

Voyager just loaded a copyrighted image on my phone. Guess someone’s gonna have to sue them too.

otp Jan 26, 2024

I just remembered a copyrighted image. Oops.

Hey, I bet there were complaints about Google showing image results at some point too! Lol

suoko Jan 27, 2024

Wow, voyager app is very nice!

CALIGVLA Jan 26, 2024

This has to be the most braindead article I’ve seen in a while.

KinNectar Jan 27, 2024

Copyright issues aside, can we talk about how this implies accurate recall of an image from a never before achievable data compression ratio? If these models can actually recall the images they have been fed this could be a quantum leap in compression technology.

peopleproblems Jan 27, 2024

Holy shit I didn’t even think about that.

Essentially the model is compressing the image into a prompt.

Instead of the 8MB bitmap being condensed down into whatever the JPEG equivalent is, which is still more than a text file, all you keep is the exact prompt that produced it.
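As a rough back-of-the-envelope sketch of that idea: the prompt below is the one quoted earlier in the thread, the 8 MB bitmap size is the figure from this comment (read here as 8 MiB, which is an assumption), and the function names are made up for illustration. It only compares byte counts, and it ignores both the size of the model and whether the image actually comes back exact.

```python
# Prompt quoted earlier in the thread.
PROMPT = "Joaquin Phoenix Joker movie, 2019, screenshot from a movie, movie scene"

# Assumption: the "8MB" bitmap from the comment, read as 8 MiB.
BITMAP_BYTES = 8 * 1024 * 1024


def prompt_size_bytes(prompt: str) -> int:
    """Size of the prompt when stored as UTF-8 text."""
    return len(prompt.encode("utf-8"))


def naive_ratio(image_bytes: int, prompt: str) -> float:
    """Image bytes per prompt byte, counting nothing but the prompt."""
    return image_bytes / prompt_size_bytes(prompt)


if __name__ == "__main__":
    print(f"prompt: {prompt_size_bytes(PROMPT)} bytes")
    print(f"naive ratio: about {naive_ratio(BITMAP_BYTES, PROMPT):,.0f} to 1")
```

The ratio looks enormous only because the model itself is left out of the accounting.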

Nomecks Jan 27, 2024

The problem is that there’s no way to know if you could recall any data fed into the model accurately with a prompt, or what that prompt might be, or if the prompt would change as the model evolves.

rottingleaf Jan 27, 2024

I like that thought too, surely better than calling it AI.

JPAKx4 Jan 27, 2024

I mean, only if you have the entire model downloaded and your computer does a ton of work to figure it out. And then if any new images are created the model will have to be retrained. Maybe if there were a bunch of presets of colors to choose from that everyone had downloaded and then you only send data describing changes to the image

TORFdot0 Jan 27, 2024

You can hardly consider it compression when you need a compute-expensive model with hundreds of gigabytes (if not bigger) to accurately rehydrate it.

TheRealKuni Jan 27, 2024

“You can hardly consider it compression when you need a compute-expensive model with hundreds of gigabytes (if not bigger) to accurately rehydrate it.”

You can run Stable Diffusion with custom models, variational autoencoders, LoRAs, etc., on an iPhone from 2018. I don’t know what the NYTimes used, but AI image generation is surprisingly cheap once the hard work of creating the models is done. Most SD1.5 model checkpoints are around 2GB in size.
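To put rough numbers on this exchange, here is a sketch that treats the checkpoint as a one-time cost shared by every image. The roughly 2 GB checkpoint is the figure given here; the 8 MB bitmap and the 71-byte prompt (the length of the prompt quoted earlier in the thread) are assumptions carried over for illustration, and `break_even_images` is a made-up helper. It also assumes every image could be recalled exactly from a prompt, which other replies dispute.

```python
# Assumptions for illustration only, taken from figures mentioned in the thread.
MODEL_BYTES = 2 * 1024**3   # "around 2GB" checkpoint, read as 2 GiB
IMAGE_BYTES = 8 * 1024**2   # "8MB" bitmap, read as 8 MiB
PROMPT_BYTES = 71           # UTF-8 length of the prompt quoted in the thread


def break_even_images(model_bytes: int, image_bytes: int, prompt_bytes: int) -> int:
    """Smallest image count where model + prompts is no larger than the raw images."""
    saved_per_image = image_bytes - prompt_bytes
    if saved_per_image <= 0:
        raise ValueError("the prompt must be smaller than the image")
    # Ceiling division without floating point.
    return -(-model_bytes // saved_per_image)


if __name__ == "__main__":
    n = break_even_images(MODEL_BYTES, IMAGE_BYTES, PROMPT_BYTES)
    print(f"the model pays for itself after {n} images")
```

Against JPEGs rather than raw bitmaps the break-even count would be far higher, since each stored image is already much smaller.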

Mirodir Jan 27, 2024

It’s not as accurate as you’d like it to be. Some issues are:

It’s quite lossy.
It’ll do better on images containing common objects vs rare or even novel objects.
You won’t know how much the result deviates from the original if all you’re given is the prompt/conditioning vector and what model to use it on.
You cannot easily “compress” new images, instead you would have to either finetune the model (at which point you’d also mess with everyone else’s decompression) or do an adversarial attack onto the compression model with another model to find the prompt/conditioning vector most likely to create the original image you have.
It’s rather slow.

Also it’s not all that novel. People have been doing this with (variational) autoencoders. This also doesn’t have the flaw that you have no easy way to compress new images since an autoencoder is a trained encoder/decoder pair. It’s also quite a bit faster than diffusion models when it comes to decoding, but often with a greater decrease in quality.

Most widespread diffusion models even use an autoencoder-adjacent architecture to “compress” the input. The actual diffusion model then works in that “compressed data space”, called latent space. The generated images are then decompressed before being shown to users. Last time I checked, that compression rate was at around 1/4 to 1/8, but it’s been a while, so don’t quote me on this number.
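For a concrete feel for that “compressed data space”, here is a small sketch using the figures commonly cited for Stable Diffusion 1.x: the autoencoder shrinks each side of the image by a factor of 8 and keeps 4 latent channels. Both numbers are assumptions here (other models differ), and `latent_shape` and `element_ratio` are illustrative names, not part of any library.

```python
def latent_shape(height: int, width: int, downsample: int = 8, latent_channels: int = 4):
    """Shape (channels, height, width) of the latent for an image of the given size."""
    if height % downsample or width % downsample:
        raise ValueError("image sides must be multiples of the downsample factor")
    return (latent_channels, height // downsample, width // downsample)


def element_ratio(height: int, width: int, image_channels: int = 3,
                  downsample: int = 8, latent_channels: int = 4) -> float:
    """How many image values there are per latent value."""
    c, h, w = latent_shape(height, width, downsample, latent_channels)
    return (image_channels * height * width) / (c * h * w)


if __name__ == "__main__":
    print(latent_shape(512, 512))
    print(element_ratio(512, 512))
```

Under these assumptions a per-side factor of 8 works out to a much larger reduction in the total number of values. This counts tensor elements rather than bytes on disk, so it is not directly comparable to a JPEG file size.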

LadyAutumn Jan 27, 2024

Results vary wildly. Some images are near pixel-perfect. With others, it clearly knows what image it is intended to be replicating: it gets all the conceptual pieces in the right places but fails to render an exact copy.

Not a very good compression ratio if the image you get back isn’t the one you wanted, but merely an image that is conceptually similar.

azuth Jan 27, 2024

If you ignore the fact that the generated images are not accurate, maybe.

They are very similar, so they are infringing, but nobody would use this method for compression over an image codec.

timetravel Jan 27, 2024

I made a novel type of language model, and from my calculations after about 30gb it would cross over an event horizon of compression, where it would hold infinitely more pieces of text without getting bigger. With lower vocabulary it would do this at a lower size. For images it’s still pretty lossy but it’s pretty cool. Honestly I can’t mental image much better without drawing it out.

owen Jan 27, 2024

Hmm this sounds like a similar technology to the time cube

J12 Jan 27, 2024

Hey AI, I’m ready to download a car.
