Mastodawn

ThefuzzyFurryComrade Aug 3, 2025

On Exceptions

Limonene Aug 3, 2025

Generative AI and their outputs are derived products of their training data. I mean this ethically, not legally; I’m not a copyright lawyer.

Using the output for personal viewing (advice, science questions, or jacking off to AI porn you requested) is weird but ethical. It’s equivalent to pirating a movie to watch at home.

But as soon as you show someone else the output, I consider it theft without attribution. If you generate a meme image, you’re failing to attribute the artists whose work trained the AI without permission. If you generate code, that code infringes the numerous open source licenses of the training data, by failing to attribute it.

Even a simple Lemmy text post generated by AI is derived from thousands of unattributed novels.

gmtom

No, gen AI pictures are not derived works of their training data. They are separate processes. The algorithm that actually generates the image has no knowledge of the training data.

petrol_sniff_king Aug 4, 2025

The numeric weights are derived from the training material it previously ate: they’re not extricable.

gmtom Aug 4, 2025

The algorithms involved in the actual creation of the images are not the ones actually trained on the data. So it's not at all accurate to claim they are derived.

petrol_sniff_king Aug 4, 2025

Are you arguing that the training process has no effect on the output of the model? What on earth are they doing it for, then?

gmtom Aug 4, 2025

Not directly, no.

The training data trains an algorithm that effectively just describes an image it sees (which BTW is super useful for blind people) and gives a score for each keyword.

Then the actual generative part takes a random background, tries to denoise it into something recognisable, then shows it to the first algorithm, which gives it a score on how closely it resembles the prompt. Then it does some fancy maths, performs another denoising cycle and gets another score from the first algorithm, more maths, another cycle, etc., until it spits out an image that matches the prompt.

So the algorithm that generates the image has no data from the training process whatsoever.
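
A toy sketch of the loop as described in this comment, not of how any real image generator works. Everything in it is an assumption made for illustration: the "image" is just four numbers, `SCORER_WEIGHTS` is a made-up stand-in for whatever the trained scorer learned, and the "fancy maths" is replaced by plain random-search hill climbing (keep a random change only if the score improves). In this sketch the generating loop touches the trained part only through the score it gets back.

```python
import random

# Hypothetical stand-in for the trained "scorer". In a real system these
# numbers would come out of training; here they are invented for the example.
SCORER_WEIGHTS = {
    "cat": [0.9, -0.2, 0.4, 0.1],
    "dog": [-0.3, 0.8, 0.1, 0.5],
}


def score(image, keyword):
    """Return how closely `image` matches `keyword` (higher is closer)."""
    weights = SCORER_WEIGHTS[keyword]
    # Negative squared distance to the keyword's weight vector.
    return -sum((p - w) ** 2 for p, w in zip(image, weights))


def generate(keyword, steps=200, step_size=0.1, seed=0):
    """Start from random noise and keep only changes the scorer prefers."""
    rng = random.Random(seed)
    image = [rng.uniform(-1, 1) for _ in range(4)]  # the "random background"
    best = score(image, keyword)
    for _ in range(steps):
        # Propose a small random change (stand-in for one denoising cycle).
        candidate = [p + rng.gauss(0, step_size) for p in image]
        candidate_score = score(candidate, keyword)
        # Keep the change only if the scorer rates it higher.
        if candidate_score > best:
            image, best = candidate, candidate_score
    return image, best


if __name__ == "__main__":
    image, final_score = generate("cat")
    print(image, final_score)
```

Swapping the values in `SCORER_WEIGHTS` changes what `generate` converges towards, even though `generate` itself never reads them directly.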

petrol_sniff_king Aug 4, 2025

So the algorithm that generates the image has no data from the training process whatsoever.

It gets a, uh, score. You wrote that yourself, I don’t know how you could forget.

gmtom Aug 4, 2025

But that's not the same as a derivative. It's like saying a chart of which art styles were most popular in each decade is a derivative of every work in that survey, because those works were used to create the data being presented.