Pretty fucking bold of these scientists to scrape one of my copyrighted photographs off the internet and then re-release it uncredited under a Creative Commons license because they used it for training data for an algorithm.
Using copyrighted images to train algorithms is a kind of grey area, and I can see some decent arguments in favor of either side.
But you can't act as someone else's agent and distribute their work without permission.
@alexwild I'm a big fan of the capabilities that AI tools are bringing and think that in most cases (excluding overfitting) they're a classic example of the purpose of fair use exemptions.
But you're right, THIS sort of behavior is not at all "that". It's just nicking an uncredited photo for your paper and presenting it as yours.
@nero I actually think all camera equipment should be free.
But it isn't.
@bhawthorne @questauthority Lol, no.
MDPI is a for-profit publisher; this is not classroom use, and it is not educational *about my image*, which is not even credited.
Your argument about "actual organisms in situ" is just dumb, really, since you have no idea. It was a 2 hour studio session I had to arrange, including sets and lighting.
@alexwild @bhawthorne @questauthority that's the main problem in these discussions. The 4 points are just interpretations of how it is handled in the US. But even these rules would be handled differently in Europe.
And that's the main thing with fair use. It's always arguable. And in the case of AI it probably needs a court decision.
Unlikely, IMO. I'm very familiar with the factors. The major distinction from the other thumbnail cases is that those indexed and linked back to the original sources. Absent that, several of the factors are likely to fall out differently.
@alexwild Might be me but it's not a gray area at all. You use a product to create yours, you pay.
You do groceries to create a dish, you pay for the groceries. Same thing.
@alexwild: If they released it as a part of the input dataset, the algorithm grey area doesn't really apply, but copyright law also has exceptions for scientific and research uses, and this particular way of use might actually fall into the grey area surrounding those.
At the very least, I'd think you should be eligible for a proper credit, though. Perhaps write to the authors and describe the situation?
Contact the journal that published the accompanying paper. Most have ethics teams that look into this.
@Gremriel @nero @alexwild The problem with asking on a project like that is that you need like, thousands upon thousands of pictures in order to constitute even a "small" dataset. They don't even have time to curate these things (a lot of porn ends up in them too), because it's not feasible for a small research team to go through each one and check.
When they want to monetize these things though, they really ought to spend the time and money to ethically source: they keep skipping that step.
as is often the case, where the data came from isn't properly documented
@alexwild "available via license: Creative Commons Attribution 4.0 International
Content may be subject to copyright."
Do they have any concept of how a) licenses and b) copyright work?
Yikes - yes, this is very much infringement.
Not even a little bit.
@alexwild Weird way to spell "legally perilous AF"
They added a nice "Content may be subject to copyright", but didn't think about what those words mean?
I assume demand letter incoming, along with an offer to settle for statutory damages?
@alexwild IANAL but I think as long as they only re-distribute it as a thumbnail (technically a citation) in the paper it's legally covered by the "research and scholarship" exceptions to copyright.
However, when it comes to the use as "training data" in commercial settings, I think we're in completely uncharted waters.
Probably thought no one would notice.