Did you realize that we live in a reality where SciHub is illegal, and OpenAI is not?
@yabellini SciHub makes papers public that are behind paywalls. I agree that they shouldn't be behind paywalls, but that's completely different from what OpenAI does.
I think they mostly used sources that are public anyway, like Wikipedia. They also didn't publish those sources; they trained an AI on them, and it creates new texts. So in a way they made a remix. Remixes are handled differently in copyright law.
"The corpus [GPT-2] was trained on, […] 40 [GB] of text from URLs shared in Reddit" https://en.wikipedia.org/wiki/OpenAI

@skylarkingmullet @yabellini So they sued OpenAI. Well, people also sued the government over the mask mandates during Corona. Just because someone sues someone doesn't mean they are right. Let's wait and see what the judges say. The second part seems to be about data protection, not copyright.