@baldur Trust is all fine, but what counts is the license. In this case, our content is under CC-BY-SA. What seems relevant to me:
- using the content for AI training does (unfortunately?) not trigger the attribution requirement ("fair use", bla bla)
- it should be feasible to pull off a fork of Stack Overflow, with a legal copy of all existing content
Awesome, now we just have to figure out how to defend archive.org from the statist bullshit when the FBI comes for them for violating copyright law. 😩
using the content for AI training does (unfortunately?) not trigger the attribution requirement
No matter what Creative Commons are hallucinating, SO/SE are sidestepping this entirely by relying on the commercial dual licence in the ToS, which they sneakily did not limit to what is needed to run the platform itself.
@pixelistik @baldur I don’t understand what you’re asking/saying?
There’s a public data dump of SO/SE under CC-BY-SA which people can use.
(Codidact imported some sites early on but later found that untenable; apparently, active Q&A sites tend to work better if they only have their own, active content. @amin has done something to search these dumps, and I’d not mind having that separate.)