@baldur Trust is all fine, but what counts is the license. In this case, our content is under CC-BY-SA. What seems relevant to me:
- using the content for AI training does (unfortunately?) not trigger the attribution requirement ("fair use", bla bla)
- it should be feasible to pull off a fork of Stack Overflow, with a legal copy of all existing content
"using the content for AI training does (unfortunately?) not trigger the attribution requirement"
No matter what Creative Commons are hallucinating, SO/SE are sidestepping this entirely by relying on the commercial dual licence in the ToS, which they sneakily did not limit to what is needed to run the platform itself.
@pixelistik @baldur I don’t understand what you’re asking/saying?
There’s a public data dump of SO/SE under CC-BY-SA which people can use.
(Codidact imported some sites early on but later found that untenable; active Q&A sites apparently tend to work better if they only have their own, active content. @amin has done something to search these dumps, and I’d not mind having that separate.)