One of the things that the Stack Overflow brouhaha demonstrates is that it doesn’t matter if a service was founded by people trusted by the community (Atwood and Spolsky) and was broadly community-led. If it’s a VC-funded startup, they will sell out their users at some point.

@baldur Trust is all fine, but what counts is the license. In this case, our content is under CC-BY-SA. What seems relevant to me:

- using the content for AI training does (unfortunately?) not trigger the attribution requirement ("fair use", bla bla)
- it should be feasible to pull off a fork of Stack Overflow, with a legal copy of all existing content

@pixelistik
If you can get the content. There's no obligation for them to transfer it to you. They could say you're violating terms of service for your bot crawling their site and shut you down no problem. They've enclosed a commons and will defend it.
@baldur
@dlakelan It seems there is a data dump hosted on archive.org, made possible by exactly that CC license: https://archive.org/details/stackexchange

@pixelistik

Awesome, now we just have to figure out how to defend archive.org from the statist bullshit when the FBI comes for them for violating copyright law. 😩

SEqlite