With the Reddit thing, it'd be easy to miss what's going on on StackOverflow. Basically:

1. Site owners allowed LLM-generated content.

2. Mods are on strike.

3. Also, it turns out that back in March SO turned off the Creative Commons data pipe, which backs up the site to the Internet Archive, in an attempt to keep SO from being used as training data.

https://www.vice.com/en/article/4a33dj/stack-overflow-moderators-are-striking-to-stop-garbage-ai-content-from-flooding-the-site

https://meta.stackexchange.com/questions/390106/moderation-strike-update-data-dumps-choosing-representatives-gpt-data-and-wh


@mttaggart Great breakdown, thank you!
@mttaggart The Internet is going through some rough changes.
@mttaggart what is going on with these sites lately? People have lost their minds.

@mttaggart There is an open source alternative to SO people should consider:
https://codidact.com/

Non-profit behind it:
https://codidact.org/

@viktor Thanks! I've been looking for a noncommercial SO alternative for some time now. @mttaggart
@viktor ooh, does it federate? 👀 (not sure if I want it to, haha)
@lukas No, I don't think it is federated.
GPT on the platform: Data, actions, and outcomes

In a meeting with some moderators last week, I committed to releasing the data sets from our initial studies around the efficacy and false positive rates of ChatGPT detectors to them. Tuesday after...

Meta Stack Exchange
@mogul I believe this is addressed in the post I linked. Specifically that the methodology is suspect.
@mttaggart Oh I missed your second link! Sorry about that.
@mttaggart Did all the tech billionaires just catch the same virus?
@mastodonmigration I would venture that the same brainworms that lead one to be morally bankrupt enough to become a billionaire also lead to this behavior.
@mttaggart @mastodonmigration I think the VC oligarchs are putting, as you say, "brainworms" of fear and greed into those they might have bailed out in the past.
@mastodonmigration they had it for a long time already. It just wasn’t profitable to show that. @mttaggart
@mastodonmigration @mttaggart Remember that bank crash a few months back… could be echoes of SVB being put into production.

@mttaggart

Maybe there needs to be a CC variant license that doesn't allow use in training data.

@mttaggart @SecurityWriter it would seem that unreal2's flak cannon has come to life as LLM technology, and it's resulted in hot molten scrap metal being blasted around the internet

@mttaggart it's an interesting battle between the AI companies and other tech companies/organisations with a lot of casualties.

AI companies make a ton of money and scrape the entire internet for training material.

Tech companies see an increase in cost for infrastructure due to scraping and a lot of opportunity cost due to the AI financial bubble.

The casualties are users, their content and everyone using APIs.

Which is weird. It's all a battle about money.

@mttaggart I am sympathetic to this move.