Everyone should immediately stop contributing to Stack Overflow and its network. The human touch is what made it unique. Delete your profile from SO AND all your answers. Freeloaders are making money out of human contributions.
OpenAI, Microsoft, and Google are together killing the open web. Thousands of independent blogs and forums are now nowhere to be found in search engines, or are pushed back to page two, to support their AI and partnerships with Reddit, StackOverflow, and more. Many humans contributed to these sites hoping to build a knowledge base for humanity, but now greedy people like Sama and OpenAI are taking over everything.
@nixCraft Is there any way to find these sites? You are correct in that they disappeared but there must be a way to find them. Hopefully 🤞
@Pajo_16 @nixCraft On page two. On page 5 you can find bad SEO optimizing agencies.
@Pajo_16 @nixCraft Maybe the Internet Archive (https://archive.org)?

@Pajo_16 @nixCraft I've heard people saying that they're adding "before:2023" to their searches to filter out a lot of the ai garbage. Will only help as long as the search topic is somewhat timeless, obviously.
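For anyone who wants to script the "before:2023" trick, here is a minimal sketch in Python. It only builds the search URL. The function name `dated_search_url` and the default base URL are assumptions for illustration, and whether a given search engine actually honors the `before:` operator is not guaranteed.

```python
from urllib.parse import urlencode


def dated_search_url(query, before="2023", base="https://www.google.com/search"):
    """Build a search URL with a before: date operator appended to the query.

    `base` is an assumed search endpoint; swap in whichever engine you use.
    """
    # Append the date operator exactly as it would be typed in the search box.
    full_query = f"{query} before:{before}"
    # urlencode escapes spaces and the colon so the URL is safe to open.
    return f"{base}?{urlencode({'q': full_query})}"


if __name__ == "__main__":
    print(dated_search_url("bash trap signal example"))
```

Opening the printed URL in a browser should be equivalent to typing the query with "before:2023" by hand.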
@marcusdeh @nixCraft
Folks, thanks for all the suggestions. It's appreciated.

@marcusdeh @Pajo_16 @nixCraft Even that seems shaky these days - I'll try putting date restrictions on searches and regularly get stuff that the results page says is a week old, but actually dates from 2011. (Or vice versa.)

I'm not sure if they're just not respecting search syntax, something's breaking on the search engines' side of things, or if people are figuring out ways to make pages appear to be a different age than they actually are.

@Pajo_16 @nixCraft I can recommend the search engine #Kagi. They have a different index, one can put a weight on specific domains and they have a project called small web to randomly find those gems.
@ChristianKrebel @Pajo_16 @nixCraft isn't Kagi in on the AI bullshit? I wouldn't trust them not to screw everyone over once they get popular enough.
@p4 @Pajo_16 @nixCraft well, they have integrated AI, but mostly on demand. Most of the time you will have to trigger an AI feature yourself. Also, they have an assistant where you can choose the models you want to use (APIs are more anonymous), and the answers will have sources from their index, which is the best of both worlds IMHO.

@Pajo_16 @nixCraft

If you're looking for blogs and other personal sites I recommend bookmarking these search engines:
- https://search.marginalia.nu/
- https://ichi.do/
- https://clew.se/
- https://searchmysite.net/
- https://wiby.me/

I'd also recommend checking out this excellent piece on alternative search options: https://seirdy.one/posts/2021/03/10/search-engines-with-own-indexes/

Marginalia Search

search.marginalia.nu is a small independent do-it-yourself search engine for surprising but content-rich websites that never ask you to accept cookies or subscribe to newsletters. The goal is to bring you the sort of grass fed, free range HTML your grandma used to write.


@jeruyyap @Pajo_16 @nixCraft Support the micro-brew search engines that are trying to make Search Engines Not Suck Again!

(new slogan #SENSA)

@jeruyyap @Pajo_16 @nixCraft

Happy to be mentioned! ;)

Clew is very beta at the moment but I've just started my summer break so I should have some good time for dedicated work on it. :)

@jeruyyap @Pajo_16 @nixCraft Kagi has a great “small web” search filter too
@jeruyyap @Pajo_16 @nixCraft I've captured a bunch of search engines and other sites dedicated to exploring the #IndieWeb here https://shellsharks.com/indieweb#explore-the-indieweb
IndieWeb Assimilation

An introduction to the IndieWeb, with a lot of bonus resources. Includes lists of interesting webrings, IndieWeb search engines, slash page directories, hosting platforms, and an assortment of other delightful things from across the human web.


@Pajo_16 @nixCraft

Blogrolls are a great option. :)

Mine's here if you want a good starting place: https://benjaminhollon.com/blogroll/

Then many of those sites have their own blogrolls; and so on and so on and so on.

Blogroll

Some things I follow from around the web.

@Pajo_16

> Is there any way to find these sites?

One alternative, independent search engine is #Mojeek, which has its own index. Using it, you may be able to find things that Google/Microsoft decided to remove from their search results: https://www.mojeek.com/
@nixCraft

Mojeek

Mojeek is a web search engine that provides unbiased, fast, and relevant search results combined with a no tracking privacy policy.

@nixCraft
Take a look at #Facebook for 10 minutes and you'll see what a true #AI web looks like. It's a barren wasteland that no one interacts with. Let them have their fun with these sites, knowing it's just going to be bots talking to AI. #google search is what's dead, not the #openweb.

@Ryan @nixCraft

What search do you use?

I like(d) DuckDuckGo, but with Bing in the background it is getting worse.

@earl @nixCraft I have been using #searxng which combines results from multiple sources.
@nixCraft The smaller, human-run web needs to come back via Mastodon, old-school forums and others. Let the big companies have their AI-powered Dead Internet, primarily away from the rest of us.
@nixCraft this is honestly why I’ve lost interest in #tech, #computers and the #internet as a whole. It’s sad, I used to find it all so intriguing to learn and I used to even make videos about it. I really just don’t care anymore, it feels impossible to find anything organic these days.
@blakehensley @nixCraft i miss when "AI" meant like the way in which a video game character walks around the map or something..

@nixCraft

Tech corporations are strip-mining the commons in every possible way. It's despicable 😡

What will they do when they've finished this process? What will be left? It's unsustainable in the long term.

https://en.wikipedia.org/wiki/Surface_mining


@nixCraft It is a common problem, the same problem that burnt the Library of Alexandria to ashes: knowledge is made a commodity because it is invaluable, and trading it can be a goose that lays golden eggs. The thing is, the destruction of open knowledge will not stop the spread of knowledge, because of the works that are placed under permissive licenses like Arte Libre, Creative Commons and many, many others. Do you think there isn't heavy lobbying to banish permissive licensing? Sure there is. But on the other hand, we have heavy lobbying to make knowledge commonly available to the masses. And this is complex, because every single bit of data that we put on the internet is subject to data mining, Business Intelligence and many other things that may violate the good faith of permissive licensing.
Shall we start talking about crawling? The way our data is indexed to be put on the major search engines is immoral. A robot comes and drains all the essence of your site if you don't have a meta tag defined to provide an approximate description, which is not that accurate anyway. So I think the way the internet is conceived is a little bit hostile to our data rights.
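To make the "meta" point concrete, here is a minimal sketch of how a crawler might read a page's meta description, using only Python's standard `html.parser`. The class and function names are hypothetical, and real search engine crawlers are far more involved; this only shows where the short description comes from.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Collects the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # Only the first description tag is kept; later ones are ignored.
        if tag != "meta" or self.description is not None:
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "description":
            self.description = attr.get("content")


def extract_description(html):
    """Return the page's meta description, or None if it has none."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description


if __name__ == "__main__":
    page = '<html><head><meta name="description" content="A personal blog about Unix."></head></html>'
    print(extract_description(page))
```

When the function returns None, an indexer has nothing author-supplied to show and has to derive a summary from the page body itself.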
@nixCraft WTF, there's no end to this shit... 😐
@wurzelmann @nixCraft
Maybe we all need to move to #GeminiSpace and let AI have the old web.
@nixCraft
So sad to see them selling the collective soul

@nixCraft
Edit: As pointed out by others, still delete your answers as a form of protest if you wish. OpenAI may still get the data, but it will harm SO.

Edit 2: welp, looks like that might be off the table either way
https://m.benui.ca/@ben/112396505994216742

to be completely fair, I would be incredibly surprised (and I am trying to be charitable due to lack of concrete evidence) if OpenAI hasn't scanned every single SO question and answer ever made already. This was probably made so they would have ChatGPT answers on popular questions and stuff like that, which of course is still bad

ben 🇵🇸 ui (@ben@m.benui.ca)

Stack Overflow announced that they are partnering with OpenAI, so I tried to delete my highest-rated answers. Stack Overflow does not let you delete questions that have accepted answers and many upvotes because it would remove knowledge from the community. So instead I changed my highest-rated answers to a protest message. Within an hour mods had changed the questions back and suspended my account for 7 days.
@chickfilla @nixCraft
This for sure
@karrbs @chickfilla @nixCraft but now they have made it legal.
@dbread @karrbs @nixCraft I think what you meant to say is that they are actively endorsing OpenAI's involvement 😅, which I guess sure might mean that SO won't go after them, though I don't know if SO has the rights to pursue people on behalf of their users

@chickfilla @nixCraft

That's not the point. Going forward stack overflow will be polluted with a bunch of AI "hallucinated" garbage, where hallucinated means "made shit up in order to produce a plausible answer".

@artemesia @nixCraft well, yes, that's what I meant with my last sentence. The point I was trying to make is that the data collection aspect of this would happen regardless, and if you want to be more cynical about this, there's nothing stopping SO from keeping your data after you delete your account. Though if your answers become unavailable on the site after doing so, that would be a reason to delete them, since it would hurt the site (aside from the obvious reason of not wanting to be associated with SO ofc)
@chickfilla @nixCraft I know, right? Talk about obvious. Does it filter out the toxic snark ;)

@chickfilla

StackOverflow dumps have been available to everyone for a long time.

https://stackoverflow.blog/2022/10/20/introducing-the-overflow-offline-project/


@chickfilla @nixCraft when you post something to Stack Overflow, you are licensing it with a Creative Commons license.

This open license is explicitly meant to facilitate sharing of knowledge and does not require permission from the author.

When someone decides to release content using an open license (which is great), they can't really complain when other people take advantage of said license.

I shared several of my programs as open source software. I won't get mad if people use them.

@lazza @nixCraft Likewise, if I release my contribution out in the open and then I remove it, regardless of whether someone has a copy of it or not, I have the right to do so.

Nobody is arguing they shouldn't, nor that they can't. This is more about boycotting SO. Just because you can do something, it doesn't mean you should, and more importantly, it doesn't mean you can't be criticized for it.

@lazza @chickfilla @nixCraft Creative Commons (aside from CC0) also requires attribution for derivative works. An LLM trained on CC material does not attribute its sources when it’s invoked. So it’s not compliant.

This is simple licence washing, and they get away with it because people let them.

@rubenerd @chickfilla @nixCraft the press release states that:

"This integration will [...] provide attribution to the Stack Overflow community within ChatGPT"

This relates to one side of the agreement (ChatGPT). The other product involved (OverflowAI) has this screenshot on its website.

If this is real, I would argue that attribution is being provided.

@nixCraft

Real Talk: if Stack Overflow dies, we'll all be out of our tech jobs. The most common questions can't be answered by reading manpages.

@phaysis @nixCraft while I agree this would hurt a lot of developers, I don't think it's a healthy mindset to have. If your job depends on Stack Overflow answers to be done right, then you probably do need to spend more time reading manuals.

Sure, most common questions are not directly answered by manuals (and sure, many man pages are not very helpful), but that's because they are not meant for that. Ideally you should arrive at your answers by getting a better understanding of the systems you are trying to work with. It usually takes more time, but it also leads to a more rewarding experience that pays off more in the long term.

Not to say there isn't a place for forums, after all there's times where we don't even know where to start looking, but if your job depends on readily available answers to very specific questions scattered through a site, you might be doing it wrong imho.

@nixCraft

Let's be real, they already did it a long time ago.
This is just to make it "legal".
Their answers are still bad

@nixCraft Or, if you are feeling mischievous, get a group of friends together and invent a nonsense programming language and then ask and answer questions about it that sound right but are utter boffo.
@nixCraft As we all know, #Boffo has had libraries for intra-operability with most major languages since 2.14.
@nixCraft Stack Overflow uses a copyleft license for its answers. Under that same logic, Wikipedians should start deleting the pages they contributed on so that AI can't use them as training data... but then again, that's in the scope of what copyleft is supposed to allow in the first place
@csolisr @nixCraft That makes little sense.
@csolisr @nixCraft Strong or weak copyleft? That makes a big difference.

@csolisr @nixCraft

If you use copyleft content that requires derivative works to be equally licensed to train your AI, your AI is a derived work, so your AI should be distributed under that copyleft.

Are most AIs a massive copyright and copyleft violation? It's pretty much obvious. When will those copyrights and copylefts be enforced? When somebody strong or brave enough decides to sue any of the main AI developers.

Wikipedia and Stack Overflow content included.

@csolisr

You're right. The license is designed so that we cannot revoke the usage of our answers.

Deleting a profile won't do anything.

I ceased contributing to Stack Overflow a long time ago.

The AI thing is just the latest of a long line of enshittification of that platform.

@nixCraft

@nixCraft I am done for, deleting all my posts, gl with that SO
@clot27 pretty sure your posts will still exist internally to be mined by LLM engines.
@nixCraft
@jenesuispasgoth @nixCraft yea but would be harder to fetch ig