Mastodawn

raisincookie May 15, 2024

Everyone should immediately stop contributing to Stack Overflow and its network. The human touch is what made it unique. Delete your profile from SO AND all your answers. Freeloaders are making money out of human contributions.

nixCraft 🐧 May 6, 2024

OpenAI, Mircosoft, and Google together kill the open web. Thousands of independent blogs and forums are now nowhere in search engines or pushed back to page two to support their AI and partnerships with Reddit, StackOverflow, and more. Many humans contributed to these sites hoping to build a knowledge base for humanity, but now greedy people like Sama and OpenAI are taking over everything.

Pajo_16 May 6, 2024

@nixCraft Is there any way to find these sites? You are correct in that they disappeared but there must be a way to find them. Hopefully 🤞

sigi714 May 6, 2024

@Pajo_16 @nixCraft On page two. On page 5 you can find bad SEO optimizing agencies.

ilikecats May 6, 2024

@Pajo_16 @nixCraft Maybe the Internet Archive (https://archive.org)?

Internet Archive: Digital Library of Free & Borrowable Texts, Movies, Music & Wayback Machine

marcus 😷🧂 May 6, 2024

@Pajo_16 @nixCraft I've heard people saying that they're adding "before:2023" to their searches to filter out a lot of the AI garbage. That will only help as long as the search topic is somewhat timeless, obviously.

Pajo_16 May 6, 2024

@marcusdeh @nixCraft
Folks, thanks for all the suggestions. It's appreciated.

P Stewart May 6, 2024

@marcusdeh @Pajo_16 @nixCraft Even that seems shaky these days - I'll try putting date restrictions on searches and regularly get stuff that the results page says is a week old, but actually dates from 2011. (Or vice versa.)

I'm not sure if they're just not respecting search syntax, something's breaking on the search engines' side of things, or if people are figuring out ways to make pages appear to be a different age than they actually are.

Christian Krebel ⁂ May 6, 2024

@Pajo_16 @nixCraft I can recommend the search engine #Kagi. They have a different index, one can put a weight on specific domains and they have a project called small web to randomly find those gems.

P4 May 6, 2024

@ChristianKrebel @Pajo_16 @nixCraft isn't Kagi in on the AI bullshit? I wouldn't trust them not to screw everyone over once they get popular enough.

Christian Krebel ⁂ May 6, 2024

@p4 @Pajo_16 @nixCraft Well, they have integrated AI, but mostly on demand: most of the time you have to trigger an AI feature yourself. They also have an assistant where you can choose the models you want to use (APIs are more anonymous), and the answers come with sources from their index, which is the best of both worlds IMHO.

Dianora (Diane Bruce) May 6, 2024

@Pajo_16 @nixCraft Good old fashioned webrings.

Jeremy Yap May 6, 2024

@Pajo_16 @nixCraft

If you're looking for blogs and other personal sites I recommend bookmarking these search engines:
- https://search.marginalia.nu/
- https://ichi.do/
- https://clew.se/
- https://searchmysite.net/
- https://wiby.me/

Would also recommend checking out this very excellent piece as well for alternative search options: https://seirdy.one/posts/2021/03/10/search-engines-with-own-indexes/

Marginalia Search

search.marginalia.nu is a small independent do-it-yourself search engine for surprising but content-rich websites that never ask you to accept cookies or subscribe to newsletters. The goal is to bring you the sort of grass fed, free range HTML your grandma used to write.

search.marginalia.nu

Third spruce tree on the left May 6, 2024

@jeruyyap @Pajo_16 @nixCraft Support the micro-brew search engines that are trying to make Search Engines Not Suck Again!

(new slogan #SENSA)

Amin Hollon 🏳 May 6, 2024

@jeruyyap @Pajo_16 @nixCraft

Happy to be mentioned! ;)

Clew is very beta at the moment but I've just started my summer break so I should have some good time for dedicated work on it. :)

Paul McBride May 6, 2024

@jeruyyap @Pajo_16 @nixCraft Kagi has a great “small web” search filter too

shellsharks May 7, 2024

@jeruyyap @Pajo_16 @nixCraft I've captured a bunch of search engines and other sites dedicated to exploring the #IndieWeb here https://shellsharks.com/indieweb#explore-the-indieweb

IndieWeb Assimilation

An introduction to the IndieWeb, with a lot of bonus resources. Includes lists of interesting webrings, IndieWeb search engines, slash page directories, hosting platforms, and an assortment of other delightful things from across the human web.

shellsharks

Amin Hollon 🏳 May 6, 2024

@Pajo_16 @nixCraft

Mojeek

Mojeek is a web search engine that provides unbiased, fast, and relevant search results combined with a no tracking privacy policy.

Expert Plus 🍀🔱 May 6, 2024

@nixCraft *Microsoft

Kasion May 6, 2024

@nixCraft
Take a look at #Facebook for 10 minutes and you'll see what a true #AI web looks like. It's a barren wasteland that no one interacts with. Let them have their fun with these sites, knowing it's just going to be bots talking to AI. #google search is what's dead, not the #openweb.

Earl Novy May 7, 2024

@Ryan @nixCraft

What search do you use?

I like(d) DuckDuckGo, but with Bing in the background it is getting worse.

Kasion May 7, 2024

@earl @nixCraft I have been using #searxng which combines results from multiple sources.

Pep May 6, 2024

@nixCraft The smaller, human-run web needs to come back via Mastodon, old-school forums and others. Let the big companies have their AI-powered Dead Internet, primarily away from the rest of us.

Blake Hensley May 6, 2024

@nixCraft this is honestly why I’ve lost interest in #tech, #computers and the #internet as a whole. It’s sad, I used to find it all so intriguing to learn and I used to even make videos about it. I really just don’t care anymore, it feels impossible to find anything organic these days.

Li ~ Crystal System May 7, 2024

@blakehensley @nixCraft i miss when "AI" meant like the way in which a video game character walks around the map or something..

Tech corporations are strip-mining the commons in every possible way. It's despicable 😡

What will they do when they've finished this process? What will be left? It's unsustainable in the long term.

https://en.wikipedia.org/wiki/Surface_mining

Surface mining - Wikipedia

Oliver Vettra-Morrigan May 12, 2024

@nixCraft It is a common problem, the same problem that burnt the Library of Alexandria to ashes: knowledge is made a commodity because it is invaluable, and trading it can be a goose that lays golden eggs. The point is that destroying open knowledge will not stop knowledge from multiplying, because of the works placed under permissive licenses like Arte Libre, Creative Commons and many others. Do you think there isn't a heavy lobby for banning permissive licensing? Sure there is. But on the other hand there is a heavy lobby to make knowledge commonly available to the masses. And this is complex, because every single bit of data that we put on the internet is subject to data mining, business intelligence and many other things that may violate the good faith of permissive licensing.
Shall we start talking about crawling? The way our data is indexed for the major search engines is immoral. A robot comes and drains all the essence of your site and, if you don't have a meta description defined, produces an approximate description that is not that accurate. So I think that the way the internet is conceived is a little bit hostile to our data rights.

Wurzelmann May 6, 2024

@nixCraft WTF, there's no end to this shit... 😐

Sector9 May 6, 2024

@wurzelmann @nixCraft
Maybe we all need to move to #GeminiSpace and let AI have the old web.

Keev May 6, 2024

@nixCraft
So sad to see them selling the collective soul

drevil May 6, 2024

@nixCraft
Edit: As pointed out by others, still delete your answers as a form of protest if you wish. OpenAI may still get the data, but it will harm SO.

Edit 2: welp, looks like that might be off the table either way
https://m.benui.ca/@ben/112396505994216742

to be completely fair, I would be incredibly surprised (and I am trying to be charitable due to lack of concrete evidence) if OpenAI hasn't scanned every single SO question and answer ever made already. This was probably made so they would have ChatGPT answers on popular questions and stuff like that, which of course is still bad

ben 🇵🇸 ui (@ben@m.benui.ca)

Stack Overflow announced that they are partnering with OpenAI, so I tried to delete my highest-rated answers. Stack Overflow does not let you delete questions that have accepted answers and many upvotes because it would remove knowledge from the community. So instead I changed my highest-rated answers to a protest message. Within an hour mods had changed the questions back and suspended my account for 7 days.

benui mastodon instance

Karrbs May 6, 2024

@chickfilla @nixCraft
This for sure

dbread May 6, 2024

@karrbs @chickfilla @nixCraft but now they have made it legal.

drevil May 6, 2024

@dbread @karrbs @nixCraft I think what you meant to say is that they are actively endorsing OpenAI's involvement 😅, which I guess might mean that SO won't go after them, though I don't know if SO has the right to pursue people on behalf of its users

Artemesia May 6, 2024

@chickfilla @nixCraft

That's not the point. Going forward stack overflow will be polluted with a bunch of AI "hallucinated" garbage, where hallucinated means "made shit up in order to produce a plausible answer".

drevil May 6, 2024

@artemesia @nixCraft well, yes, that's what I meant with my last sentence. The point I was trying to make is that the data collection aspect of this would happen regardless, and if you want to be more cynical about this, there's nothing stopping SO from keeping your data after you delete your account. Though if your answers become unavailable on the site after doing so, that would be a reason to do it, since it would hurt the site (aside from the obvious reason of not wanting to be associated with SO, ofc)

cognitively accessible math May 6, 2024

@chickfilla @nixCraft I know, right? talk about obvious. Does it filter out the toxic snrk ;)

foxy May 6, 2024

@chickfilla

StackOverflow dumps have been available to everyone for a long time.

https://stackoverflow.blog/2022/10/20/introducing-the-overflow-offline-project/

Introducing the Overflow Offline project - Stack Overflow

Andrea Lazzarotto May 6, 2024

@chickfilla @nixCraft when you post something to Stack Overflow, you are licensing it with a Creative Commons license.

This open license is explicitly meant to facilitate sharing of knowledge and does not require permission from the author.

When someone decides to release content using an open license (which is great), they can't really complain when other people take advantage of said license.

I shared several of my programs as open source software. I won't get mad if people use them.

drevil May 6, 2024

@lazza @nixCraft Likewise, if I release my contribution out in the open and then I remove it, regardless if someone has a copy of it or not, I have the right to do so.

Nobody is arguing they shouldn't, nor that they can't. This is more about boycotting SO. Just because you can do something, it doesn't mean you should, and more importantly, it doesn't mean you can't be criticized for it.

Ruben Schade 🔰 🇦🇺 May 8, 2024

@lazza @chickfilla @nixCraft Creative Commons (aside from CC0) also requires attribution for derivative works. An LLM trained on CC material does not attribute its sources when it’s invoked. So it’s not compliant.

This is simple licence washing, and they get away with it because people let them.

Andrea Lazzarotto May 8, 2024

@rubenerd @chickfilla @nixCraft the press release states that:

"This integration will [...] provide attribution to the Stack Overflow community within ChatGPT"

This relates to one side of the agreement (ChatGPT). The other product involved (OverflowAI) has this screenshot on its website.

If this is real, I would argue that attribution is being provided.

Folix May 6, 2024

@nixCraft nooooooooooo

Internet Rebel - ITPII May 6, 2024

@nixCraft Whut

ShawnT 🔧🐀 May 6, 2024

@nixCraft

Real Talk: if Stack Overflow dies, we'll all be out of our tech jobs. The most common questions can't be answered by reading manpages.

drevil May 6, 2024

@phaysis @nixCraft while I agree this would hurt a lot of developers, I don't think it's a healthy mindset to have. If your job depends on Stack Overflow answers to be done right, then you probably do need to spend more time reading manuals.

Sure, most common questions are not directly answered by manuals (and sure, many man pages are not very helpful), but that's because they are not meant for that. Ideally you should arrive at your answers by getting a better understanding of the systems you are trying to work with. It usually takes more time, but it also leads to a more rewarding experience that pays off more in the long term.

Not to say there isn't a place for forums; after all, there are times when we don't even know where to start looking. But if your job depends on readily available answers to very specific questions scattered through a site, you might be doing it wrong imho.

jamesTown May 6, 2024

@nixCraft

Let's be real, they already did it a long time ago.
This is just to make it "legal".
Their answers are still bad

Matt 🔶 (LordMatt) May 6, 2024

@nixCraft Or, if you are feeling mischievous, get a group of friends together and invent a nonsense programming language and then ask and answer questions about it that sound right but are utter boffo.

Matt 🔶 (LordMatt) May 6, 2024

@nixCraft As we all know, #Boffo has had libraries for intra-operability with most major languages since 2.14.

Carlos Solís May 6, 2024

@nixCraft Stack Overflow uses a copyleft license for its answers. Under that same logic, Wikipedians should start deleting the pages they contributed on so that AI can't use them as training data... but then again, that's in the scope of what copyleft is supposed to allow in the first place

soc May 6, 2024

@csolisr @nixCraft That makes little sense.

Larry Garfield May 6, 2024

@csolisr @nixCraft Strong or weak copyleft? That makes a big difference.

Curioso 🍉 🇺🇦 (jgg) May 6, 2024

@csolisr @nixCraft

If you train your AI on copyleft content that requires derivative works to be equally licensed, your AI is a derived work, so your AI should be distributed under that copyleft.

Are most AIs a massive copyright and copyleft violation? It's pretty much obvious. When will those copyrights and copylefts be enforced? When somebody strong or brave enough decides to sue any of the main AI developers.

Wikipedia and Stack Overflow content included.
