There has absolutely not been a better opportunity in this century to build a successful new search engine for the internet. The dominant players are running full speed to make their offerings worse, and that’s after *years* of complaints that Google’s results are decaying.
@anildash opportunity arrives dressed as a set of complacent market participants
@anildash It's one of the few problem spaces that tempt me to gather a team to make sw. So far, building machines in my garage has kept me distracted from the fray. *whew*
@anildash Just the other day I was daydreaming about how Glitch could enable the niche-iest of niche search engines by providing a sweet query library and support for an open crawler's index.
@anildash no, cloudflare and other anti-bot products make it largely impossible for new entrants. Google/bing crawlers are whitelisted
@stephenjudkins @anildash or it’s the other way around: SEO spam of both human and, increasingly, AI origin makes crawling a rather pointless exercise, which is what we’re seeing.

@anderseknert @stephenjudkins @anildash This. I am fairly sure search is kinda done for. We'll have to go to the library/curated places for information. Oh, and stores will have to become curated too, sell relatively few products that they actually know something about.

Good news on many levels.

Will Chatbots Kill the Search Star?

Everyone it seems is playing with chatbots, and search engines are looking...

@Mojeek @thelovebing @stephenjudkins @anildash this isn’t about search engines incorporating AI but about AI creating SEO optimized crap content all over the web, rendering crawlers useless for finding anything of value.
@anderseknert @thelovebing @stephenjudkins @anildash Understood. It's about both. Both are eroding the web. A few closed "brains" are the next assault on the web, building on and following the walled-gardens assault.
@Mojeek @anderseknert @thelovebing @stephenjudkins @anildash whoever stays a:
"Here's a list of places relevant to your search query where you may find the answer" - search engine and not a:
"This is a story all about how your query got twist-turned upside down and I'd like to show an answer just sit right there it might not be fully accurate but I don't care" Chatty AI system wins my loyalty to be honest
@paul @anderseknert @Mojeek @stephenjudkins @anildash Yeah, but with the Internet filled to the brim with (AI generated) SEO shite it really doesn't matter, that's my point. Chances of finding something useful in a pile of dung are slim regardless of what shovel you use.
@thelovebing @anderseknert @stephenjudkins @Mojeek @anildash You're very right, but as soon as I sense AI generated content (normally after a couple of sentences) I go back and mentally block that base url

I'm happy to put in that extra load for my own wellbeing.

Hmmm... those sites should have a "bot" icon in search results perhaps 🤣
@thelovebing @anderseknert @stephenjudkins @anildash that means crawlers are over. Good. Federation, publish and subscribe, syndication, webs of trust can enable better search than they ever could have.
@stephenjudkins @anildash Somehow @Mojeek does it anyway.
@older @stephenjudkins @anildash it is not easy, but crawling is a smaller challenge than indexing/ranking billions of pages. Anyway, we are whitelisted on Cloudflare and many other WAFs
@anildash I agree, but it's also important to discuss how the web has become so much harder to search properly. Content mills, click farms, and soon LLM generated content, create an extremely hard 'wheat from chaff' problem.
@scottjenson @anildash only a couple hours ago I was searching for the answer to something via Google and the top several choices were all SEO weaponized word salad where I had to scroll through forty CPU-crippling ad-embeds to get to the actual poorly-written-keyword-stuffed answers … finally found a straightforward post about ten choices down in the results, with the answer in the first paragraph.
They created a human-centipede-ouroboros of information.
@Andrewhinton @scottjenson @anildash This is why most of my queries regarding issues end with reddit now as a keyword. Even Quora used to be good at finding good solutions. But their ranking system ruined how people started writing.
@scottjenson @anildash the way i see it is that it isn't as hard of a problem if you consider how much of the internet just doesn't need to be indexed anymore. tank or even refuse to index anything with an excess of trackers or just too much javascript and you've weeded out more than half the trash. this would be easy to circumvent, but then it would undo the capitalist machinations of those sites to begin with, so they won't. I think anything that gets users will be like the fediverse; a more boutique solution that will work for the users it works for
@chrisisgr8 @anildash Until you actually build a search engine, I gently suggest you avoid starting statements with "it isn't that hard of a problem"
@scottjenson Lol that's a good point. Nonetheless, I think the idea that a search engine doesn't necessarily need to see the entire internet at this point would be one good starting point

@chrisisgr8 @scottjenson @anildash I think that still represents a considerable amount of the #web to crawl, even once the chaff is removed.

But that would certainly be interesting. The #SearchEngine would /need/ to be offered non-commercially, because anything else guarantees eventual #enshittification.

@scottjenson @anildash and that's when they'll come up with a premium subscription service to give you a 'pure' search experience

@Nick @anildash

No, the search engines really want to give you good content! It's just that there is just 100x more crap out there (due to the influence of ads)

The secret to a better search engine is to figure out a better web: an alternative to the race to the bottom hell that comes from ad clicks (waves hands)

@anildash I am thoroughly enjoying paying for search, though I realize it's a privilege. Neeva is an excellent option.
@anildash I’ve used DuckDuckGo exclusively for more than a decade and have no complaints. Also use their browser if I’m on an Android device. I don’t see DDG racing to ChatGPT-ifying their service. What sorts of features do you expect from a new search engine?
@nadezhda04 @anildash Read somewhere recently their main results output comes from Bing? If Bing is throwing itself into the toilet I worry DDG will simply pass that through.
@anildash here’s hoping. Massive disruption is about to break open. Great time to do this!
@anildash Hoping for an epic comeback from Altavista.
@anildash I'm a little confused. Maybe I'm misunderstanding the alternatives. I use DuckDuckGo and have been really happy with it. Is that still relying on one of the "big players" under the hood?
@anildash it takes years to do that! But Neeva may have perfect timing. The product was pretty good last I looked.
@anildash Hey Bing, how many people live on the moon?
@anildash if Yahoo was ever going to make a comeback, now’s the time

@anildash
Google: Meh. Chrome dominates the schools, screw that death spiral.

Bing: Skeptical of the AI search engine; I hate corporate oligarchs. F*ck them.

Opera: Considering. May choose regular, but also try GX edition.

Mozilla Firefox: Considering, but previous experiences (outside of that MS Vista laptop I attempted to restore) were slightly dysfunctional, to put it some way.

Tor: Now THAT'S the dark web!

Anything else I should put on the list? I'd like to expand my browser horizons.

@anildash at the same time - in our ghoulish world of race-to-the-bottom capitalism, is there truly a market for such a thing?

like the obvious answer is YES but keep in mind, the search engines are shit because the EXPECTATION is that the relentless march of SEO will squeeze more profit out of end users but the REALITY is that the content is being generated by algorithms TO SATISFY OTHER ALGORITHMS

we are not the market for search engines; marketing is the market for search engines

@anildash like who's going to pick up on the sales pitch "we're going to make a search engine that makes it easier to actually find the answers people want, without all the SEO garbage, and we're going to compete with Google"? every piece of that pitch is infinitely appealing to US, the people who have to suffer through our broken search engine ecosystem, but not to the money people who want "eyeballs" and "exposure," for whom the actual UTILITY of search engines is a non-consideration

@anildash and there's a final thing to consider:

how exactly will Google and Microsoft respond to the existence of a competitor who threatens to upend their self-devouring search engine ecosystem with a product that, you know, actually WORKS?

it wouldn't even need to be a THREAT (god knows Google and Microsoft would win by default in any competition of brand recognition, even if Bing is a distant second); they have the wherewithal to annihilate any POTENTIAL competition without issue

@anildash like i'm sorry to be doomer and defeatist on this but we're seeing, on this front, the inevitable end result of capitalism in this space

tech designed not for people, but for shareholders

@anildash I switched to Kagi full time about 6 months ago and haven’t looked back. Tried you.com before that. DuckDuckGo has never had a good UI for my brain.
@dsully @anildash They describe themselves as being “snappy” 
@anildash Agreed. DuckDuckGo runs off of Bing and Startpage off Google. My understanding is that Brave Search runs their own search for the most part. I've been using it for months and I rarely feel the need to check with the other search engines.
@anildash The perverse incentives of advertising and spam make it so we can’t have nice things.
@anildash My favorites are Searx, Kagi, Neeva, and BlogSurf.
@katmmoss Just checked out blogsurf and it is awesome! Thanks for the recommendations.
@katmmoss How do you make Searx functional? It seems like an ideal solution, but it fails the simplest of searches. The different engines will report various errors such as "timeout", "too many requests" and "CAPTCHA failed"
@IonicDriver unfortunately, yes. Because the majority of the mainstream search engines require those sorts of things, and the application is unable to solve them.
@anildash I really wanted DDG to be the one, but now it's frustrating to use: it ignores words and suggests results that have nothing to do with my search. There is a need for a better engine.

@anildash Why do you think it's a solvable problem?

Once a search engine catches on, the reward for gaming it becomes functionally infinite.

(There's also the AWS line that any online service is really just a distributed denial of service attack you asked for. The minimum resources required are not small.)

What we're seeing is an unstable system that pushes any successful search engine -- meaning it works well enough that people use it -- from "cooperate" to "defect" because it is used.

@anildash If you want a stable system, you have to build something where "defect" isn't an option.

There's no way to do that with a global distributed system that's free at the point of use.

You can conceivably have a pay-to-use index, but that's going to start involving a quality measure and algorithmic quality measures aren't available. (The proxies are what gets gamed.)

@anildash Some competitors decided to copy Google and nothing more. So, today we have Google and a bunch of other, slightly worse Googles.
@anildash If you don't mind scrolling through seven pages of ads, they kill it on page 8.
@anildash maybe Neeva? But I don't think their business model makes any sense

@anildash Something I think about is whether the search mode we learned and retain as deep muscle memory — remembering idiosyncratic fragments and hoping to be guided back to them — is now obsolescent, and a decade of decline has led to users expecting fuzzy mediocrity from web search. And of course, any new product has to meet users where they are in order to succeed.

(‘Made it through AltaVista’ mode should always be there as an option, though.)

@anildash My archivist head is thinking of the morass of search in 97-99 before Google asserted itself, but it’s almost purely social history. You can describe the terrible search experience and terrible browser experience in those years but it’s so hard to demonstrate it. (To me, the evolution of late-90s personal websites into early linklogs / blogs reflects a context of really crappy web search and the ever-increasing pain of site redesigns.)