@strypey @zeh @alcinnz

Indeed mojeek.com is not the only non-tech-giant #searchEngine (+crawler). From strypey’s mentions + some of my notes:

* #Mojeek ← does their own crawling
* #Metager.org ← does their own crawling
* #SearchMySite.net ← avoid (Cloudflare)
* #Searx ← just proxy software, many instances
* #4get ← another proxy software, about a dozen instances: https://4get.ca/instances
* #Gigablast ← does their own crawling, but what happened?.. they were dissolved last year & seem to now be www.alltheinternet.com
* #Ombrelo ← a proxy but more advanced than the others (filters/downranks Cloudflare sites)

#YaCy is notable because it’s a crawler that you can install and operate yourself. YaCy instances can be public-facing and they can also share indexes with each other fedi style apparently. Some Searx instances tap YaCy instances.

I would love to find a searx or 4get instance that rejects the tech giants, but aggregates from YaCy, mojeek, gigablast, metager, marginalia.nu, frogfind.com, & wiby.me.

And I would love it even more if it would make replacements:

* #StackExchange → #AnonymousOverflow
* #YouTube → #Invidious
* #Medium.com → scribe.rip
* #BBC → BBC’s onion site
* #NYTimes → New York Times’s onion site
* etc.

search.fabiomanganiello.com makes some of those replacements.
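
A minimal sketch of what such replacements could look like when applied to result URLs, assuming plain host swaps are enough. The `FRONTENDS` table, the function name, and the choice of yewtu.be as the Invidious instance are illustrative assumptions; this is not how search.fabiomanganiello.com or any Searx/4get instance is known to do it.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping from original hosts to alternative front ends.
# The right-hand hosts are assumptions chosen for illustration.
FRONTENDS = {
    "medium.com": "scribe.rip",
    "youtube.com": "yewtu.be",
    "www.youtube.com": "yewtu.be",
}


def rewrite_result_url(url: str) -> str:
    """Swap the host of a result URL if an alternative front end is listed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    replacement = FRONTENDS.get(host)
    if replacement is None:
        # No known alternative: leave the result untouched.
        return url
    # Keep path, query string and fragment; only the host changes.
    return urlunsplit(("https", replacement, parts.path, parts.query, parts.fragment))
```

Onion-site replacements (BBC, NYTimes) would need their own entries and possibly path changes, so they are left out here.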

Introducing the redesign and first major open source contributor

The searchmysite.net redesign has been launched, thanks to its first major open source contributor Lucas Gramajo.

@zeh @alcinnz I've been experimenting with a few search engines recently, including Mojeek, Metager.org, SearchMySite.net and the Searx instances at Monocles.de.

I've also figured out how to configure my Firefox-based mobile browser to search for videos directly on YewTube (an Invidious instance), and for green tech info on Appropedia.org, and I plan to add more sites to this (e.g. search.joinpeertube.org).

#Search #SearchEngines #Mojeek #Metager #SearchMySite #Searx #Monocles

As it turns out: limiting a query's results based on a condition that depends on previous results, without breaking a potential paging algorithm, is hard …

/me wonders how #searchmysite solved that problem …

EDIT: … Searchmysite uses Apache Solr, which can already do this
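
The post doesn't say which Solr feature is meant. One plausible reading, sketched here as an assumption, is Solr's result grouping: cap the number of hits per group (e.g. per site) while `start`/`rows` page over groups rather than individual documents. The `domain` field name and the function name are hypothetical; `group`, `group.field`, `group.limit` and `group.ngroups` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def grouped_query_params(query, page, per_page=10, per_group=2, group_field="domain"):
    """Build Solr select parameters for paging over grouped results.

    With grouping enabled, start/rows count groups, so the per-group cap
    does not shift page boundaries. group_field="domain" is an assumed
    schema field, not taken from searchmysite's actual schema.
    """
    return {
        "q": query,
        "group": "true",
        "group.field": group_field,   # group hits by this field
        "group.limit": per_group,     # at most this many docs per group
        "group.ngroups": "true",      # report total number of groups for paging
        "start": page * per_page,     # offset in groups (page is zero-based)
        "rows": per_page,             # groups per page
    }


# Example: query string for the third page (zero-based page 2).
query_string = urlencode(grouped_query_params("solar panels", page=2))
```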

Do you know Searchmysite?

It's an alternative search engine for personal and independent websites.

https://blog.getprivacy.it/searchmysite

#searchmysite #indiesearch #motoridiricerca #sitipersonali #hobby

Searchmysite

searchmysite.net is an alternative search engine for personal and independent websites. It is open source and was created as a hobby by...

Lost in privacy
Been a while, but I've just posted a short #searchmysite progress update: https://blog.searchmysite.net/posts/progress-update-q1-q2-2021/ . Short summary is that I've been reviewing submissions daily, fixed some minor bugs, and the system has been stable, but I've not had chance to do any enhancements. One interesting thing I've noticed is that I hardly get any hits from Google: anyone have any idea why?
Progress update Q1 & Q2 2021

This is just a quick update on progress since the last post on 30 Jan 2021.

I've summarised how much searchmysite.net (the open source search engine and search as a service) has cost over the past 6 months, and estimated how much it may cost to keep going in future: https://blog.searchmysite.net/posts/searchmysite.net-the-delicate-matter-of-the-bill/ Short summary: it is (perhaps) surprisingly expensive to run a search engine and search as a service. The good news is that there is a plan to cover costs (without resorting to advertising). Let's see if it works. #searchmysite
searchmysite.net: The delicate matter of the bill

searchmysite.net is a bootstrapped side-project, so currently receives no external funding. This post contains a quick review of current and expected future running costs, along with a summary of the plan to pay these running costs.

I decided to expand my last toot about advertising and search engines into a full blog post: https://blog.searchmysite.net/posts/advertising-and-search-engines-when-it-is-okay-to-mix-and-when-it-is-not/ #searchmysite
Advertising and search engines: When it is okay to mix and when it is not

One of the key points in my last post was the suggestion that there should be a search engine which downranks pages containing adverts. While this attracted a lot of positivity, it also unfortunately got some negativity, so I thought I'd write a quick post to clarify.

searchmysite.net is now open source: https://blog.searchmysite.net/posts/searchmysite.net-is-now-open-source/ . Post includes: Why aren’t other search engines open source? What open source licence is it? What are the future plans? #searchmysite
searchmysite.net is now open source

The source code is at: https://github.com/searchmysite/searchmysite.net/. Topics in this post include: Why aren't other search engines open source? What open source licence is it? and What are the future plans?