#searchengines #selfhost

I got tired of putting my trust in search engines that run on somebody else's machine.

DuckDuckGo blocks all trackers, but it silently whitelists Microsoft's (because of its commercial agreement with Bing), and they failed to mention it until they were caught with their hands in the cookie jar.

Startpage is Dutch, it's notoriously sensitive about privacy and it's been around for longer than Google, but it has now been acquired by an ads company and I don't see things going well for it.

Brave claims to be the ultimate solution for privacy, except when they run a crypto miner in your browser so they can make a bit of extra money on the side.

So I've decided to take even this matter into my own hands and run my own search engine. You can access it at https://search.fabiomanganiello.com. It runs SearXNG in a Docker container on one of my servers at home. It's a bit slower than the major search engines, but not by much, and that's a small price I'm happy to pay for freedom.
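For anyone who wants to replicate the setup, a minimal sketch of what it might look like (the image name and port are the upstream defaults at the time of writing; the volume path is an assumption, and you'd want a TLS-terminating reverse proxy in front):

```yaml
# docker-compose.yml — minimal SearXNG deployment sketch
version: "3.7"
services:
  searxng:
    image: searxng/searxng:latest
    container_name: searxng
    ports:
      - "127.0.0.1:8080:8080"     # bind locally; expose via reverse proxy
    volumes:
      - ./searxng:/etc/searxng    # holds settings.yml
    environment:
      - SEARXNG_BASE_URL=https://search.fabiomanganiello.com/
    restart: unless-stopped
```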


@blacklight Some searx instances are fed by a locally running #yacy which does the crawling, so you can also get a greater degree of independence from the giants. I noticed a lot of your results grab from both DDG & Qwant. Both DDG & Qwant are MS syndicates, so you could drop one of them and replace with #Mojeek or #Gigablast (which do their own independent crawling).

@koherecoWatchdog this is a trade-off that I'm very happy to discuss.

I've thought of dropping DDG (and maybe Brave) results entirely. I'm a big fan of Mojeek, but there's a balance between privacy purism and relevance of results that I'm still trying to strike here.

As an example: a search of my name on Mojeek won't return any links to my content on LinkedIn, Medium, or CF-powered websites like Hackernoon, dev.to, IoT4All, BetterProgramming. Nor any links to my books (because they are sold on the likes of Amazon, eBay, ebooks.com or springer.com), nor any links to my music (because, outside of Bandcamp, it's on the likes of Spotify, Tidal etc.), nor any links to my past talks (because they are mostly on YouTube).

Yes, it reports the links to my self-hosted websites and my apps on F-Droid, but those are only a small fraction of the content about me on the web. And the same considerations that apply to searching my name apply to any other search that a user may want to perform. Even the most privacy-aware user wouldn't want to use a search engine that shows them only a fraction of the web. It should probably be their choice if they want to click on a result hosted on Amazon or Medium. The search engine can ensure that their privacy is protected and that it doesn't collect any private data about users (or data that could associate users with queries), but IMHO it shouldn't omit results that may be relevant for what the user actually needs to do.

If users can't find what they're looking for, they'll just fall back on another search engine, which defeats the whole purpose of running an alternative engine. And, if users fall back on another search engine too often, they'll eventually just stop using yours - a point that @thelinuxEXP made quite well in a recent video.

@blacklight
I see no trade-off if you include Mojeek results in /aggregate/ with other indexes. Doesn’t searx do a round-robin on results from the various sources? I’m glad to hear #Mojeek is coming up short on #Cloudflare sites -- that’s a very rare feature for privacy enthusiasts. Most self-proclaimed privacy-focused search services have failed to evolve beyond controlling their own privacy abuse & fall short of privacy-respecting results.
@thelinuxEXP
@blacklight
I don’t believe this for a second: “Even the most privacy-aware user wouldn't want to use a search engine that shows them only a fraction of the web.” Seeing the wide open firehose of all possible web results is exactly what privacy-ambivalent users want. Privacy enthusiasts are burned out on seeing the garbage the web has become. IIRC a recent study showed that ~70k websites out of ~80k were sites clusterfucked w/js-trackers.
@thelinuxEXP

@blacklight
The primary job of a good search tool is to filter out (or down rank) the results you don’t want.

Searx instances that show Bing or Google results are a dime a dozen (regardless if they src from a syndicate). There’s nothing special about your instance or any other searx instance if it only proxies the giants. What makes a search svc stand out is
1→ unique sources (yacy, mojeek, etc), or
2→ unique filtering (anti-CF or anti-bloat).
@thelinuxEXP

@thelinuxEXP @blacklight If you’re not keen on offering privacy-respecting /results/, then another way to be distinguished from the rest is to offer bloat-free results. search·marginalia·nu and wiby.me both do that, but they’re both tor-hostile so I don’t use them. If a searx instance were to proxy those two I would bookmark it and use it. BTW, #exalead is another true search engine (in-house index)

@koherecoWatchdog @thelinuxEXP in Searx/SearXNG it's quite easy, even on the client side, to customize which engines you want to query. So if a user is bothered by the results returned by Bing or DDG, they can simply turn those engines off and turn on only Mojeek/Gigablast instead. I would also love it if CF filtering were available as a feature that the client can toggle as they like.
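On the admin side, the same toggles can be pre-seeded as instance defaults in settings.yml; a sketch (engine names are illustrative — verify them against the engine list shipped with your SearXNG version):

```yaml
# settings.yml fragment — default engine selection for the instance.
# Users can still override these in their own preferences page.
engines:
  - name: duckduckgo
    disabled: true      # off by default; clients can re-enable
  - name: qwant
    disabled: true
  - name: mojeek
    disabled: false     # on by default
```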

But, as admins, we shouldn't be too opinionated and make those decisions for the user. My priorities for a search engine are:

1. It should not track you, nor collect any data about you, nor show you unsolicited sponsored content.

2. It should be functional for its purpose - which is to return what the user is looking for in most of the cases.

A search engine like the one that you describe probably won't return results from more than half of the Web, and it's hard to tick the "be functional" box if you filter out so many results.

As a developer, a search engine that downranks results from StackOverflow, or doesn't return technical articles from Medium, dev.to or Hackernoon, would be almost useless.

A search engine that doesn't return (or downranks) results from the Guardian, the Economist or the CNN (or even Fox) would provide a very incomplete view of the world when you search for news.

A search engine that doesn't return results from LinkedIn or Glassdoor won't be of much help to someone who is looking to hire or be hired.

A search engine that doesn't return results from any major sources for online shopping or reviews won't be very helpful if I want to buy a new product.

My point is that even the most privacy-enthusiast user sometimes needs to access websites that aren't 100% privacy respecting, because some content that they need may be only available on those websites.

And if my search engine doesn't return those results, then people will just go to another search engine that may have those results, but is likely to be more privacy-invasive than mine. And that's exactly what I want to avoid: the purest search engine is literally of no use if people can't get relevant results out of it for their day-to-day activities.

@blacklight @thelinuxEXP Ah, right I forgot about the client side engine toggles. I think you’re using the term /relevancy/ different than the industry. E.g. Google does not determine relevancy purely as a function of raw content & search query. If a website has a non-white background, Google drops its relevancy. Relevancy is about knowing your audience. Cloudflare sites are irrelevant to privacy seekers.
@thelinuxEXP @blacklight Google’s audience is mainstream avg Joes who don’t give a shit about privacy, who use defenseless browsers (which work on most websites), who don’t want background color. If you simply run with Google’s rankings, you are also catering for those users. While privacy enthusiasts are not served well with that ranking b/c we have defensive browsers w/js disabled & a Tor IP.
@blacklight @thelinuxEXP The “It should be functional” box is most certainly not ticked when ¾ of the top ranking results lead to CF-pushed CAPTCHAs & sites so littered w/js they are dysfunctional in a secure browser. Those results /are/ irrelevant to privacy enthusiasts b/c we need sites that function, which don’t snoop on us. The privacy-hostile results are time-wasting pollution to privacy seekers
@thelinuxEXP @blacklight My workflow is to click down the list and open ~4—10 tabs & then run through the junk & hit control-w on the dysfunctional/garbage sites to get down to 1 or 2 that are fit. The search service should be doing that for me. Why doesn’t it? It’s because privacy seekers are in such a minority that no search service (except #Ombrelo) is willing to serve such a small audience.
@blacklight @thelinuxEXP Ombrelo saves me time because CF sites are treated as irrelevant. It still gives tor-hostile results though, so I have to go back & click on favicons to get mirrored versions of some sites. But #Ombrelo is the king of privacy respecting search because it knows the needs of the audience & no other search service has put user needs above Google & Microsoft.
@thelinuxEXP @blacklight If you were to design a search service that caters for privacy seekers, your userbase will of course shrink dramatically because we are a small marginalized group.

@koherecoWatchdog @thelinuxEXP "privacy seekers" is an umbrella term for people that fall on a wide spectrum. On one end, you have those who simply click on "Reject cookies", or use the browser's incognito mode, and they're fine with it. On the other end, you have people like you who uncompromisingly shape their whole online experience around the idea of absolute privacy. And you have a lot of shades of gray in between.

When I take decisions on how to shape the public services that I host, I try to aim at the middle point in this spectrum - the guy who wants no ads, no trackers and no bloat, and as little JS as possible, but who doesn't mind reading news articles on a major outlet, or following artists on Spotify, or discovering companies and co-workers on LinkedIn, or searching for programming questions on StackOverflow, or reading blogs on Medium. These users may decide to surf these websites like a guy who wears four condoms, just in case (with a DNS sinkhole, ad/tracker blocker, Tor, NoScript, some alternative client for those services, or all of these solutions), but they may still be interested in content that is only available on these platforms, and leaving out results from these platforms will lead to a bad experience, functionally speaking. The Web is powerful only when it doesn't get too opinionated about what results should reach the end user.

I try to also give freedom of configuration to those in the "privacy seekers" spectrum who are more uncompromising. For example, users can disable search engines that they don't want in their results, and I would also be happy to work on a PR for Searx/SearxNG to implement a user option to disable CF results if there's enough interest. But these settings shouldn't be the default, nor should the service be too opinionated and impose them on the user. If I do so, I may gain the trust of the more uncompromising users, but I will lose all the others in the middle of the spectrum. And those in the middle of the spectrum are likely to just go back to whatever crappy search engine they were using before, so they won't be better off.

If those who are more uncompromising decide that they won't get onboard with a solution just because it doesn't follow exactly their idea of how the Internet should work, then we'll keep getting fragmentation, forks and endless discussions, instead of creating solutions that appeal to the highest possible number of users and can cause a real dent in Big Tech's numbers.

When designing for privacy, I believe in striking reasonable trade-offs between providing a service that appeals to enough users (because function-wise it is similar to what they were using before, so migrating doesn't come with huge costs) and privacy purism. If you're too purist and pretend that any website that is somehow touched by Big Tech doesn't exist, then you lose users. In many cases, those users will just keep using Google, Bing, DDG or whatever, so they won't be any better off. If we push people away with our purism, we may be able to build our little happy bubble where Big Tech is not allowed in any form and shape, but we won't be able to build a convincing case for others to join us. And that really doesn't align with my mission - which is to raise the privacy level for as many people as possible, not to provide a privacy purist solution only for a niche.

@blacklight
I would not say you’re anywhere near the middle of that spectrum. At the most flimsy side of the range you have #DDG, #startpage, & Qwant, which deliver privacy-abusing results & also financially feed surveillance capitalists (#MCAG) by way of ad revenue, by paying for API access & maintaining link rankings, sending traffic where those tech giants intend.
@thelinuxEXP
@blacklight
Your #searx instance slightly improves on that by:
① avoiding ad revenue
② scraping results instead of paying tech giants for API access (guessing, as there is no mention of MS/Google relationships)
③ giving a cached link
Your service moves the needle in the right direction by like ~15%; no different than other searx instances.
@thelinuxEXP
@blacklight
Your instance has a looong way to go:
① No onion URL for the search page
② No onion URL replacements (nytimes·com should be replaced w/nytimesn7cgm…onion)
③ No clearnet URL replacements (medium·com should be replaced w/scribe.rip; mojeek does this but mojeek must be turned on & even then scribe.rip is ranked lower than startpage’s medium·com)
④ No option to filter out CF results or even downrank them
@thelinuxEXP
@blacklight
⑤ Cloudflare results are not even red-flagged
⑥ The rankings favor privacy abuse (e.g. the very 1st result & many shortly below give 403/462 to Tor users, several Amazon shopping links often appear on the 1st page)
⑦ Cached links are offered for sites that are archive.org-hostile, like Quora. Those sites should at least be downranked and there should be an archive.ph link instead, as well as a red-flag to inform users of the problem.
@thelinuxEXP
@blacklight
⑧ The source code and issue tracker directs users into MS’s walled garden of #Github. There is no in-band way for users to report search issues.
⑨ The “settings→engines” tab incorrectly lists DDG, Qwant, & Startpage as “engines”. They are not. They are /syndicates/. Calling them engines alongside Mojeek & Gigablast is not just misinfo but it’s also an injustice that encourages users to regard syndicates as equals with engines.
@thelinuxEXP
@blacklight
The tab should be called “data sources” or “search services”, & thereunder should be a section for “engines” which only lists search *engines*. A section below that should be “meta-search services” (or alternatively a label that Facebook has not hi-jacked). In that section it should be clear that DDG & Qwant are MS syndicates, and that Startpage is a Google syndicate.
@thelinuxEXP
@blacklight
For whatever reason there is a widespread attitude among normies that meta-search services are somehow inferior to search engines (like whisky snobs who think single malts always trump blends). #Searx should take advantage of that & correctly tag the MS/Google-supporting syndicates with their true nature.
@thelinuxEXP
@thelinuxEXP @blacklight There are many more things that can be done to move a search engine away from privacy theatre & closer to a meaningful level of privacy, but I stopped at ⑨. Most of the 9 points could be addressed w/out users having to put privacy above the instant gratification of seeing privacy-abusing results.
@blacklight @thelinuxEXP I also think you can separate privacy-respecting results without losing the undisciplined users by showing a vertical split: privacy-respecting results on the left, & privacy-hostile results on the right, giving two rankings.

@koherecoWatchdog @thelinuxEXP

1. Onion results: they would be there if they were implemented in the source code. Unfortunately, the only engine in SearX/SearXNG that seems to be configured with an onion_url is Ahmia. There is a PR for adding onion URLs to DDG: https://github.com/searxng/searxng/pull/1506 and a related open issue: https://github.com/searxng/searxng/issues/1505.

2. Onion URL for the engine: it would require me to set up the SOCKS proxies and all on the machine, as well as properly secure them. This is something that I'm planning to do, but you should also keep in mind that I'm self-hosting all of this stuff either on my Linode instance or my server at home (where I also run my Mastodon instance and Gitea server btw), and there's a physical limit to how much stuff I can self-host. Contributions for the costs of hosting and maintenance are welcome, if people expect me to run more services.
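For the record, the instance-facing half of that work would roughly amount to publishing the existing HTTP port as a Tor onion service; a sketch (directory and port are assumptions for a typical setup, and Tor-ifying the *outgoing* engine requests is a separate concern, handled via SearXNG's outgoing proxy settings):

```
# /etc/tor/torrc fragment — publish the instance as a v3 onion service
HiddenServiceDir /var/lib/tor/searxng/
HiddenServiceVersion 3
HiddenServicePort 80 127.0.0.1:8080   # forward onion port 80 to local SearXNG
```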

3. Clearnet URLs: they have been configured now.
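Concretely, this is done through SearXNG's hostname replacement setting in settings.yml; a sketch (the exact key and plugin name have moved around between releases, so check your version's docs, and the mirror hostnames below are examples, not endorsements):

```yaml
# settings.yml fragment — rewrite privacy-hostile hostnames in results.
# Requires the hostname-replace plugin to be enabled in your release.
hostname_replace:
  '(.*\.)?medium\.com$': 'scribe.rip'
  '(.*\.)?twitter\.com$': 'nitter.example.org'   # example mirror hostname
  '(.*\.)?youtube\.com$': 'piped.example.org'    # example mirror hostname
  '(.*\.)?quora\.com$': false   # false drops matching results entirely
```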

4. Cloudflare flagging: again, it's not a feature that has been implemented in either SearX or SearXNG. Since spotting CF results isn't that difficult, I guess that PRs are welcome.
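As an illustration of how simple the heuristic could be, here is a hypothetical header check (not part of SearX/SearXNG — a real plugin would have to hook into the results pipeline and fetch headers itself); the header names are the ones Cloudflare is known to add to proxied responses:

```python
# Hypothetical sketch of Cloudflare detection for a result-flagging plugin.
# Only the header heuristic is shown; fetching each result's headers and
# wiring this into SearXNG's plugin API is left out.

def is_cloudflare(headers: dict) -> bool:
    """Heuristic: Cloudflare fronting usually shows up in response headers."""
    lowered = {k.lower(): v.lower() for k, v in headers.items()}
    if lowered.get("server", "").startswith("cloudflare"):
        return True
    # Cloudflare-specific headers added to proxied responses
    return any(h in lowered for h in ("cf-ray", "cf-cache-status"))


if __name__ == "__main__":
    print(is_cloudflare({"Server": "cloudflare", "CF-RAY": "abc"}))  # True
    print(is_cloudflare({"Server": "nginx"}))                        # False
```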

5. Rankings: AFAIK there's no way of tweaking the rankings via configuration. Rankings are calculated on the basis of the engines' reliability and their own rankings. You can only completely exclude some domains from the results.

6. Red-flagging/downranking archive-hostile websites is another feature that isn't available in the upstream code.

7. The issue tracker links to Github because that's where both SearX and SearXNG are hosted, and that's where developers usually read what users report. The alternatives would be: 1. forking SearXNG on my Gitea instance and providing that link instead (something I'm very hesitant to do, because that would make me the proxy maintainer for everything related to SearXNG), or 2. convincing the maintainers to move the project somewhere else.

8. Naming issues ("engines" vs. "syndicates"): again, these should be addressed via PRs and discussions. Just because I'm running an open-source service doesn't mean that I'm supposed to single-handedly address/implement everything that belongs to the lifecycle of that open-source project.


@koherecoWatchdog @thelinuxEXP and let's not forget that, even without many of the features you mentioned, we've got a meta search engine that doesn't track the users (and now it also provides clearnet links to privacy-friendly alternatives).

This is already something that most of the search engines out there can't guarantee, and it's already a big step forward.

I'm open to suggestions on how to further improve privacy even more, provided that:

1. Those suggestions are constructive. Again, if we bash one another on how purist we are when it comes to privacy, we lose the bigger picture - i.e. that most of the options out there are much worse when it comes to privacy, and anything we provide beyond them is already a step in the right direction.

2. The lifecycle of open-source projects is taken into account. I'm not the maintainer of SearXNG, nor a major contributor. I'm happy to provide features that can be implemented via configuration, but features that require code changes should come with issues/PRs on the relevant projects.

I also don't believe that privacy seekers are a "marginalized group" - this term is way abused nowadays. Yes, we are a minority, but we are a minority that made conscious choices because of some ethical beliefs of how technology should work: it's not exactly the same thing as race, religion, income level or sexual orientation. I'd be very careful to use the term "marginalized group": if we don't draw the line properly then tomorrow those who are into 1970s Cambodian psychedelic music may also identify themselves as a marginalized group.

And, in most of the cases, we are a minority with a high level of tech literacy and means to take things into our own hands. If we don't like how a search engine or a social network operates, in most of the cases we have the means to build or run our own alternative. And we can have open constructive discussions on how to shape those alternatives, instead of feeling discriminated just because we disagree on where the line lies in certain trade-offs.

@blacklight
Privacy seekers are most certainly a marginalized group, being treated like criminals (as dark skinned people often are in some regions). We choose to be privacy seekers just as a refugee often chooses to be a refugee. They’re not a refugee before they flee; they make a conscious decision to flee from a bad situation and choose to favor marginalization in a less-hostile country over homeland hostilities. 1/5
@blacklight Actually I think religion-based discrimination would be more comparable to discrimination against privacy seekers (vs refugees). Refugees often choose to flee, but they are fleeing from a much sharper dose of oppression than privacy seekers. Whereas folks choose their religion (& thus potentially choose to be in a marginalized group) but not generally to flee anything.

@koherecoWatchdog this is exactly my point: as someone who was raised in a cult, I don't want us privacy seekers to become an uncompromising religion - and by "uncompromising" I mean "sacrificing basic aspects of social life in order to be coherent with our beliefs".

I consider myself a privacy enthusiast. I have been building and running privacy-aware services for years, I used to be part of the "crypto folks" in a time where "crypto" actually meant "PGP", and I have all the measures in place to make sure that trackers, ads and even unauthorized scripts don't land on any of my devices.

But I'm very skeptical about the practical utility of taking things further - like refusing to send emails to Google/Microsoft accounts, or refusing to have a Twitter account to write to an MP, or refusing to browse a website proxied by Cloudflare, or refusing to register on a platform that uses reCAPTCHA.

Using these services sporadically (like when we need to write to a politician, or to a friend, or buy an item online, or register to a website, or access our online banking platform), while using privacy-aware services in our day-to-day activities, doesn't really move any needle IMHO. The cost of not using them when we need to (like the inability to write to an MP or to a friend, to purchase an item, or to transfer money online) is probably much higher than the minuscule data gain that #BigTech would make from you entering a text in reCAPTCHA, or Cloudflare proxying one or two requests.

Of course I'd like a world where I don't have to make these trade-offs, but that's not the world where we are right now. All we can do is build better alternatives and raise awareness on these issues, without pushing ourselves into an ideological niche that shuts out the whole world while having nearly no effect on the world. These are the things that cults would do, and I don't want us to be a marginalized minority because of some self-inflicted constraints that favour idealism over pragmatism.

@blacklight
You’ve opted to stop short of activism in order to make life easy. Doing so is to stop short of making a dent. You’ll have no progress w/out activism. When I have a contract with a supplier who mid-contract starts using Cloudflare to block me from accessing their service, I sue them. When your rights are infringed & you take the pushover direction instead of standing up for your rights, you become an enabler that allows the abuse to continue on to others.
@blacklight
If a shop demands you solve a Google reCAPTCHA before they serve you, your only just/activist move is to boycott them. If you don’t, you become part of the problem. If a tax-funded gov service for which you are entitled (like unemployment) denies you service by pushing a reCAPTCHA, you can’t boycott but you can do better: you can take legal action in those cases. It’s your civic duty. “Activism is the rent I pay for living on this planet.” — Alice Walker
@blacklight
Another good quote for this situation: “If you are neutral in situations of injustice, you have chosen the side of the oppressor. If an elephant has its foot on the tail of a mouse, and you say that you are neutral, the mouse will not appreciate your neutrality.” ― Desmond Tutu. When you need to communicate w/someone else, who decides the protocol, the person who requires more security or less?
@blacklight
The one who demands less security is the more unreasonable of the two b/c they expect the other to sacrifice security w/out good cause, b/c they think “I have nothing to hide” is good cause for not using crypto. But in fact a sound infosec principle is to operate securely /by default/. It is relaxing security that needs good rationale, not the other way around. And if you have the infosec background then it’s on you to exploit opportunities to enlighten.
@blacklight
There are probably many more problems than you are aware of w/sending msgs to gmail & MS users. It’s not just a snooping problem. It’s also a bullying problem, whereby tech giants have trounced RFC standards & purposes to impose their mandates for what hoops mail servers must go through ultimately to dance for the bully. The pushover move downplays the seriousness of the problem. Even if you say “I’ll email you but I don’t like it”, it’s not impactful.
@blacklight
I force gmail/MS-using orgs to reach me by fax or snailmail. They feel burdened by hardcopy letters that they then have to pay postage on. Good. A European org probed to get an email address which I withheld from them and they used it w/out consent. I responded w/a written GDPR demand that they erase my email address from their system. And I made it clear that putting a surveillance capitalist in the loop is the issue.
@blacklight
If you go through Twitter’s hoops and ① buy a mobile phone ② subscribe to mobile phone service, & ③ disclose the # to Twitter you give up your right to legal action against the gov rep. The legal action carries more weight than whatever drop-in-the-ocean tweet you might have sent. On top of that, Twitter was caught abusing users’ phone #s, then leaking them.

@koherecoWatchdog I still consider myself an activist, but one who has already gone through too many rounds of ideas-meeting-the-real-world to still be so uncompromising.

When I'm unhappy with how a company treats my data, or with how they lock up protocols and integrations, I build or host alternatives, I hack their services, or I write articles on how to configure an alternative, while raising as much awareness of the issue as possible and making sure that the alternatives I provide are as close as possible to feature parity with the privacy-hostile/FLOSS-hostile company XYZ.

And I really consider parity of features with the original service to be a baseline. An example: I've been very happy to replace Twitter/Medium/YouTube links in my search engine with links to my self-hosted Nitter/Scribe/Piped instances, because they are privacy-respecting alternatives that don't take away any of the features of the original product. But I'm against things like downranking or blocking some results, because *that* would mean taking features away - and, especially when it comes to search engines, we never know which features are useful to the users, and we shouldn't make that decision for them.

We must understand that not everybody is ready to take on all the battles, or to significantly reduce their participation in digital life in order to be consistent with their ideas of how technology should work. Not everyone is ready to give up Whatsapp because their friends and relatives don't want to use an alternative. Not everyone is ready to give up Google Services on Android and lose access to their online banking app or to the online services provided by the government. Not everyone is ready to stop sending emails to Google/Microsoft accounts - and we probably shouldn't respond to their barriers by creating more barriers.

Not everyone is ready to uncompromisingly pick all the battles, and we ought to respect that. If we don't, if we call those who shop on Amazon "undisciplined", or treat those who send emails to friends and colleagues who use Gmail/Outlook as someone who is completely giving up on their privacy, then 1. we won't make many friends, and 2. very few will be ready to come onboard.

It's like the uncompromising vegan who never misses a chance to call his friends "murderers" when they eat meat. Or the Linux enthusiast of 20 years ago who would mock Windows users and keep telling them that they should use a terminal-driven OS, even if they weren't into tech, and even if it meant losing their games and Photoshop/CAD. Or Stallman when he argued that non-free blobs should never ever touch the GNU/Linux system, even if that meant losing support for some hardware and pushing producers away from the Linux ecosystem instead of pulling them towards it.

The uncompromising approach that significantly reduces someone's ability to do something, in exchange for some ideological gratification, has never worked in my experience. If we keep building better alternatives and advertising them better, people will come naturally. If we keep telling them that they should stop using the Internet as they do now, while providing no clear path to get the same functionality out of the alternatives, then very few will join our battle. Sometimes a few small steps taken by many people are much more effective than huge steps taken only by a small circle of activists.

@koherecoWatchdog in other words, my activism tries to work by extending, not removing. I don't want big tech to disappear tomorrow because of their evil.

I want them to stay, but I want them to swallow the same bitter embrace-extend-extinguish pill that they have pushed down our throats for four decades.

I don't want their search engines to disappear: I want the ability to provide stuff that scrapes the shit out of their search results too, and puts it all together in a container that is bigger than theirs.

I don't want Twitter to disappear: I want it to be forced to federate, so we can have full bidirectional interactions with Twitter users even from the Fediverse - and this container can be bigger than theirs.

I don't want Facebook and Instagram to disappear: I want to force them to open their APIs (and ideally provide authenticated RSS feeds for all of their content), so other clients can also pull and serve their content (minus ads and trackers), and our container can be bigger than theirs.

I don't want Whatsapp and Messenger to disappear: I want them to open up their APIs (ideally force them back to XMPP, or to something open like Matrix), and encourage people to use Matrix, XMPP or Bitlbee bridges, so they can communicate with anyone they want from the same platform without the need of 10 different messaging apps. Again, this is a container that can be bigger than theirs.

We shouldn't respond to barriers with more barriers. We need to reverse the roles here and force them to play a catch-up game with us, because we can provide containers where their content is only a subset, instead of building smaller containers that actively exclude their content (and it's not even *theirs*, it's content created by other people who just so happened to use some of their services, and we shouldn't hide it on "religious" grounds). Force a competition through openness instead of a competition through closedness, because that's a game we know how to win much better than they do.

In my experience, people are much more likely to change their habits when you provide them with a compelling and functionally equivalent (or, even better, superior) alternative than when you tell them what they shouldn't do.

We shouldn't create containers that exclude big tech: we should create containers that include them *and* extend them, with the added bonus of not tracking you and not showing any ads. Ideally, we should also force them to open up their APIs and protocols, because you can't have a real level playing field without those. In the meantime, we should keep scraping, reversing and hacking the shit out of them, so the content that just so happened to be hosted by other people on their platforms can be given back to people. Services like Piped, Scribe, Nitter, Teddit and Wikiless are really the right path forward IMHO, because they provide users with almost one-to-one equivalent alternatives. Once all the alternatives *really* have the same opportunities as big tech, we can let users pick the ones they like the most.
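The "almost one-to-one equivalent" frontends mentioned above mostly work by rewriting URLs from the original platform to an alternative instance that mirrors the same paths. A minimal sketch of that idea (the instance hostnames below are illustrative examples, not a maintained list; real tools like LibRedirect track live public instances):

```python
from urllib.parse import urlparse, urlunparse

# Example mapping from big-tech hostnames to privacy frontends.
# These hostnames are assumptions for illustration only.
FRONTENDS = {
    "twitter.com": "nitter.net",        # Nitter
    "www.youtube.com": "piped.video",   # Piped
    "www.reddit.com": "teddit.net",     # Teddit
    "medium.com": "scribe.rip",         # Scribe
    "en.wikipedia.org": "wikiless.org", # Wikiless
}

def to_frontend(url: str) -> str:
    """Rewrite a big-tech URL to a known frontend, or return it unchanged."""
    parts = urlparse(url)
    host = FRONTENDS.get(parts.netloc)
    if not host:
        return url  # no known frontend for this host
    return urlunparse(parts._replace(netloc=host))
```

This only swaps the hostname, which works because these frontends deliberately mirror the original platforms' path structures.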

@blacklight It’s unclear why you are trying to support Facebook, Twitter, Amazon, etc. These are harmful entities. You somewhat imply that they are providing competition & options. But on the contrary they are anti-competitive. When a cheap diaper maker refused to let Amazon buy them, Amazon responded by pricing their own diapers below cost just to kill the small business that refused to sell. Killing Amazon enables many more players to enter.

@blacklight Killing #Twitter would cause elected public officials to move into the fedi, where they are reachable by all those whom they represent. Killing #Facebook would effectively force /public/ schools to stop excluding non-FB students & give them the access to school resources & information that they are (or should be) entitled to.

#deleteFacebook #deleteTwitter

@blacklight Also consider this story:
http://techrights.org/2022/03/20/brussels-police-facebook/

Without #Facebook, the Brussels police would be serving all citizens, not just FB pawns. (note that I say “pawns” instead of “users” to emphasize that they are /used by/ FB)

@blacklight This reminds me of a discussion about killing Hitler (motivated by the documentary “23? ways to kill Hitler”). Killing Hitler actually saves many lives. OTOH, some believe killing Hitler under any circumstances is unnecessary, hypocritical & solves nothing b/c Hitler would be replaced. I would have bet that his replacement would have to be less of a monster.

@koherecoWatchdog I'm quite far from supporting them. My aim is to decouple the content from the container.

Posts on Facebook, products sold on Amazon, or emails on Gmail don't belong to those companies. They belong to the person who shared an article, the seller who listed a product, or the people exchanging emails. Those companies only act as intermediaries and only own the container. When you send a physical letter, neither the postal service nor the postman automatically becomes its owner, nor can they claim intellectual property rights over it. Why should tech be any different? This is the big contradiction that I'm trying to fix.

The Web I'd like to build doesn't "ban" content from these platforms. Quite the opposite: it scrapes it, it mocks it, it replicates it all over the place, it packs it all in RSS feeds that can be subscribed from anywhere, it gives it back to the people, so the value of those containers gets diluted.
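Repackaging scraped content as RSS can be as small as the following sketch. The item structure here is hypothetical; a real scraper would fill it in from the platform's pages:

```python
from xml.etree import ElementTree as ET

def to_rss(title: str, link: str, items: list[dict]) -> str:
    """Build a minimal RSS 2.0 feed from scraped items.

    Each item is a dict with 'title' and 'link' keys — an assumed
    structure for illustration; a real scraper would extract these
    fields from the platform being mirrored.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    for item in items:
        el = ET.SubElement(channel, "item")
        ET.SubElement(el, "title").text = item["title"]
        ET.SubElement(el, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")
```

Once content is in this shape, any feed reader can subscribe to it, independently of the container it was originally locked in.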

Once you can reach out to an MP on Twitter from the Fediverse as well (because Twitter has been forced to federate, and we can treat it virtually as just another Fediverse instance), you have no reason to keep using twitter.com.

Once you can consume Facebook content from any client (because Facebook has been forced to open up its APIs and protocols), you have no reason to keep using their app and serve your data to their trackers.

Once you can communicate with your friends over WhatsApp without using the WhatsApp app (because you use a Matrix or XMPP bridge), you have no reason to keep that app on your phone and feed telemetry data to it.

Once you can use a Searx instance, and that instance returns results as good as Google's (and even more results from other engines), you have no reason to keep using Google as a search engine at all - even as a fallback.

Once you can read all the articles on Medium through Scribe, you no longer have to pay $5 for a monthly subscription.

My preferred approach is to take down the giants by diluting their value: by making sure that they are no longer the owners of what their users create on them, and by making it very clear that they're just containers, and that once their content is out in the wild their value is close to zero. Our war should be against locked containers, and its aim should be to liberate the content inside them, not to destroy containers and content alike.

It's not a coincidence that big tech is waging a war against this approach - by treating scraping and ad-blocking as a form of piracy, by sending me passive-aggressive emails like "we remind you that Google makes its money through ads", and by blocking, on a weekly basis, whatever mechanisms NewPipe or Barinsta use to scrape YouTube videos and Instagram posts respectively. It's because these strategies hurt them much more than 1% of users walking away while the other 99% stay. It's because they make people realize that they aren't as locked in as they think, and that walking away doesn't have to come with too many trade-offs.

If we respond to barriers with other barriers, what will we change? Probably not much. Most of the users expected to create a new email account to register for our service will just walk away and create an account somewhere else. And the walls of the gardens that these companies have built around their users won't suddenly become visible if, in those users' perception, 99% of the world lives within the walls and only 1% outside.

@blacklight
> Once all the alternatives *really* have the same opportunities as big tech, we can let users pick the ones they like the most.

That doesn't work. Features are not the deciding factor for users. It doesn't matter how much better your platform serves them: people flock to where the people are. Betamax was better than VHS in every way, but VHS still won the popularity contest. Even w/in the fedi, users flock to mastodon.social despite heavy & under-handed censorship.

@blacklight I just read about a #Mastodon user who quit the fedi & went back to Twitter b/c of the lack of followers. Protonmail made the same move as well.

@blacklight
The error is in thinking that downranking is taking something away. When you downrank a junk result, it doesn't leave an empty space in the results. A more worthy link gets upranked and thus gets more exposure. It's not a feature to have the first few results take users to a 403 Forbidden page or a CAPTCHA. Those are anti-features that you promote by letting Google & MS have free rein over the ranking. Google's rankings are not user-focused; they are profit-focused.

@blacklight Nitter gives no way to respond to posts, & ironically Nitter is much faster to block readers than (logged-out) Twitter. So it does take something away to show Nitter results instead of Twitter, & thus deviates from your thesis. The Nitter replacement is still overall positive b/c it mitigates Twitter traffic/patronage & it's good to discourage Twitter interactions. But it needs to evolve to support out-of-band non-Twitter replies.

@blacklight The goal of tech giants is to make products out of the zombie users you refer to, who are too lazy/distracted to seek out alternatives, or even to become informed about the issues. It doesn't help to support that codependency. These users flow in whatever direction they are pulled. If you don't pull, you help the oppressor who is pulling them, as Desmond Tutu would say.

@blacklight Barriers /are/ the answer to barriers. #Gmail & MS users don't even see the barriers. They are uninformed; effectively zombies. Creating a barrier around Gmail & Outlook gets visibility for the problem. I tell those users "I cannot email you because Google/MS blocks my mail server". This is how they become aware of the oppressor-manufactured barrier. Then I say "give me something that works, like Protonmail or Tutanota, or Snikket, etc."

@blacklight Another way to see this is that there are platforms with a barrier that is invisible to insiders. Creating a visible barrier around those platforms enables the insiders to see that there is in fact a barrier.