Tarpits like Nephentes are cool, but can't I just bypass them by waiting for "network idle" before scraping and setting a reasonable timeout for that?

#Tarpit #AI #Nephentes

Log-surfing part 2: what was the easiest way to aggregate this #Nephentes output by IP ranges? Don’t feel like scripting today… Also I should probably eventually update…
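For anyone who does feel like scripting, here is a minimal Python sketch of one way to do it. It assumes the IP addresses have already been extracted from the log, one per entry; the actual #Nephentes output format is not shown here, so the parsing step is left out. The function name `aggregate_by_range` and the /24 (IPv4) and /48 (IPv6) prefix lengths are illustrative choices, not anything prescribed by the tool.

```python
import ipaddress
from collections import Counter


def aggregate_by_range(ips, v4_prefix=24, v6_prefix=48):
    """Count hits per network range.

    `ips` is any iterable of IP address strings (hypothetical input;
    extract them from your own log format first).
    """
    counts = Counter()
    for raw in ips:
        try:
            addr = ipaddress.ip_address(raw.strip())
        except ValueError:
            # Skip anything that is not a valid IP address
            continue
        prefix = v4_prefix if addr.version == 4 else v6_prefix
        # strict=False masks the host bits, giving the enclosing network
        net = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
        counts[str(net)] += 1
    # Busiest ranges first
    return counts.most_common()


if __name__ == "__main__":
    sample = ["192.0.2.1", "192.0.2.77", "198.51.100.5", "2001:db8::1"]
    for network, hits in aggregate_by_range(sample):
        print(f"{hits:6d}  {network}")
```

Widening the prefixes (say /16) groups more aggressively, which may be useful when a crawler spreads across a whole provider allocation.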

@algernon

@algernon I've seen #Iocaine and #Nephentes referred to as #TarPits.

The good thing about #Nephentes is that it ups my #nginx game.

I have to admit I never properly digested the docs, but DID YOU KNOW that `proxy_pass` changes behaviour when using variables!

Nice article here by @dalbuschat that helped me redirect even more useless bots that hit 404s repeatedly into the AI maze via

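location ~ .. {
    # With a variable in proxy_pass, the URI given there is passed upstream as-is
    set $upstream http://localhost:port;
    # proxy_pass is a directive of its own, not something assigned with "set"
    proxy_pass $upstream/ai/;
}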

https://dev.to/danielkun/nginx-everything-about-proxypass-2ona

📰 How the plundering of knowledge looks from our side:

1️⃣ Many self-managed servers are suffering continuous attacks from AI companies trying to steal as much information as possible, saturating services and retrying constantly.
2️⃣ They come from many different places (IP addresses) and camouflage themselves as real traffic, which makes blocking difficult.
3️⃣ They do not respect the wishes that admins express in their `robots.txt` files.

🆕 One person ( @jwildeboer ) has linked this phenomenon to an obscure and grotesque business model 🤮: infecting mobile apps in order to use us, the users, as accomplices of these plundering companies 🤖💀

🕵 He explains it in detail, in English, at https://jan.wildeboer.net/2025/04/Web-is-Broken-Botnet-Part-2/

👀 You can also read (also in English) the experience of a sysadmin (Drew DeVault) defending himself against these attacks: https://drewdevault.com/2025/03/17/2025-03-17-Stop-externalizing-your-costs-on-me.html

The summary is that they are using all of us as slaves to their megalomania, and all of it to further inflate the AI (#IA) bubble, which will end up bursting.

That is why @cadey has created #anubis , a tool that slows down all initial connections by forcing some mathematical calculations https://github.com/TecharoHQ/anubis ⏰ and @aaron has created #nephentes , a tool that endlessly serves random content to AI bots in order to poison their language models ☠

We will report on what we can do as users to prevent this attack on the internet. For now, boosting this post already helps us a lot!
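To give a feel for what "forcing some mathematical calculations" can mean, here is a generic hash-based proof-of-work sketch in Python. It is an illustration of the general idea only, not #anubis's actual implementation: the function names `solve` and `verify`, the challenge string, and the "leading zero hex digits" difficulty rule are all assumptions made for this example.

```python
import hashlib
from itertools import count


def _digest(challenge: str, nonce: int) -> str:
    # SHA-256 over the challenge text followed by the nonce, as hex
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce whose digest starts with
    `difficulty` zero hex digits. Expected work grows 16x per digit."""
    target = "0" * difficulty
    for nonce in count():
        if _digest(challenge, nonce).startswith(target):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return _digest(challenge, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    nonce = solve("example-challenge", 4)
    print(nonce, verify("example-challenge", nonce, 4))
```

The asymmetry is the point: one visitor pays a small one-off cost, while a crawler opening huge numbers of fresh connections pays it every time.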

RE: https://social.wildeboer.net/@jwildeboer/114359011037497199

I apologise in advance for the bad quality of future Alibaba-LLMs, but if you don't respect robots.txt, off into the stochastic maze you go!

#nephentes @aaron

Finally got around to setting up the #nephentes AI crawler maze. Two birds with one stone: we had two extremely irritating crawlers persistently FOR YEARS hitting non-existing URLs via http, i.e., they first got a redirect to https and then the 404.

Now they will enter The mAIze(tm) on https (didn't figure out how to short-circuit the http-upgrade yet). Neat side-effect: I now know more about #nginx request handling and rewriting than I ever wanted to know 🥴

#ryows

The #Nephentes needs some new substrate. I have finally found a retailer that stocks all the ingredients to make things really nice and fluffy around her roots.
Besides peat for #FleischfressendePflanzen (carnivorous plants), which I still have here (in future it will be peat-free, with coconut fibre), a good amount of pine bark and a handful of perlite go into the mix.

The old substrate had already turned into a mushy clump.

The sphagnum will then be put to use in the future bog tub (a balcony planter).

is it darwinistic self-selection if my cat insists on licking a carnivorous plant? 🤦

any #nephentes nerds on here to help me fix his tiny little brain? 🤦

I saw such a huge #nephentes today!