Tarpits like Nepenthes are cool, but can't I just bypass them by waiting for "network idle" before scraping and setting a reasonable timeout for that?
Log-surfing part 2: what's the easiest way to aggregate this #Nephentes output by IP range? Don't feel like scripting today… Also I should probably eventually update…
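For anyone who does feel like scripting: a minimal sketch of one way to do this with Python's standard `ipaddress` module. It assumes each log line contains the client's IPv4 address somewhere in the line (the actual output format is not shown here, so adjust the regex to your logs); `count_by_range`, the `/24` default, and the sample lines are all made up for illustration.

```python
import ipaddress
import re
from collections import Counter

# Assumption: the first IPv4-looking token on a line is the client address.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def count_by_range(lines, prefix_len=24):
    """Count log lines per IPv4 network of the given prefix length."""
    counts = Counter()
    for line in lines:
        match = IPV4.search(line)
        if match is None:
            continue  # no address on this line
        try:
            # strict=False masks the host bits, so 192.0.2.77/24 -> 192.0.2.0/24
            network = ipaddress.ip_network(
                f"{match.group(0)}/{prefix_len}", strict=False
            )
        except ValueError:
            continue  # looked like an IP but is not valid, e.g. 999.1.1.1
        counts[network] += 1
    return counts


# Hypothetical sample lines using documentation-only address ranges.
sample = [
    "192.0.2.10 GET /maze/a",
    "192.0.2.77 GET /maze/b",
    "198.51.100.5 GET /maze/c",
]

for network, hits in count_by_range(sample).most_common():
    print(f"{hits:6d}  {network}")
```

To run it on a real log, pass an open file object instead of `sample`; a smaller `prefix_len` (say 16) groups the hits into wider ranges.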
The good thing about #Nephentes is that it ups my #nginx -game.
I have to admit I never properly digested the docs, but DID YOU KNOW that `proxy_pass` changes behaviour when its argument contains variables?
Nice article here by @dalbuschat that helped me redirect even more useless bots that hit 404s repeatedly into the AI maze via
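location ~ .. {
    set $upstream http://localhost:port;
    # proxy_pass is its own directive, not a variable to `set`.
    # With a variable in its argument, the URI given here replaces the original request URI.
    proxy_pass $upstream/ai/;
}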
https://dev.to/danielkun/nginx-everything-about-proxypass-2ona

I guess you have all heard about the growing problem of AI companies trying to aggressively collect whatever data they can get their hands on to train their models. This has caused an explosive surge in web crawlers relentlessly hitting servers big and small. But who runs these crawlers? Turns out — it could be you!
I apologise in advance for the bad quality of future Alibaba-LLMs, but if you don't respect robots.txt, off into the stochastic maze you go!
Finally got around to setting up the #nephentes AI crawler maze. Two birds with one stone: we had two extremely irritating crawlers that have been persistently hitting nonexistent URLs via http FOR YEARS, i.e., they first got a redirect to https and then the 404.
Now they will enter The mAIze(tm) on https (didn't figure out how to short-circuit the http-upgrade yet). Neat side-effect: I now know more about #nginx request handling and rewriting than I ever wanted to know 🥴
The #Nephentes needs some new substrate. I finally found a retailer that stocks all the ingredients to make things really nice and fluffy around her roots.
Besides peat for carnivorous plants (#FleischfressendePflanzen), which I still have on hand (in future it will be peat-free with coconut fibre), a good amount of pine bark and a handful of perlite go into the mix.
The old substrate had already turned into a mushy clump.
The sphagnum will then be put to use in the future bog planter (a balcony box).
is it Darwinian self-selection if my cat insists on licking a carnivorous plant? 🤦
any #nephentes nerds on here to help me fix his tiny little brain? 🤦