Posts only for the feed

I read earlier that Sourcefeed offers feed-only publishing. It sounds wild at first, but when you think about it, it is like a podcast for blogs: you write blog posts only for people who have subscribed to you in a feed reader.

I like the idea. The "Why" page lists a few more good reasons, for example that the typical AI scrapers do not use the pages as training material because they ignore feeds.

Well, we'll see. I wouldn't spend money on a feature like this, so I simply built it into this blog. For now these posts still get pushed to the Fediverse as usual, but let's see for how long.
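How the feature was built into the blog is not described, so the following is only a minimal sketch of the general idea: every post carries a flag, the public index skips flagged posts, and the RSS feed includes everything. The `feed_only` field, the `POSTS` records, and the example.org URLs are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical post records; the `feed_only` flag is an assumption.
POSTS = [
    {"title": "Regular post", "url": "https://example.org/regular", "feed_only": False},
    {"title": "Posts only for the feed", "url": "https://example.org/feed-only", "feed_only": True},
]


def html_index(posts):
    """Titles listed on the public website: feed-only posts are left out."""
    return [p["title"] for p in posts if not p["feed_only"]]


def rss_feed(posts):
    """Build an RSS 2.0 document that contains every post, feed-only or not."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example blog"
    ET.SubElement(channel, "link").text = "https://example.org/"
    ET.SubElement(channel, "description").text = "Example feed"
    for p in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = p["title"]
        ET.SubElement(item, "link").text = p["url"]
    return ET.tostring(rss, encoding="unicode")
```

A real setup would also have to keep feed-only posts out of sitemaps and archive pages, otherwise crawlers can still discover them.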

So: subscribe to the feed.

#shortpost #blogpost #blog #Feed #Sourcefeed #FeedOnly #AI #Scraper

Sourcefeed

Pop-up RSS publishing.

I find it very annoying that NotebookLM doesn't transcode files automatically if they are too big.

I also find it annoying that you cannot easily snap and add large websites with a lot of scrolling.

But if you want notebooks, you've got to add data...

#notebooklm #scraper

More features against idiots

You know how it is: no matter how hard you try to lock the idiots out, they find another way to view/scrape the site after all.

to the blog post…

#blogpost #blog #GeoIP #Scraper #Idioten

RE: https://mastodon.online/@NatureMC/116442840702472493

And it works! Since blocking the worst #scraping bots, all these "visitors" who stayed only a few seconds on a blog article are gone. 👍 And I can tell you that they were *many*, often more than real humans reading my blog. Illegal training of LLMs & Co. is a bigger problem than many are aware of.

#noAI #scraper #blogging #blogger #bloggingCommunity

Some updates on my website maintenance woes:

Starting last July, I built a new wiki for my translations of German folk tales. And soon after I started doing so, it started to experience frequent, hours-long outages. I started to research possible causes, but eventually concluded that the primary cause was so many requests from anonymous #scraper bot networks desperate for new scraps of data to feed into their #LLM models that the wiki simply couldn't cope. Even when I increased my hosting plan _twice_ last September, this only served to make the outages less common - not to stop them.

In March, I drastically reduced the amount of work I did on the wiki, as it was functionally complete - I had added more than 700 folk tales to it by that stage. Sure, there are always further tales to add - I didn't stop translating those tales, after all. But now I am adding 10-20 tales per month, not 100+.

And funnily enough, I haven't noticed any major outages for this past month - or even minor ones. I guess the scraper bot networks noticed that I don't have that much new data to steal, and largely moved on to new prey they can harass.

So, what can we conclude from this?

If you are maintaining a website that produces lots of new content on a regular basis, you _will_ get hammered by these scrapers. robots.txt will do nothing - these use anonymous, ever-changing IP addresses. Maybe you can thwart them with #Cloudflare or similar technologies which I haven't tried out (I am a rank beginner when it comes to website administration, to be frank).

Otherwise you will either have to slow down the publication of new content, pay lots of money for an oversized hosting plan, or live with periodic outages until the #AIBubble bursts, and there is no longer a trillion dollar business case for scraping every website a thousand times a month.

https://wiki.sunkencastles.com/wiki/Main_Page

Sunken Castles, Evil Poodles Wiki

Backing up Spotify - Anna’s Blog

"We backed up Spotify (metadata and music files). It’s distributed in bulk torrents (~300TB). It’s the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space), with 86 million music files, representing around 99.6% of listens."

Link: https://annas-archive.li/blog/backing-up-spotify.html

#linkdump #archive #blogpost #scraper #spotify

Scrapers disguising themselves as Googlebot are a thing. You can block them with the ModSecurity fake-bot plugin:

https://github.com/coreruleset/fake-bot-plugin
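For illustration only (this is not the plugin's code): one common way to unmask a fake Googlebot is forward-confirmed reverse DNS, the verification method Google documents for its crawlers. The sketch below assumes the hostname suffixes `.googlebot.com` and `.google.com`; the `reverse` and `forward` parameters are hypothetical hooks so the logic can be exercised without network access.

```python
import socket

# Hostname suffixes assumed here for genuine Googlebot reverse DNS names.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check for an IP that claims to be Googlebot."""
    # Default resolvers use the system DNS; gethostbyname_ex covers IPv4 only.
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip)  # step 1: PTR lookup
    except OSError:
        return False  # no reverse record at all
    if not host.endswith(GOOGLE_SUFFIXES):
        return False  # step 2: name must belong to Google
    try:
        return ip in forward(host)  # step 3: name must resolve back to the IP
    except OSError:
        return False
```

The third step matters because anyone who controls the reverse zone for their own IP range can set a PTR record that merely looks like a Google hostname.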

#Googlebot #crawler #bot #scraper #modsecurity

The topic of AI crawlers, bots and scrapers doesn't stop at the Fediverse either. Not only is user data being "siphoned off" here against users' wishes, these requests also put more and more load on the servers: a lot of unnecessary traffic that costs resources.

Much of it can be locked out via robots.txt, but unfortunately not every bot respects it. Persistent bots and spiders can of course also be excluded in .htaccess or the Apache config. A lot can be built with Fail2Ban as well.

However, it is preferable to block these IPs close to the kernel right away, e.g. via nftables. Unfortunately it is not just a handful of IPs, and you cannot block all of these bots' IPs because new ones keep appearing. But there are certain IPs and address ranges that these bots use. I found the information from the "providers" themselves, with the exception of SEMrush, which can crawl very aggressively and likes to ignore robots.txt completely:

openai.com/gptbot.json
openai.com/chatgpt-user.json
openai.com/searchbot.json
docs.claude.com/en/api/ip-addr…
developers.google.com/static/s…
bing.com/toolbox/bingbot.json
search.developer.apple.com/app…
perplexity.com/perplexitybot.j…
perplexity.com/perplexity-user…
duckduckgo.com/duckduckgo-help…
index.commoncrawl.org/ccbot.js…
ipinfo.io/AS209366
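Several of the pages above publish their ranges as JSON. As a rough sketch, assuming the layout Google uses for its crawler ranges (a `prefixes` array whose entries carry either `ipv4Prefix` or `ipv6Prefix`), the CIDR blocks can be pulled out like this. Other providers may structure their files differently, and the `SAMPLE` document, including its `creationTime` value, is made up for illustration.

```python
import json

# Illustrative sample in the assumed "prefixes" layout; not real published data.
SAMPLE = """{
  "creationTime": "2025-01-01T00:00:00",
  "prefixes": [
    {"ipv4Prefix": "66.249.64.0/27"},
    {"ipv6Prefix": "2001:4860:4801:10::/64"}
  ]
}"""


def extract_prefixes(text):
    """Return (ipv4_list, ipv6_list) from a JSON document in the assumed layout."""
    data = json.loads(text)
    v4, v6 = [], []
    for entry in data.get("prefixes", []):
        if "ipv4Prefix" in entry:
            v4.append(entry["ipv4Prefix"])
        if "ipv6Prefix" in entry:
            v6.append(entry["ipv6Prefix"])
    return v4, v6
```

Because the providers add new ranges over time, such a script would need to run periodically rather than once.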

Yes, of course these are not all bots, crawlers and the like! I'm aware of that. But a large share of the traffic already disappears with this!

Since I don't want visits from Google and co. on my Friendica instances either, I went for a clean sweep today and threw the IPs from the pages listed above into nftables. Result: it has become considerably quieter! And that is by far not all bots and crawlers, but quite a few of them!

I don't know whether anyone is interested in the IPs; I'm simply making them available here. As mentioned, the source is the pages listed above. Use at your own risk, of course. 😉

IPv4 addresses:

132.196.86.0/24,172.182.202.0/25,172.182.204.0/24,172.182.207.0/25,172.182.214.0/24,172.182.215.0/24,20.125.66.80/28,20.171.206.0/24,20.171.207.0/24,4.227.36.0/25,52.230.152.0/24,74.7.175.128/25,74.7.227.0/25,74.7.227.128/25,74.7.228.0/25,74.7.230.0/25,74.7.241.0/25,74.7.241.128/25,74.7.242.0/25,74.7.243.128/25,74.7.244.0/25,104.210.139.192/28,104.210.139.224/28,13.65.138.112/28,13.65.138.96/28,13.67.46.240/28,13.67.72.16/28,13.70.107.160/28,13.71.2.208/28,13.76.115.224/28,13.76.115.240/28,13.76.116.80/28,13.76.223.48/28,13.76.32.208/28,13.79.43.0/28,13.83.167.128/28,13.83.237.176/28,132.196.82.48/28,135.119.134.128/28,135.119.134.192/28,135.220.73.208/28,135.220.73.240/28,135.237.131.208/28,135.237.133.48/28,137.135.183.96/28,137.135.190.240/28,137.135.191.176/28,137.135.191.32/28,138.91.30.48/28,138.91.46.96/28,168.63.252.240/28,172.178.140.144/28,172.178.141.112/28,172.178.141.128/28,172.183.143.224/28,172.183.222.128/28,172.196.40.208/28,172.202.102.112/28,172.204.16.64/28,172.204.27.16/28,172.212.159.64/28,172.213.11.144/28,172.213.12.112/28,172.213.21.112/28,172.213.21.144/28,172.213.21.16/28,172.215.218.96/28,191.233.1.112/28,191.233.1.128/28,191.233.1.224/28,191.233.194.32/28,191.233.196.112/28,191.233.199.160/28,191.233.2.0/28,191.234.167.128/28,191.235.66.16/28,191.235.98.144/28,191.235.99.80/28,191.237.249.64/28,191.239.245.16/28,20.0.53.96/28,20.102.212.144/28,20.117.22.224/28,20.125.112.224/28,20.125.144.144/28,20.161.75.208/28,20.168.7.192/28,20.168.7.240/28,20.169.72.112/28,20.169.72.96/28,20.169.73.176/28,20.169.73.32/28,20.169.73.64/28,20.169.78.112/28,20.169.78.128/28,20.169.78.144/28,20.169.78.160/28,20.169.78.176/28,20.169.78.192/28,20.169.78.208/28,20.169.78.48/28,20.169.78.64/28,20.169.78.80/28,20.169.78.96/28,20.169.86.224/28,20.169.86.240/28,20.169.87.112/28,20.172.29.32/28,20.193.233.240/28,20.193.50.32/28,20.194.0.208/28,20.194.1.0/28,20.194.157.176/28,20.198.67.96/28,20.203.245.32/28,20.204.24.240/28,20.206.107.192/28,20.210.154.128/28,20.
210.174.208/28,20.210.211.192/28,20.215.187.208/28,20.215.188.192/28,20.215.214.16/28,20.215.219.128/28,20.215.219.160/28,20.215.219.208/28,20.215.220.112/28,20.215.220.128/28,20.215.220.144/28,20.215.220.160/28,20.215.220.176/28,20.215.220.192/28,20.215.220.208/28,20.215.220.64/28,20.215.220.80/28,20.215.220.96/28,20.226.32.80/28,20.227.140.32/28,20.228.106.176/28,20.235.75.208/28,20.235.87.224/28,20.249.63.208/28,20.27.94.128/28,20.45.178.144/28,20.55.229.144/28,20.63.221.64/28,20.90.7.144/28,20.97.189.96/28,23.102.140.144/28,23.102.141.32/28,23.97.109.224/28,23.98.142.176/28,23.98.179.16/28,23.98.186.176/28,23.98.186.192/28,23.98.186.64/28,23.98.186.96/28,4.151.119.48/28,4.151.241.240/28,4.151.71.176/28,4.189.118.208/28,4.189.119.48/28,4.196.118.112/28,4.196.198.80/28,4.197.115.112/28,4.197.19.176/28,4.197.22.112/28,4.197.64.0/28,4.197.64.16/28,4.197.64.48/28,4.197.64.64/28,4.205.128.176/28,40.116.73.208/28,40.122.235.112/28,40.67.183.160/28,40.67.183.176/28,40.75.14.224/28,40.81.134.128/28,40.81.134.144/28,40.81.234.144/28,40.84.181.32/28,40.84.221.208/28,40.84.221.224/28,51.107.70.192/28,51.8.155.48/28,51.8.155.64/28,51.8.155.80/28,52.148.129.32/28,52.154.22.48/28,52.156.77.144/28,52.159.227.32/28,52.159.249.96/28,52.165.212.16/28,52.165.212.32/28,52.165.212.48/28,52.172.129.160/28,52.172.251.112/28,52.173.123.0/28,52.173.219.112/28,52.173.219.96/28,52.173.221.16/28,52.173.221.176/28,52.173.221.208/28,52.173.234.16/28,52.173.234.80/28,52.173.235.80/28,52.176.139.176/28,52.187.246.128/28,52.190.137.144/28,52.190.137.16/28,52.190.139.48/28,52.190.142.64/28,52.190.190.16/28,52.225.75.208/28,52.230.163.32/28,52.230.164.176/28,52.231.30.48/28,52.231.34.176/28,52.231.39.144/28,52.231.39.192/28,52.231.49.48/28,52.231.50.64/28,52.236.94.144/28,52.242.132.224/28,52.242.132.240/28,52.242.245.208/28,52.252.113.240/28,52.255.109.112/28,52.255.109.128/28,52.255.109.144/28,52.255.109.80/28,52.255.109.96/28,52.255.111.0/28,52.255.111.112/28,52.255.111.16/28,52.255.111.32/28,5
2.255.111.48/28,52.255.111.80/28,57.151.131.224/28,57.154.174.112/28,57.154.175.0/28,57.154.187.32/28,68.154.28.96/28,68.218.30.112/28,68.220.57.64/28,68.221.67.160/28,68.221.67.192/28,68.221.67.224/28,68.221.67.240/28,68.221.75.16/28,74.226.253.160/28,74.249.86.176/28,74.7.35.112/28,74.7.35.48/28,74.7.36.64/28,74.7.36.80/28,74.7.36.96/28,9.160.163.224/28,9.160.164.128/28,104.210.140.128/28,135.234.64.0/24,172.182.193.224/28,172.182.193.80/28,172.182.194.144/28,172.182.194.32/28,172.182.195.48/28,172.182.209.208/28,172.182.211.192/28,172.182.213.192/28,172.182.224.0/28,172.203.190.128/28,20.14.99.96/28,20.168.18.32/28,20.169.6.224/28,20.169.7.48/28,20.169.77.0/25,20.171.123.64/28,20.171.53.224/28,20.25.151.224/28,20.42.10.176/28,4.227.36.0/25,40.67.175.0/25,40.90.214.16/28,51.8.102.0/24,74.7.175.128/25,74.7.228.0/25,74.7.228.128/25,74.7.229.0/25,74.7.229.128/25,74.7.230.0/25,74.7.241.128/25,74.7.242.128/25,74.7.243.0/25,74.7.244.0/25,160.79.104.0/21,192.178.4.0/27,192.178.4.128/27,192.178.4.160/27,192.178.4.192/27,192.178.4.224/27,192.178.4.32/27,192.178.4.64/27,192.178.4.96/27,192.178.5.0/27,192.178.6.0/27,192.178.6.128/27,192.178.6.160/27,192.178.6.192/27,192.178.6.224/27,192.178.6.32/27,192.178.6.64/27,192.178.6.96/27,192.178.7.0/27,192.178.7.128/27,192.178.7.160/27,192.178.7.192/27,192.178.7.224/27,192.178.7.32/27,192.178.7.64/27,192.178.7.96/27,34.100.182.96/28,34.101.50.144/28,34.118.254.0/28,34.118.66.0/28,34.126.178.96/28,34.146.150.144/28,34.147.110.144/28,34.151.74.144/28,34.152.50.64/28,34.154.114.144/28,34.155.98.32/28,34.165.18.176/28,34.175.160.64/28,34.176.130.16/28,34.22.85.0/27,34.64.82.64/28,34.65.242.112/28,34.80.50.80/28,34.88.194.0/28,34.89.10.80/28,34.89.198.80/28,34.96.162.48/28,35.247.243.240/28,66.249.64.0/27,66.249.64.128/27,66.249.64.160/27,66.249.64.192/27,66.249.64.224/27,66.249.64.32/27,66.249.64.64/27,66.249.64.96/27,66.249.65.0/27,66.249.65.128/27,66.249.65.160/27,66.249.65.192/27,66.249.65.224/27,66.249.65.32/27,66.249.65.64/27,66.24
9.65.96/27,66.249.66.0/27,66.249.66.128/27,66.249.66.160/27,66.249.66.192/27,66.249.66.224/27,66.249.66.32/27,66.249.66.64/27,66.249.66.96/27,66.249.67.0/27,66.249.67.32/27,66.249.67.64/27,66.249.68.0/27,66.249.68.128/27,66.249.68.160/27,66.249.68.192/27,66.249.68.32/27,66.249.68.64/27,66.249.68.96/27,66.249.69.0/27,66.249.69.128/27,66.249.69.160/27,66.249.69.192/27,66.249.69.224/27,66.249.69.32/27,66.249.69.64/27,66.249.69.96/27,66.249.70.0/27,66.249.70.128/27,66.249.70.160/27,66.249.70.192/27,66.249.70.224/27,66.249.70.32/27,66.249.70.64/27,66.249.70.96/27,66.249.71.0/27,66.249.71.128/27,66.249.71.160/27,66.249.71.192/27,66.249.71.224/27,66.249.71.32/27,66.249.71.64/27,66.249.71.96/27,66.249.72.0/27,66.249.72.128/27,66.249.72.160/27,66.249.72.192/27,66.249.72.224/27,66.249.72.32/27,66.249.72.64/27,66.249.73.0/27,66.249.73.128/27,66.249.73.160/27,66.249.73.192/27,66.249.73.224/27,66.249.73.32/27,66.249.73.64/27,66.249.73.96/27,66.249.74.0/27,66.249.74.128/27,66.249.74.160/27,66.249.74.192/27,66.249.74.224/27,66.249.74.32/27,66.249.74.64/27,66.249.74.96/27,66.249.75.0/27,66.249.75.128/27,66.249.75.160/27,66.249.75.192/27,66.249.75.224/27,66.249.75.32/27,66.249.75.64/27,66.249.75.96/27,66.249.76.0/27,66.249.76.128/27,66.249.76.160/27,66.249.76.192/27,66.249.76.224/27,66.249.76.32/27,66.249.76.64/27,66.249.76.96/27,66.249.77.0/27,66.249.77.128/27,66.249.77.160/27,66.249.77.192/27,66.249.77.224/27,66.249.77.32/27,66.249.77.64/27,66.249.77.96/27,66.249.78.0/27,66.249.78.128/27,66.249.78.160/27,66.249.78.32/27,66.249.78.64/27,66.249.78.96/27,66.249.79.0/27,66.249.79.128/27,66.249.79.160/27,66.249.79.192/27,66.249.79.224/27,66.249.79.32/27,66.249.79.64/27,157.55.39.0/24,207.46.13.0/24,40.77.167.0/24,13.66.139.0/24,13.66.144.0/24,52.167.144.0/24,13.67.10.16/28,13.69.66.240/28,13.71.172.224/28,139.217.52.0/28,191.233.204.224/28,20.36.108.32/28,20.43.120.16/28,40.79.131.208/28,40.79.186.176/28,52.231.148.0/28,20.79.107.240/28,51.105.67.0/28,20.125.163.80/28,40.77.188.0/22,65
.55.210.0/24,199.30.24.0/23,40.77.202.0/24,40.77.139.0/25,20.74.197.0/28,20.15.133.160/27,40.77.177.0/24,40.77.178.0/23,17.241.208.160/27,17.241.193.160/27,17.241.200.160/27,17.22.237.0/24,17.22.245.0/24,17.22.253.0/24,17.241.75.0/24,17.241.219.0/24,17.241.227.0/24,17.246.15.0/24,17.246.19.0/24,17.246.23.0/24,107.20.236.150/32,3.224.62.45/32,18.210.92.235/32,3.222.232.239/32,3.211.124.183/32,3.231.139.107/32,18.97.1.228/30,18.97.9.96/29,44.208.221.197/32,34.193.163.52/32,18.97.21.0/30,18.97.43.80/29,130.107.228.224/32,104.43.54.127/32,104.43.55.116/32,104.43.55.117/32,104.43.55.166/32,104.43.55.167/32,108.141.83.74/32,13.89.106.77/32,172.169.17.165/32,191.233.3.197/32,191.233.3.202/32,191.234.216.178/32,191.234.216.4/32,191.235.201.214/32,191.235.202.38/32,191.235.202.48/32,20.113.14.159/32,20.113.3.121/32,20.12.141.99/32,20.185.79.15/32,20.185.79.47/32,20.191.44.119/32,20.191.44.16/32,20.191.44.22/32,20.191.44.234/32,20.191.45.212/32,20.193.12.126/32,20.193.24.10/32,20.193.24.251/32,20.193.25.197/32,20.193.27.215/32,20.193.45.113/32,20.195.108.47/32,20.197.209.11/32,20.197.209.27/32,20.201.15.208/32,20.204.240.172/32,20.204.241.148/32,20.204.242.101/32,20.204.242.19/32,20.204.243.55/32,20.204.246.254/32,20.204.246.81/32,20.207.107.181/32,20.207.72.11/32,20.207.72.110/32,20.207.72.113/32,20.207.72.21/32,20.207.97.190/32,20.207.99.197/32,20.219.43.246/32,20.219.45.190/32,20.219.45.67/32,20.226.133.105/32,20.232.184.230/32,20.3.1.178/32,20.40.133.240/32,20.43.150.85/32,20.43.150.93/32,20.43.172.120/32,20.44.222.1/32,20.49.136.28/32,20.50.168.91/32,20.50.48.159/32,20.50.48.192/32,20.50.49.0/32,20.50.49.237/32,20.50.49.25/32,20.50.49.40/32,20.50.49.55/32,20.50.50.118/32,20.50.50.121/32,20.50.50.123/32,20.50.50.130/32,20.50.50.134/32,20.50.50.145/32,20.50.50.146/32,20.50.50.163/32,20.50.50.46/32,20.53.134.160/32,20.53.78.106/32,20.53.78.123/32,20.53.78.138/32,20.53.78.144/32,20.53.78.236/32,20.53.91.2/32,20.53.92.211/32,20.56.197.58/32,20.56.197.63/32,20.61.34.40/32,20.6
2.224.44/32,20.71.12.143/32,20.72.242.93/32,20.73.132.240/32,20.73.202.147/32,20.75.144.152/32,20.79.226.26/32,20.79.238.198/32,20.79.239.66/32,20.80.129.80/32,20.93.28.24/32,20.99.255.235/32,4.156.136.79/32,4.182.131.108/32,4.195.133.120/32,4.209.224.56/32,4.213.46.14/32,4.228.76.163/32,40.114.182.153/32,40.114.182.172/32,40.114.182.45/32,40.114.183.196/32,40.114.183.251/32,40.114.183.88/32,40.119.232.146/32,40.119.232.215/32,40.119.232.218/32,40.119.232.251/32,40.119.232.50/32,40.64.105.247/32,40.64.106.11/32,40.76.162.191/32,40.76.162.208/32,40.76.162.247/32,40.76.163.23/32,40.76.163.7/32,40.76.173.151/32,40.80.242.63/32,40.81.250.205/32,40.88.21.235/32,40.89.243.175/32,51.104.146.225/32,51.104.146.235/32,51.104.160.167/32,51.104.160.177/32,51.104.161.32/32,51.104.162.149/32,51.104.163.250/32,51.104.164.109/32,51.104.164.147/32,51.104.164.189/32,51.104.164.215/32,51.104.166.111/32,51.104.167.104/32,51.104.167.110/32,51.104.167.19/32,51.104.167.52/32,51.104.167.54/32,51.104.167.61/32,51.104.167.71/32,51.104.167.87/32,51.104.167.88/32,51.104.167.95/32,51.104.167.96/32,51.104.180.26/32,51.104.180.47/32,51.104.180.53/32,51.107.40.209/32,51.116.131.221/32,51.120.48.122/32,51.138.90.161/32,51.138.90.206/32,51.138.90.233/32,51.8.253.152/32,51.8.71.117/32,52.142.24.149/32,52.142.26.175/32,52.143.241.111/32,52.143.242.6/32,52.143.243.117/32,52.143.244.81/32,52.143.247.235/32,52.143.95.162/32,52.143.95.204/32,52.146.58.236/32,52.146.59.12/32,52.146.59.154/32,52.146.59.156/32,52.146.63.80/32,52.148.161.87/32,52.148.165.38/32,52.149.25.43/32,52.149.28.18/32,52.149.28.83/32,52.149.30.45/32,52.149.56.151/32,52.149.58.139/32,52.149.58.173/32,52.149.58.27/32,52.149.58.69/32,52.149.60.38/32,52.149.61.51/32,52.154.169.200/32,52.154.169.50/32,52.154.170.113/32,52.154.170.117/32,52.154.170.122/32,52.154.170.209/32,52.154.170.229/32,52.154.170.243/32,52.154.170.26/32,52.154.170.28/32,52.154.170.88/32,52.154.170.96/32,52.154.171.0/32,52.154.171.150/32,52.154.171.196/32,52.154.171.205/
32,52.154.171.235/32,52.154.171.250/32,52.154.171.44/32,52.154.171.70/32,52.154.171.87/32,52.154.172.2/32,52.154.60.82/32,52.190.37.160/32,52.224.16.221/32,52.224.16.229/32,52.224.19.152/32,52.224.20.174/32,52.224.20.181/32,52.224.20.186/32,52.224.20.190/32,52.224.20.193/32,52.224.20.203/32,52.224.20.204/32,52.224.20.223/32,52.224.20.227/32,52.224.20.249/32,52.224.21.19/32,52.224.21.20/32,52.224.21.23/32,52.224.21.27/32,52.224.21.4/32,52.224.21.49/32,52.224.21.51/32,52.224.21.53/32,52.224.21.55/32,52.224.21.61/32,52.242.224.168/32,57.152.72.128/32,20.8.252.26/32,20.61.142.192/32,20.54.224.39/32,20.13.44.19/32,172.199.55.212/32,132.164.209.198/32,20.166.171.150/32,4.207.220.92/32,68.219.152.220/32,40.127.154.196/32,172.168.115.250/32,172.169.28.184/32,4.150.142.218/32,128.203.132.152/32,64.236.118.43/32,48.217.212.89/32,52.186.37.211/32,52.188.89.106/32,135.234.221.112/32,74.179.232.116/32,4.201.220.8/32,4.201.197.203/32,4.201.206.133/32,4.201.125.59/32,4.201.141.71/32,172.168.53.53/32,172.168.254.119/32,172.168.227.120/32,172.169.60.134/32,4.249.216.104/32,172.169.177.131/32,48.217.23.236/32,4.156.154.107/32,48.217.129.210/32,20.232.51.46/32,4.255.35.121/32,20.253.59.76/32,52.250.46.221/32,172.193.245.229/32,20.112.58.44/32,20.253.96.199/32,40.82.218.203/32,4.237.244.80/32,20.175.232.228/32,4.172.49.103/32,20.250.51.113/32,4.226.40.135/32,20.170.75.54/32,4.182.10.198/32,20.101.17.173/32,20.71.69.210/32,20.216.200.223/32,20.40.147.172/32,20.100.136.36/32,20.100.140.155/32,20.77.146.108/32,20.49.129.236/32,13.86.35.212/32,20.118.11.251/32,18.97.9.168/29,18.97.14.80/29,18.97.14.88/30,98.85.178.216/32,85.208.96.0/24,85.208.97.0/24,85.208.99.0/24,185.170.167.0/24,185.191.171.0/24

IPv6 addresses:

2607:6bc0::/48,2001:4860:4801:10::/64,2001:4860:4801:12::/64,2001:4860:4801:13::/64,2001:4860:4801:14::/64,2001:4860:4801:15::/64,2001:4860:4801:16::/64,2001:4860:4801:17::/64,2001:4860:4801:18::/64,2001:4860:4801:19::/64,2001:4860:4801:1a::/64,2001:4860:4801:1b::/64,2001:4860:4801:1c::/64,2001:4860:4801:1d::/64,2001:4860:4801:1e::/64,2001:4860:4801:1f::/64,2001:4860:4801:20::/64,2001:4860:4801:21::/64,2001:4860:4801:22::/64,2001:4860:4801:23::/64,2001:4860:4801:24::/64,2001:4860:4801:25::/64,2001:4860:4801:26::/64,2001:4860:4801:27::/64,2001:4860:4801:28::/64,2001:4860:4801:29::/64,2001:4860:4801:2::/64,2001:4860:4801:2a::/64,2001:4860:4801:2b::/64,2001:4860:4801:2c::/64,2001:4860:4801:2d::/64,2001:4860:4801:2e::/64,2001:4860:4801:2f::/64,2001:4860:4801:30::/64,2001:4860:4801:31::/64,2001:4860:4801:32::/64,2001:4860:4801:33::/64,2001:4860:4801:34::/64,2001:4860:4801:35::/64,2001:4860:4801:36::/64,2001:4860:4801:37::/64,2001:4860:4801:38::/64,2001:4860:4801:39::/64,2001:4860:4801:3a::/64,2001:4860:4801:3b::/64,2001:4860:4801:3c::/64,2001:4860:4801:3d::/64,2001:4860:4801:3e::/64,2001:4860:4801:3f::/64,2001:4860:4801:40::/64,2001:4860:4801:41::/64,2001:4860:4801:42::/64,2001:4860:4801:44::/64,2001:4860:4801:45::/64,2001:4860:4801:46::/64,2001:4860:4801:47::/64,2001:4860:4801:48::/64,2001:4860:4801:49::/64,2001:4860:4801:4a::/64,2001:4860:4801:4b::/64,2001:4860:4801:4c::/64,2001:4860:4801:4d::/64,2001:4860:4801:4e::/64,2001:4860:4801:50::/64,2001:4860:4801:51::/64,2001:4860:4801:52::/64,2001:4860:4801:53::/64,2001:4860:4801:54::/64,2001:4860:4801:55::/64,2001:4860:4801:56::/64,2001:4860:4801:57::/64,2001:4860:4801:58::/64,2001:4860:4801:59::/64,2001:4860:4801:60::/64,2001:4860:4801:61::/64,2001:4860:4801:62::/64,2001:4860:4801:63::/64,2001:4860:4801:64::/64,2001:4860:4801:65::/64,2001:4860:4801:66::/64,2001:4860:4801:67::/64,2001:4860:4801:68::/64,2001:4860:4801:69::/64,2001:4860:4801:6a::/64,2001:4860:4801:6b::/64,2001:4860:4801:6c::/64,2001:4860:4801:6d::/64,2001:486
0:4801:6e::/64,2001:4860:4801:6f::/64,2001:4860:4801:70::/64,2001:4860:4801:71::/64,2001:4860:4801:72::/64,2001:4860:4801:73::/64,2001:4860:4801:74::/64,2001:4860:4801:75::/64,2001:4860:4801:76::/64,2001:4860:4801:77::/64,2001:4860:4801:78::/64,2001:4860:4801:79::/64,2001:4860:4801:7a::/64,2001:4860:4801:7b::/64,2001:4860:4801:7c::/64,2001:4860:4801:7d::/64,2001:4860:4801:80::/64,2001:4860:4801:81::/64,2001:4860:4801:82::/64,2001:4860:4801:83::/64,2001:4860:4801:84::/64,2001:4860:4801:85::/64,2001:4860:4801:86::/64,2001:4860:4801:87::/64,2001:4860:4801:88::/64,2001:4860:4801:90::/64,2001:4860:4801:91::/64,2001:4860:4801:92::/64,2001:4860:4801:93::/64,2001:4860:4801:94::/64,2001:4860:4801:95::/64,2001:4860:4801:96::/64,2001:4860:4801:97::/64,2001:4860:4801:a0::/64,2001:4860:4801:a1::/64,2001:4860:4801:a2::/64,2001:4860:4801:a3::/64,2001:4860:4801:a4::/64,2001:4860:4801:a5::/64,2001:4860:4801:a6::/64,2001:4860:4801:a7::/64,2001:4860:4801:a8::/64,2001:4860:4801:a9::/64,2001:4860:4801:aa::/64,2001:4860:4801:ab::/64,2001:4860:4801:ac::/64,2001:4860:4801:ad::/64,2001:4860:4801:ae::/64,2001:4860:4801:b0::/64,2001:4860:4801:b1::/64,2001:4860:4801:b2::/64,2001:4860:4801:b3::/64,2001:4860:4801:b4::/64,2001:4860:4801:b5::/64,2001:4860:4801:b6::/64,2001:4860:4801:c::/64,2001:4860:4801:f::/64,2600:1f28:365:80b0::/60
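The lists above contain a few duplicate entries and adjacent ranges, which an nftables interval set may reject as overlapping. The following is a minimal sketch of how such a comma-separated list could be deduplicated, merged, and rendered as a named set. The set name `bad_bots_v4` and the function `build_nft_set` are assumptions, not part of the original setup.

```python
import ipaddress


def build_nft_set(name, cidrs):
    """Render a comma-separated CIDR list as an nftables named-set definition."""
    # strict=False tolerates entries whose host bits are set by normalising them.
    nets = [ipaddress.ip_network(c.strip(), strict=False)
            for c in cidrs.split(",") if c.strip()]
    versions = {n.version for n in nets}
    if len(versions) != 1:
        raise ValueError("expected a non-empty list of one address family")
    # collapse_addresses drops duplicates and merges adjacent or nested ranges.
    merged = ipaddress.collapse_addresses(nets)
    addr_type = "ipv4_addr" if versions == {4} else "ipv6_addr"
    elements = ", ".join(str(n) for n in merged)
    return (
        f"set {name} {{\n"
        f"    type {addr_type}\n"
        f"    flags interval\n"
        f"    elements = {{ {elements} }}\n"
        f"}}"
    )
```

The resulting block would go inside a table definition, and a rule such as `ip saddr @bad_bots_v4 drop` in an input chain would then reference it. The IPv6 list needs its own set and an `ip6 saddr` rule.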

In addition, I recommend another list from blocklist.de.

#Friendica, #Server, #Bot, #Crawler, #Scraper, #SEMrush, #nftables
www.blocklist.de -- Fail2Ban-Reporting Service (we sent Reports from Attacks on Postfix, SSH, Apache-Attacks, Spambots, irc-Bots, Reg-Bots, DDos and more) from Fail2Ban via X-ARF.

#Cloudflare 09:00: I will protect you from all these scrapers smoking your cpu and webserver bandwidth.

Takes a lunch 🥪

Cloudflare 13:00: Hey AI-model hot boys… I have a 1 click #scraper for you to fill up that model!

https://developers.cloudflare.com/changelog/post/2026-03-10-br-crawl-endpoint/

Crawl entire websites with a single API call using Browser Rendering

Browser Rendering's new /crawl endpoint lets you submit a starting URL and automatically discover, render, and return content from an entire website as HTML, Markdown, or structured JSON.

Cloudflare Docs

New rule: Every time I notice an overnight outage with my website, a new scraper gets added to my robots.txt file.

Welcome to the list, "Amzn-SearchBot".
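A small sketch of what that rule could look like in practice, using Python's standard `urllib.robotparser` to confirm that the new entry has the intended effect. The helper names `add_blocked_agent` and `is_allowed` are made up for this example, and the approach only affects crawlers that actually honor robots.txt.

```python
from urllib.robotparser import RobotFileParser


def add_blocked_agent(robots_txt, agent):
    """Append a group that disallows the whole site for `agent`."""
    block = f"User-agent: {agent}\nDisallow: /\n"
    if not robots_txt.strip():
        return block
    # A blank line separates the new group from the existing ones.
    return robots_txt.rstrip("\n") + "\n\n" + block


def is_allowed(robots_txt, agent, path="/"):
    """Check whether `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)
```

For example, calling `add_blocked_agent(current_text, "Amzn-SearchBot")` and then `is_allowed(new_text, "Amzn-SearchBot")` should report that the crawler is no longer permitted, while other user agents stay unaffected.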

#Amazon #Scraper #AIScraper