A little .htaccess kung-fu later and, if everything works, GPTBot is now blocked.
Put Amazonbot on the list as well, since it was creeping around my website.

#webmaster #aibots #aibullshit #openai #amazonbot

I had to write a report to #Amazon today. A server was getting a large number of unrelated requests and Amazonbot became the top user-agent with a collection of odd requests. Nope, it isn't a kickball or gardening site. Nope, we don't have an article about buying your first home. Nope, it doesn't have files from the https://github.com/abpframework/abp/ repo. I wrote a fail2ban filter for that last one which caught 426 IPs. #Amazonbot seems completely broken from here and I added it to the robots.txt file.
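A rough Python sketch of the same idea as the fail2ban filter mentioned above: scan an access log for Amazonbot requests to paths that should not exist on the site and collect the client IPs. The actual filter is not shown in the post, so the "combined" log format, the function name `offending_ips`, the sample log lines, and the `/framework/src/` path pattern are all assumptions for illustration.

```python
import re

# Assumes the Apache/nginx "combined" log format; adjust for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def offending_ips(lines, path_pattern, agent_substring="Amazonbot"):
    """Return the set of client IPs whose requests match both the path
    pattern and the user-agent substring (case-insensitive)."""
    path_re = re.compile(path_pattern)
    ips = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not parse
        if agent_substring.lower() not in m.group("agent").lower():
            continue
        if path_re.search(m.group("path")):
            ips.add(m.group("ip"))
    return ips


# Hypothetical sample lines; paths and IPs are made up for illustration.
sample = [
    '203.0.113.5 - - [01/Jan/2024:00:00:01 +0000] "GET /framework/src/File.cs HTTP/1.1" '
    '404 153 "-" "Mozilla/5.0 (compatible; Amazonbot/0.1)"',
    '203.0.113.9 - - [01/Jan/2024:00:00:02 +0000] "GET /index.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Amazonbot/0.1)"',
    '198.51.100.7 - - [01/Jan/2024:00:00:03 +0000] "GET /framework/src/Other.cs HTTP/1.1" '
    '404 153 "-" "Mozilla/5.0 Firefox/120.0"',
]

print(sorted(offending_ips(sample, r"^/framework/src/")))
```

Only the first sample line matches both conditions. The resulting set could then be fed to whatever does the banning, such as a firewall rule or a deny list.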

I've had a reply from Amazon about this, and it's clearly from someone who didn't read my report. They mention that Amazonbot abides by robots.txt, which it does indeed, but blocking Amazonbot via robots.txt is a blunt tool.
It would of course be more sensible if their crawler didn't dream up URL parameters that don't exist and have never been announced. #AmazonBot
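To illustrate why robots.txt is a blunt tool here, a small Python sketch using the standard library's `urllib.robotparser`. The robots.txt content and the `example.org` URLs are hypothetical (the real file is not shown in the post), and `is_allowed` is just a helper name for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the post does not show the real file.
ROBOTS_TXT = """\
User-agent: Amazonbot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Check whether user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Real page and invented URL parameters are treated the same: all or nothing.
print(is_allowed(ROBOTS_TXT, "Amazonbot", "https://example.org/events/"))
print(is_allowed(ROBOTS_TXT, "Amazonbot", "https://example.org/events/?made_up=1"))
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.org/events/"))
```

The rule shuts the crawler out of the legitimate pages too, and it still relies on the crawler choosing to honor it.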

Today in web application hosting land, I find that Amazon's Amazonbot, which is crawling to train the LLM that is Alexa, seems to be using AI spew to generate potential crawl target URLs, rather than behaving like a proper crawler and only following actual links that have been offered.

It's been hammering a WordPress events calendar site with concurrent requests from multiple networks for events in the year 2271.

#AmazonAlexa #Amazonbot #webdev
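One way to spot requests like the year-2271 ones is to flag calendar URLs whose year lies implausibly far in the future. A minimal Python sketch, assuming the year shows up as a standalone 4-digit number in the path or query string. The real URLs are not shown in the post, so the example URLs, the function name, and the 5-year horizon are all hypothetical.

```python
import re
from datetime import date

# Assumption: the year is a standalone 4-digit number somewhere in the URL,
# e.g. /events/2271-05/ or ?eventDate=2271-05-01.
YEAR = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def is_implausible_calendar_url(url, today=None, max_years_ahead=5):
    """Return True if any 4-digit number in the URL is a year further
    in the future than today's year plus max_years_ahead."""
    today = today or date.today()
    limit = today.year + max_years_ahead
    return any(int(year) > limit for year in YEAR.findall(url))


fixed_today = date(2024, 1, 1)  # fixed date so the example is reproducible
print(is_implausible_calendar_url("/events/2271-05/", today=fixed_today))
print(is_implausible_calendar_url("/events/?eventDate=2271-05-01", today=fixed_today))
print(is_implausible_calendar_url("/events/2024-05/", today=fixed_today))
```

Note that any other 4-digit number in a URL (a post ID, for instance) would also trip this check, so it only makes sense scoped to the calendar paths.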

Had to block #claudebot and #Amazonbot from one of my #phpbb forums; they were causing serious load.

I don’t add web crawlers to my shitlist often, but #amazonbot just made the cut. It has been issuing 1 request/s to pages behind “nofollow” for quite a few minutes. Based on the about page for the bot, #Amazon didn’t bother implementing robots.txt rate limiting or honoring nofollow in a meta tag. 🤦‍♂️

From now on, it’ll get 403s.

https://developer.amazon.com/support/amazonbot
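For anyone wanting to serve 403s at the application level rather than in the web server config, here is a minimal Python WSGI sketch. It is not the setup used in the post above; `block_bots`, `hello_app`, and the substring list are hypothetical names and choices for illustration.

```python
# Substrings matched case-insensitively against the User-Agent header.
BLOCKED_AGENTS = ("amazonbot", "gptbot", "claudebot")


def block_bots(app, blocked=BLOCKED_AGENTS):
    """Wrap a WSGI app so that requests from listed user agents get a 403."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in blocked):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else is passed through to the wrapped application.
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Stand-in application used only to demonstrate the wrapper."""
    body = b"hello\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


application = block_bots(hello_app)
```

User-agent matching only stops bots that identify themselves honestly; anything spoofing a browser string needs IP- or ASN-level blocking instead.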

Latest technical news:

#modoboa 2.2.0 has been running on the email hosting since yesterday, with many changes. Send-only emails are coming soon. Unfortunately, one particular bug that has been bothering us remains.

#searx has been removed completely. There was no time for maintenance, and the Debian package isn't being updated as it should be. Its users were mostly bots running targeted queries to manipulate the real search engines (warez, etc.).

#ai #bots + #amazonbot have been blocked completely from our web hosting machines.

#Amazonbot. Really aggressive. I do wonder about the value, tbh.

I have asked Alexa questions and had the site returned which was nice, but outside of that, where’s the click? Where’s the sign up? Where’s the sale? Where’s the ad impression?

ASN blocked.