@announcements

Blocking Aggressive Crawlers

I just updated the blocklist for the Mastodon and BookWyrm instances. The logs showed meta-webindexer stuck in ridiculous infinite /rss/rss/ request loops, a total waste of server resources that was pushing BookWyrm to 100% CPU utilization.

Additional Bans:

Meta-Webindexer/ExternalAgent: Recursive scraping gone rogue.

ClaudeBot: Keeping local content out of AI training sets.

Semrush & SERanking: Commercial SEO bots have no business here.

If you’re self-hosting and notice weird CPU spikes or odd path patterns, I highly recommend auditing the User-Agent strings in your access logs. It’s an easy way to protect your performance and your users' privacy. 🛠️
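For anyone who wants a starting point, a User-Agent blocklist along these lines works in Nginx. This is a minimal sketch, not the exact config from this instance: the regex patterns are illustrative substrings for the bots named above, and 444 is one choice of response (a plain 403 works too).

```nginx
# Place this map in the http {} context: it flags requests whose
# User-Agent matches any of the listed substrings (case-insensitive).
map $http_user_agent $blocked_agent {
    default                 0;
    ~*meta-externalagent    1;
    ~*ClaudeBot             1;
    ~*Semrush               1;
    ~*SERanking             1;
}

server {
    # ... existing listen / server_name / location config ...

    if ($blocked_agent) {
        # 444 is Nginx-specific: close the connection without sending
        # any response, which wastes as little of your resources as possible.
        return 444;
    }
}
```

After reloading, `grep` your access log for the blocked agents to confirm they are now getting cut off instead of crawling.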

#SelfHosted #MastodonAdmin #BookWyrm #Nginx #Privacy #SysAdmin

@Moritz @announcements Single-user instance here had the same bots hammering away, so I added a redirect to my AI tarpit server running https://iocaine.madhouse-project.org/
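If you want to try the same tarpit approach in Nginx, a sketch could look like the following. Assumptions here: a `$blocked_agent` map like the one used for blocklisting, iocaine listening on its default port 42069 on localhost, and a hypothetical internal `/tarpit` path.

```nginx
server {
    # ... existing listen / server_name config ...

    if ($blocked_agent) {
        # Instead of refusing the request, hand it to the tarpit.
        rewrite ^ /tarpit last;
    }

    location /tarpit {
        internal;  # not reachable directly, only via the rewrite above
        # 42069 is iocaine's default bind port; adjust to your deployment.
        proxy_pass http://127.0.0.1:42069;
    }
}
```

The nice property of a tarpit over a 4xx is that the crawler keeps burning its own time on generated garbage instead of retrying your real endpoints.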

@anne

Cool, I need to try that, thanks for sharing! I've noticed the crawlers already backing off after seeing 4xx responses for a few minutes.