It had to happen eventually. My AI crawler antagoniser, https://www.ty-penguin.org.uk/~auj/spigot/, has been seeing sustained traffic of between 300 and 500 thousand hits per hour. I've not been particularly bothered by that, but a couple of days ago my provider, @bitfolk, sent me a bandwidth warning: I'm on track to hit 2 TB of outbound bandwidth this month and end up paying for the excess.

So I've added firewalling: if more than 5% of the machines in a /23 network hit spigot within an hour, the entire network gets a temporary block that lasts until it completely stops hitting my server. Hopefully that will cut things back enough to avoid charges.
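
For anyone wanting to try something similar, here is a rough sketch of that rule in Python. It is not the actual firewall script: the function names, the `(timestamp, ip)` input format and the one-hour quiet period used to lift a block are all assumptions made for illustration. Only the /23 grouping, the 5% threshold and the one-hour window come from the description above.

```python
import ipaddress
from collections import defaultdict

PREFIX_LEN = 23    # group clients into /23 networks (512 addresses each)
THRESHOLD = 0.05   # block once more than 5% of a network's addresses are seen
WINDOW = 3600      # look-back window in seconds (one hour)


def network_of(ip):
    """Return the /23 network that contains the given IPv4 address."""
    return ipaddress.ip_network(f"{ip}/{PREFIX_LEN}", strict=False)


def networks_to_block(hits, now):
    """Pick the networks to block.

    `hits` is an iterable of (timestamp, ip) pairs, a hypothetical
    format that would be parsed out of the web server log.
    """
    seen = defaultdict(set)
    for ts, ip in hits:
        if now - WINDOW < ts <= now:
            seen[network_of(ip)].add(ip)
    return {
        net
        for net, ips in seen.items()
        if len(ips) / net.num_addresses > THRESHOLD
    }


def can_unblock(last_hit, now, quiet=WINDOW):
    """Assumed release rule: lift the block after `quiet` seconds of silence."""
    return now - last_hit >= quiet
```

With 512 addresses in a /23, the 5% threshold works out to 26 distinct addresses within the hour. The returned networks would then be handed to whatever firewall is in use.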

The thing that amazes me is that the list has already accumulated nearly 10,000 entries. Put another way, I'm already blocking 0.12% of the whole IPv4 address space because it's being used for web crawling.
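
The 0.12% figure can be checked with a few lines of Python, assuming every entry in the list is a full /23 and rounding the count up to 10,000:

```python
entries = 10_000                        # "nearly 10,000" blocked networks
addresses_per_entry = 2 ** (32 - 23)    # a /23 holds 512 addresses
blocked = entries * addresses_per_entry
share = blocked / 2 ** 32               # fraction of the 32-bit IPv4 space

print(f"{blocked:,} addresses, {share:.2%} of IPv4")
# prints "5,120,000 addresses, 0.12% of IPv4"
```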

An infinite maze of twisty little pages

Well, that seems to have annoyed them.

Over the past few hours, request rates have been ramped up to nearly 900,000 hits per hour from nearly 700,000 distinct IP addresses. This is not including the many thousands that are firewalled off, but still trying their best. I'm turning page generation off for a bit while I ponder what to do next.

@pengfold tarpitting?
@avatastic already doing that for the worst offenders.