It had to happen, eventually. My AI crawler antagoniser, https://www.ty-penguin.org.uk/~auj/spigot/, has been seeing sustained traffic of between 300,000 and 500,000 hits per hour. I've not been particularly bothered by that, but a couple of days ago my provider, @bitfolk, sent me a bandwidth warning: I'm on track to hit 2 TB of outbound bandwidth this month and end up paying for the excess.
So I've added firewalling: if more than 5% of machines in a /23 network hit spigot within an hour, the entire network gets a temporary block, which stays in place until that network completely stops hitting my server. Hopefully that will cut things back enough to avoid charges.
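
A rough sketch of that rule in Python, for illustration only. It is not the actual firewall setup: the function name `networks_to_block` and the idea of feeding it one hour's worth of client addresses are assumptions. A /23 holds 512 addresses, so "more than 5%" works out to at least 26 distinct addresses.

```python
import ipaddress
from collections import defaultdict

PREFIX_LEN = 23    # network size named in the post
THRESHOLD = 0.05   # "more than 5% of machines"


def networks_to_block(client_ips):
    """Return the /23 networks where more than 5% of addresses appear.

    client_ips is assumed to be the IPv4 addresses (as strings) that hit
    the server during one hour. Repeat hits from one address count once.
    """
    seen = defaultdict(set)
    for ip in client_ips:
        # strict=False masks off the host bits to get the enclosing /23.
        net = ipaddress.ip_network(f"{ip}/{PREFIX_LEN}", strict=False)
        seen[net].add(ip)
    return sorted(
        net for net, addrs in seen.items()
        if len(addrs) > THRESHOLD * net.num_addresses
    )
```

Expiring a block once a network goes quiet would need a timestamp per network, which this sketch leaves out.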
The thing that amazes me is that the block list has already accumulated nearly 10,000 entries. Put another way, I'm already blocking 0.12% of the whole IPv4 address space because it's being used for web crawling.
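
The 0.12% figure can be checked with a few lines of Python, assuming every entry on the list is a full /23 and rounding the count up to 10,000:

```python
ENTRIES = 10_000                          # "nearly 10,000", rounded up
ADDRESSES_PER_SLASH_23 = 2 ** (32 - 23)   # 512 addresses in a /23
IPV4_SPACE = 2 ** 32                      # every possible IPv4 address

blocked = ENTRIES * ADDRESSES_PER_SLASH_23
share = blocked / IPV4_SPACE
print(f"{blocked:,} addresses, {share:.2%} of IPv4")
# prints "5,120,000 addresses, 0.12% of IPv4"
```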
