Giant Corporations™ are scraping my little git server to feed their ever-hungry, planet-destroying plagiarism machines.

So now, instead of getting my code, they get a 10GB treat.

Fucking THIEVES.

edit: This was inspired-by-and-based-on this post https://rknight.me/blog/blocking-bots-with-nginx/

Blocking Bots with Nginx

How I've automated updating the bot list to block access to my site
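
A minimal sketch of the idea in Python, for illustration only. This is not the actual nginx setup from this post or from the linked one. The marker list, the decoy path, and the `route_request` helper are all hypothetical names made up for this example; the logic just assumes that bots are recognized by a substring in their User-Agent header and are pointed at a big decoy file instead of the real content.

```python
# Illustrative User-Agent substrings; an assumption, not the real blocklist.
BOT_MARKERS = ("GPTBot", "CCBot", "ClaudeBot")

# Hypothetical path to the large decoy file served to matching bots.
DECOY_PATH = "/decoy-10GB.bin"


def route_request(user_agent: str, requested_path: str) -> str:
    """Return the path to serve: the decoy for listed bots, else the real one."""
    ua = user_agent.lower()
    # Case-insensitive substring match against every known marker.
    if any(marker.lower() in ua for marker in BOT_MARKERS):
        return DECOY_PATH
    return requested_path
```

Substring matching only catches bots that identify themselves honestly, so a scraper sending a browser User-Agent would still get the real content.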

@j Another good trick is when they try to request images, to feed them poisoned images that hurt their dataset lol

A 10GB file will likely never be downloaded in full, but a poisoned image will very likely make it into the model, ruining their efforts ;)