Wondered why our latest podcast episode didn’t show up on https://workingdraft.de this morning. In our headless WP we preschedule releases, and @11ty builds the front-facing site daily. Turns out an AI bot broke the build: our log-parsing stats step choked on its UA string:

Mozilla/5.0 (compatible; Thinkbot/0.5.8; +In_the_test_phase,_if_the_Thinkbot_brings_you_trouble,_please_block_its_IP_address._Thank_you.)

"if_the_Thinkbot_brings_you_trouble" 🖕

Working Draft

Weekly podcast for frontend devs, design engineers, and web developers
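
The post doesn't say how the stats step actually failed, so the following is only a sketch of one way to keep a log-parsing step from taking the whole build down: parse the UA defensively and fall back to a placeholder instead of raising. The regex, the function names, and the fallback labels are all assumptions for illustration, not the Working Draft code.

```python
import re
from typing import Optional, Tuple

# The UA string quoted in the post above.
THINKBOT_UA = (
    "Mozilla/5.0 (compatible; Thinkbot/0.5.8; "
    "+In_the_test_phase,_if_the_Thinkbot_brings_you_trouble,"
    "_please_block_its_IP_address._Thank_you.)"
)

# Hypothetical pattern for the common "(compatible; Name/1.2.3; ...)" bot convention.
BOT_PATTERN = re.compile(
    r"\(compatible;\s*(?P<name>[^/;\s]+)/(?P<version>[^;\s)]+)"
)


def parse_bot(ua: str) -> Optional[Tuple[str, str]]:
    """Return (name, version) for a bot-style UA, or None if it doesn't match."""
    match = BOT_PATTERN.search(ua)
    if match is None:
        return None
    return match.group("name"), match.group("version")


def safe_parse(ua: object) -> Tuple[str, str]:
    """Never raise: unknown or malformed input becomes a placeholder entry."""
    try:
        return parse_bot(ua) or ("unknown", "")  # type: ignore[arg-type]
    except Exception:
        # Assumption: for stats, a skipped entry is better than a failed build.
        return ("unparseable", "")
```

With this shape, a weird UA costs one odd row in the stats rather than a missed episode release.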

@Schepp @11ty 🤦‍♂️
@heydon @Schepp @11ty haha this bot also went straight into my honeypot*… repeatedly.
* a directory on my website that is only mentioned in the robots.txt with a disallow and not linked anywhere.
so this motherboardfucker (excuse my french) is actually looking in the robots.txt but then sees a disallow as an invite
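
For comparison, here is roughly what a well-behaved crawler would do with that kind of disallow rule, sketched with Python's standard `urllib.robotparser`. The `/trap/` path and the `Thinkbot` agent name in the default argument are placeholders; the real honeypot path isn't given in the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the honeypot directory is disallowed for everyone
# and is not linked from anywhere else on the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def is_allowed(path: str, user_agent: str = "Thinkbot") -> bool:
    """Answer the question a polite crawler asks before fetching a path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)
```

Since the directory only ever appears in robots.txt, any client requesting it has read the file and ignored the rule, which is what makes it usable as a honeypot.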
@webrocker @heydon @Schepp @11ty I had an interesting AI-encounter the other day about which rules AIs obey. Maybe I need to write a blog article about it…
@MoritzGlantz @heydon @Schepp @11ty Inspired by this incident I have now completed my feeble defense against those bots that visit my hidden directory. Their IP is saved in a NoSQL store, and the single entry point to my website checks the current visitor's IP against that store and returns a 403 if the IP matches. I successfully locked myself out of my website by visiting my hidden dir. Yay.
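
A minimal sketch of that defense, assuming an in-memory set as a stand-in for the NoSQL store and a made-up `/trap/` prefix for the hidden directory. The class and method names are hypothetical, and returning a 403 for the honeypot hit itself is a guess; the post only describes the 403 for already-listed IPs.

```python
from typing import Set

HONEYPOT_PREFIX = "/trap/"  # hypothetical path of the hidden directory


class Gatekeeper:
    """Single entry point: records honeypot visitors and blocks them afterwards."""

    def __init__(self) -> None:
        # Stand-in for the NoSQL store mentioned in the post.
        self.blocked: Set[str] = set()

    def handle(self, ip: str, path: str) -> int:
        """Return the HTTP status code this request would get."""
        if ip in self.blocked:
            return 403
        if path.startswith(HONEYPOT_PREFIX):
            # First visit to the honeypot: remember the IP, deny the request.
            self.blocked.add(ip)
            return 403
        return 200
```

The self-lockout from the post falls out directly: one visit to the hidden directory from your own IP and every later request gets a 403, so an allowlist or an expiry on entries may be worth adding.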

@webrocker @MoritzGlantz @heydon @Schepp @11ty Why not send them a multigigabyte file to crawl? Or pollute them with Heydon's script?

@Lippe @webrocker @heydon @Schepp @11ty A zip bomb maybe? 🤔

@MoritzGlantz @webrocker @heydon @Schepp @11ty Thought of this, but will they crawl zip files at all?

@Lippe @MoritzGlantz @heydon @Schepp @11ty well, if I don't want them to waste resources on my site, how would serving a gazillion bytes of data help?
@Lippe @webrocker @heydon @Schepp @11ty Fight fire with fire! ✊