#Development #Obituaries
Farewell to robots.txt (1994-2025) · “You were too good for this world.” https://ilo.im/167q2b
_____
#SearchEngine #InternetArchive #AI #Content #Website #RobotsTxt #RFC9309 #WebDev #Backend
I’ve made a little something, so I thought I'd share.
Gort is a robots.txt parser and evaluator. It implements RFC 9309.
More details in the ReadMe: https://github.com/pointlessone/gort
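For anyone wondering what "evaluating" a robots.txt involves, here is a rough Python sketch of the core RFC 9309 rule: the longest matching path wins, and Allow wins a tie. This is not Gort's API (see its ReadMe for that). The `parse` and `is_allowed` helpers and the `ExampleBot` name are made up for illustration, and the sketch skips the `*` and `$` wildcards and matches user-agent tokens only by exact name.

```python
def parse(text):
    """Parse robots.txt into {user-agent token: [(is_allow, path), ...]}."""
    groups, agents, in_rules = {}, [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:                      # a user-agent after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif key in ("allow", "disallow"):
            in_rules = True
            if value:                         # an empty pattern matches nothing
                for agent in agents:
                    groups[agent].append((key == "allow", value))
    return groups


def is_allowed(groups, agent, path):
    """Longest matching rule wins; Allow wins ties; no match means allowed."""
    rules = groups.get(agent.lower(), groups.get("*", []))
    best_len, allowed = -1, True
    for is_allow, prefix in rules:
        if path.startswith(prefix):
            if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
                best_len, allowed = len(prefix), is_allow
    return allowed


ROBOTS = """
User-agent: *
Disallow: /private/
Allow: /private/press/
"""

groups = parse(ROBOTS)
for path in ("/index.html", "/private/notes.txt", "/private/press/kit.html"):
    print(path, is_allowed(groups, "ExampleBot", path))
```

The last path is allowed even though it sits under `/private/`, because `/private/press/` is the longer match.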
Setting up /robots.txt, not because it helps, but because being crabby in compliance with an RFC is satisfying.
Who knows of other unsavory user agents to block besides ChatGPT and Twitterbot?
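A minimal sketch of what such a file could look like, checked with Python's standard `urllib.robotparser`. Assumptions: `GPTBot` and `ChatGPT-User` stand in for "ChatGPT" here since those are the tokens OpenAI publishes for its crawlers, and `ExampleBot` is a hypothetical well-behaved crawler. Compliance is voluntary, so this only stops bots that choose to read the file.

```python
from urllib.robotparser import RobotFileParser

# One group can list several user agents that share the same rules.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: Twitterbot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The three named bots are refused everywhere; everyone else is allowed.
for agent in ("GPTBot", "ChatGPT-User", "Twitterbot", "ExampleBot"):
    print(agent, parser.can_fetch(agent, "/index.html"))
```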
crawler-commons/crawler-commons on GitHub: a set of reusable Java components that implement functionality common to any web crawler.