The Internet Archive is moving to ignore robots.txt:
https://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/
This is amazing news for internet history.
a) All public stuff will be crawled. Don't want that? Don't make your shit public.
b) Lapsed domains replaced with parking pages that serve a restrictive robots.txt will no longer hide the old, dead versions of those sites.
Archive everything. #NewLibraryofAlexandria #OCD