The Internet Archive has decided to ignore robots.txt
https://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/
This is amazing news for internet history.
a) All public stuff will be crawled. Don't want that? Don't make your shit public.
b) Lapsed domains that get replaced by parking pages with a restrictive robots.txt will no longer hide the old, dead versions of those sites.
Archive everything. #NewLibraryofAlexandria #OCD
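For anyone wondering how point b) plays out, here is a minimal sketch using Python's standard `urllib.robotparser`. The two robots.txt bodies, the example.com URL, and the `is_allowed` helper are made up for illustration, and `ia_archiver` is assumed to be the user agent the archive crawler identified itself with. It only shows how a blanket `Disallow: /` on a parked domain reads to a crawler that honors robots.txt; it is not the Internet Archive's actual logic.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-respecting crawler may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt of the original, living site: everything allowed.
ORIGINAL_ROBOTS = "User-agent: *\nAllow: /\n"

# Hypothetical robots.txt served by the parking page after the domain lapsed.
PARKING_ROBOTS = "User-agent: *\nDisallow: /\n"

# Assumed crawler name and illustrative URL.
AGENT = "ia_archiver"
URL = "http://example.com/old-page"

if __name__ == "__main__":
    print("original site:", is_allowed(ORIGINAL_ROBOTS, AGENT, URL))
    print("parked domain:", is_allowed(PARKING_ROBOTS, AGENT, URL))
```

An archive that applies the current robots.txt to its old snapshots would hide `old-page` as soon as the parking page goes up, even though the original owner never asked for that.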
"This year especially there’s an uncomfortable feeling in the tech industry that we did something wrong, that in following our credo of “move fast and break things”, some of what we knocked down were the load-bearing walls of our democracy."
If you care one bit about what tomorrow's internet is going to look like, please go and read Maciej Ceglowski's great talk about how we can "Build a better monster":
The irony of #Firefox crashing right as I was reading this article, which claims they reduced crashes caused by graphics glitches by 10%.
https://blog.mozilla.org/blog/2017/04/19/first-big-bytes-project-quantum/