Breaking news: Search engine crawlers are taking longer coffee breaks than your average office worker! Apparently, shrinking memory means faster crawling... except when it doesn't, because the last 0.1% is on vacation for a week!
Clearly, Pareto forgot to distribute common sense!
https://www.marginalia.nu/log/a_117_crawl_order/ #SearchEngines #CoffeeBreaks #TechHumor #CrawlingChallenges #ParetoPrinciple #MemoryManagement #HackerNews #ngated
Crawl Order and Disorder
A problem the search engine's crawler has struggled with for some time is that it takes a fairly long time to finish up, usually spending several days wrapping up the final few domains. This has become more pressing recently, since the migration to slop crawl data has cut the crawler's memory requirements by something like 80%. As a result, I've been able to increase the number of crawling tasks, which has led to a bizarre case where 99.