What say we run 'file' and #siegfried against #ApacheTika's 600k 'application/octet-stream's in the most recent #CommonCrawl crawl?
Anyone else want to join in the fun?