One question for #webarchiving experts: I've identified 30 WARCs in our collections that contain the highest number of records declared as a certain MIME type I'm interested in.
I've requested the extraction of these AIPs, and hopefully I'll soon have access to the 30 gzipped WARCs. Then I'll have to extract from them the records that declare this MIME type in the HTTP header, and turn them into stand-alone files. How would you perform these operations?
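
One possible approach, sketched in Python with only the standard library: read each gzipped WARC record by record, keep the `response` records whose HTTP `Content-Type` matches the wanted MIME type, and write each HTTP body to its own file. The function names, the output naming scheme, and the `.bin` suffix are hypothetical choices made for illustration. The sketch assumes well-formed records and does not undo chunked `Transfer-Encoding` or HTTP `Content-Encoding`; a dedicated WARC library such as warcio would likely be more robust on real crawl data.

```python
import gzip
from pathlib import Path
from typing import BinaryIO, Iterator, Tuple


def iter_warc_records(stream: BinaryIO) -> Iterator[Tuple[dict, bytes]]:
    """Yield (warc_headers, block) for each record of an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # blank lines that separate records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"expected a WARC version line, got {line[:40]!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # end of the WARC header section
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        # The record block is exactly Content-Length bytes long.
        yield headers, stream.read(int(headers["content-length"]))


def split_http(block: bytes) -> Tuple[dict, bytes]:
    """Split an HTTP response block into (lower-cased headers, body)."""
    head, _, body = block.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.decode("latin-1").partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


def extract_payloads(stream: BinaryIO, wanted_mime: str) -> Iterator[Tuple[str, bytes]]:
    """Yield (target URI, HTTP body) for responses declared as wanted_mime."""
    for warc_headers, block in iter_warc_records(stream):
        if warc_headers.get("warc-type") != "response":
            continue
        http_headers, body = split_http(block)
        # Ignore parameters such as "; charset=utf-8" when comparing.
        mime = http_headers.get("content-type", "").split(";")[0].strip().lower()
        if mime == wanted_mime.lower():
            yield warc_headers.get("warc-target-uri", ""), body


def dump_matching(warc_path: str, wanted_mime: str, out_dir: str, suffix: str = ".bin") -> int:
    """Write every matching payload of one gzipped WARC to out_dir; return the count."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    # gzip.open reads through concatenated gzip members, which is how
    # per-record-compressed WARCs are usually laid out.
    with gzip.open(warc_path, "rb") as stream:
        for _uri, body in extract_payloads(stream, wanted_mime):
            (out / f"{Path(warc_path).name}-{count:06d}{suffix}").write_bytes(body)
            count += 1
    return count
```

Calling `dump_matching` once per WARC, with the MIME type of interest and a target directory, would then cover all 30 files. Keeping the target URI next to each output file (for example in a small CSV) may help trace a payload back to its record.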