@valarauca1 uhhh... i've only used kafka as a message passing system. i guess you could use it as a relational database? it's pretty neat though, definitely heard of it used under high throughput
@iximeow okay, that's one scheme. There are a few others. It's actually only some ~10GiB of data. I'm thinking I just need to hold it all in RAM and write my own analysis tool.
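(a minimal sketch of the "hold it all in RAM" approach — ~10GiB fits comfortably in memory on a 16–32GiB box; the paths and helper names here are hypothetical, not from the thread)

```python
import os

def load_corpus(root):
    """Read every file under `root` into a dict keyed by path.
    ~10GiB of raw bytes fits in RAM on a typical workstation."""
    corpus = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                corpus[path] = f.read()
    return corpus

def grep(corpus, needle):
    """Return the paths whose contents contain `needle` (bytes).
    A linear scan over in-memory bytes, no index needed at this scale."""
    return [p for p, data in corpus.items() if needle in data]
```

once the corpus is loaded, repeated scans are memory-bandwidth-bound instead of disk-bound, which is the whole point of the scheme.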
@iximeow grepping across all file contents in BigQuery is only ~3TiB, so <$40. They give you a data-size estimate when you type in a valid query, which you can throw into the cost calculator.
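(back-of-envelope version of that estimate — the per-TiB rate below is an assumption, roughly BigQuery's historical on-demand price; check current pricing before trusting the number)

```python
def on_demand_cost_usd(bytes_scanned, usd_per_tib=5.0):
    """Estimate BigQuery on-demand query cost from bytes scanned.
    usd_per_tib is an assumed rate, not an official figure."""
    return bytes_scanned / 2**40 * usd_per_tib

# ~3 TiB scanned, as in the thread: well under the <$40 ceiling
cost = on_demand_cost_usd(3 * 2**40)
```

the dry-run size estimate times the per-TiB rate is all the cost calculator is doing, so this one-liner reproduces it.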
@iximeow BQ charges for data scanned, not for in-place filtering/joining/hashing. But its computation is STUPID SLOW. Doing a 1000-item cross join will take >20 hours.