one of the reasons it's useful that git-pages is so efficient storage-wise is that i don't actually need to worry about what people are uploading. no reasonable-seeming use of the website generates problematic amounts of resource use, so i don't have to keep an eye on it. it's Fine.
someone uploaded two copies of a site containing Every Manpage, which is half a gigabyte (each)? it only costs 36 MB to store it (total). it's Fine.
@whitequark is this compression or just file-level deduplication (or maybe block-level?)
@solonovamax compression and file-level deduplication across the whole service instance
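file-level deduplication across a whole instance is commonly done with a content-addressed store: each file is keyed by a hash of its bytes, so a second upload of an identical file costs nothing extra. a minimal sketch of the idea (`BlobStore` and its methods are made up for illustration, this is not git-pages' actual code, and compression is omitted):

```python
import hashlib

class BlobStore:
    """Toy content-addressed store: identical files share one copy."""

    def __init__(self):
        self.blobs = {}  # digest -> raw bytes (real stores would compress)

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # identical content hashes to the same key, so storing the
        # same file twice keeps only one copy
        self.blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self.blobs[digest]

store = BlobStore()
a = store.put(b"every manpage, copy 1")
b = store.put(b"every manpage, copy 1")  # same bytes, same digest
assert a == b and len(store.blobs) == 1
```

the two copies of the manpage site collapse this way: every file in the second copy hashes to a key that already exists.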
@solonovamax block-level deduplication is something that's on the table but doesn't seem necessary from the data so far
@whitequark block level deduplication just also sounds annoying to do
@solonovamax @whitequark I'm working on something using block level deduplication and I can confirm it's annoying.
@solonovamax it's mostly just that figuring out where the block boundaries are is a real pain; integrating with the compressor pays off, but i'm unfamiliar with the domain
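the usual answer to "where do the block boundaries go" is content-defined chunking: a rolling hash over the data declares a boundary whenever its low bits are zero, so boundaries follow the content rather than fixed offsets and mostly survive insertions and deletions. a toy gear-hash sketch (the parameters and names are illustrative, not any particular tool's actual chunker):

```python
import random

random.seed(0)
GEAR = [random.getrandbits(32) for _ in range(256)]  # random value per byte
MASK = (1 << 6) - 1  # expect a boundary every ~64 bytes (tiny, for demo)

def chunks(data: bytes, min_size: int = 16, max_size: int = 256):
    start, h = 0, 0
    for i, byte in enumerate(data):
        # shift-and-add gear hash: only the most recent bytes influence h
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= min_size and (h & MASK) == 0) or length >= max_size:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]

data = bytes(random.getrandbits(8) for _ in range(4096))
parts = list(chunks(data))
assert b"".join(parts) == data           # chunking is lossless
assert all(len(p) <= 256 for p in parts)
```

dedup then works by hashing each chunk and storing chunks content-addressed; an insertion early in a file only disturbs the chunks around the edit, so the rest still dedups against the old version.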