I guess this "Docker Hub pay-to-play" thing, which I'm seeing inconsistent (and unsourced) complaints about this morning, is another example of the "common index" problem.

i.e., public services that only index information, rather than creating & hosting it, tend to go unscrutinized for a lot longer than services that are primarily big storage repositories, but in the end they're just as important.

...

e.g., Flickr hosts all your photos? FOSS community immediately notices & people rush to put together replacements.

gPodder indexes all the podcasts even though the files live elsewhere? That's not perceived as urgent.

(The replacements may not all be good ones; that's orthogonal)

I think there are plenty of other examples of this fragility, such as free 3D model search sites, that ADS-B (airplane-tracking) aggregator site, or various weather-station front-ends.

And yes, the line between indexing-only and data-hosting is definitely a gray area; it depends on what you regard as a "big" file, and on whether that's changed since last year.

But still. A #commonindex matters a lot & shouldn't be taken for granted when it's running smoothly.

I also wonder if the Docker Hub index could be bulk-scraped.

I definitely don't pay close enough attention to know what portions of that are worth archival preservation anyway.

@n8 anything that is an index + lots of storage needs a BitTorrent backend
@paperdigits And a "works when the internet is offline" backend!