https://overturemaps.org/ released an astonishing GIS dataset yesterday that includes 60m "place of interest" listings (businesses, attractions etc) under a VERY permissive license

It's 8GB of data and the quality from an initial spot-check seems to be very high. I wrote about how I've been exploring it so far here:

https://til.simonwillison.net/overture-maps/overture-maps-parquet

@simon kinda sucks that you need to download the whole thing. Maybe using the Athena or Azure routes would allow faster selects than DuckDB?

@seav You don't have to download the whole thing for a bunch of operations - but the "find places within this bounding box" thing does seem to be too much for the remote HTTP mechanism to handle quickly

A problem I have is that I don't have good instincts yet for figuring out if a query is likely to work well over remote Parquet or not

@simon @seav This is awesome! There are some tricks we can use to structure the Parquet files that allow more efficient bounding box queries using remote predicate pushdown. It’s all pretty new for spatial Parquet data. We’ll probably look at this for future releases along with easier country/region partitioning.
@jwass2000 @[email protected] That would be fantastic - I'm very new to Parquet/DuckDB myself so any extra documentation from Overture illustrating the kinds of queries you can run against it without downloading GBs of data would be really useful
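
For anyone trying to build intuition for why file layout matters for remote bounding box queries, here is a rough pure-Python sketch of the idea behind predicate pushdown. It is not Overture's or DuckDB's actual implementation. It assumes each Parquet row group stores min/max statistics for its coordinate columns, and that a reader only has to fetch row groups whose ranges overlap the query box. The `grid_key` sort, the 500-row group size and the random points are all hypothetical choices made for illustration.

```python
import random


def row_group_stats(points, group_size):
    """Split points into consecutive row groups and record each group's bounding box."""
    stats = []
    for i in range(0, len(points), group_size):
        group = points[i:i + group_size]
        lons = [p[0] for p in group]
        lats = [p[1] for p in group]
        stats.append((min(lons), min(lats), max(lons), max(lats)))
    return stats


def groups_to_fetch(stats, bbox):
    """Count row groups whose min/max ranges overlap the query bounding box."""
    qminx, qminy, qmaxx, qmaxy = bbox
    return sum(
        1
        for (minx, miny, maxx, maxy) in stats
        if minx <= qmaxx and maxx >= qminx and miny <= qmaxy and maxy >= qminy
    )


def grid_key(point, cell=1.0):
    """Hypothetical spatial sort key: coarse grid column, then grid row, then position."""
    lon, lat = point
    return (int((lon + 180) // cell), int((lat + 90) // cell), lon, lat)


# Fake data: 20,000 random points standing in for place listings (an assumption).
random.seed(0)
points = [(random.uniform(-180, 180), random.uniform(-90, 90)) for _ in range(20000)]

# A roughly 1 x 1 degree query box (min_lon, min_lat, max_lon, max_lat).
bbox = (-123.0, 37.0, -122.0, 38.0)

# Rows in arbitrary order: almost every row group spans the whole globe.
unsorted_hits = groups_to_fetch(row_group_stats(points, 500), bbox)

# Rows sorted spatially before writing: only a few row groups overlap the box.
sorted_hits = groups_to_fetch(row_group_stats(sorted(points, key=grid_key), 500), bbox)

print("row groups to fetch, unsorted:", unsorted_hits)
print("row groups to fetch, spatially sorted:", sorted_hits)
```

The rough instinct this suggests: a filter is cheap over remote Parquet when the rows were written in an order that makes the filtered column's min/max ranges narrow per row group, and expensive when matching rows are scattered across every group.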