Introducing Karmashapes – open data for towns and villages in India available for download from #whosonfirst!

https://whosonfirst.org/blog/2023/06/19/introducting-karmashapes/

Includes 915,468 locality records and 190,397 neighbourhood records (with 579,116 detailed polygons), thanks to @MapMakinMeyers and @stepps00

Introducing Karmashapes

Thanks to the Karmashapes initiative, Who’s On First now provides the best open data for towns and villages in India.

Who's On First
As anyone who’s searched for Indian map data knows, the sheer number of people and populated places, combined with the historic paucity of openly licensed and/or authoritative government data, has presented significant challenges.
India recently surpassed China to become the most populous country on Earth, with 1.4 billion people. Around 65% of people in India are spread across hundreds of thousands of rural villages with fewer than 5,000 people; the other 35% are urbanized and concentrated in very dense towns and cities.

Let’s take a look at the locality and neighbourhood coverage now available in WOF:

Locality (blue polygons and blue dots) and neighbourhood (orange lines and orange dots) coverage along the northern Kerala coast

Locality (blue polygons and blue dots) and neighbourhood (orange lines and orange dots) coverage northwest of Dehradun
Locality (blue polygons and blue dots) and neighbourhood (orange lines and orange dots) coverage in and around Chandigarh. Note that some neighbourhood polygons in Chandigarh were curated before the Karmashapes import and have been retained.
This build wasn’t possible until 2022, when five openly licensed datasets became available and could be combined to create Karmashapes: three vector datasets and two raster datasets.
In 2022, the Indian Ministry of Rural Development (PMGSY) openly released its ‘Habitations’ dataset. This was the first localities dataset released in India with a clear and friendly license.
Geospatial census data from Columbia University’s Socioeconomic Data and Applications Center (SEDAC) for all Indian villages helped confirm and verify the locations in the PMGSY dataset.
Lastly, a vector dataset from the Geographic Names Server created by the National Geospatial-Intelligence Agency (NGA) was used to cross-reference PMGSY and SEDAC data.
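For readers who want a feel for what cross-referencing point datasets can look like, here is a minimal Python sketch. It is not the Karmashapes workflow (that was done in QGIS); it simply assumes each source can be reduced to records with a name and a coordinate, and treats two records as the same place when their normalized names agree and they sit within a distance threshold. The field names, the 2 km threshold, and the sample records are all invented for illustration.

```python
import math
import unicodedata


def normalize_name(name):
    """Lowercase, strip accents and punctuation so spelling variants compare more easily."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.lower() if ch.isalnum())


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    earth_radius_km = 6371.0088
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def cross_reference(primary, reference, max_km=2.0):
    """Pair each primary record with the nearest same-named reference record.

    Returns a list of (primary_record, reference_record_or_None) tuples.
    max_km is an assumed threshold, not a value from the Karmashapes project.
    """
    pairs = []
    for p in primary:
        best, best_dist = None, max_km
        for r in reference:
            if normalize_name(p["name"]) != normalize_name(r["name"]):
                continue
            dist = haversine_km(p["lat"], p["lon"], r["lat"], r["lon"])
            if dist <= best_dist:
                best, best_dist = r, dist
        pairs.append((p, best))
    return pairs


# Invented sample records, for illustration only.
primary_points = [
    {"name": "Rampur", "lat": 30.400, "lon": 77.800},
    {"name": "Kheri", "lat": 30.450, "lon": 77.900},
]
reference_points = [
    {"name": "RAMPUR", "lat": 30.405, "lon": 77.802},  # same place, different casing
    {"name": "Rampur", "lat": 28.800, "lon": 79.000},  # same name, far away
]

for record, match in cross_reference(primary_points, reference_points):
    status = "confirmed" if match else "unverified"
    print(record["name"], status)
```

An exact match on normalized names is deliberately strict; a real pipeline would likely need fuzzy matching to cope with regional spellings.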
Once a single cleaned and verified vector dataset was created, “unique candidate” urban area polygons were synthesized from two additional global raster datasets: the World Settlement Footprint Evolution (WSFEVO) from the German Aerospace Center (DLR) at 10m resolution and ...
... a global map of land use/land cover (LULC) derived from the European Space Agency’s (ESA) Sentinel-2 imagery, also at 10m resolution. These raster layers were converted to two additional vector datasets, which served as an additional cross-reference for the primary data.
Numerous geoprocessing tools in QGIS were used to analyze and combine these five datasets, resulting in composite polygons and points for habitations and villages in India.
Each of these input datasets helped to create detailed feature attributes of Karmashapes, including:
- Population count
- Population vintage
- Name (Latin-based)
- Name variants (Latin-based + limited local script names for big towns in Hindi and other locales)
- Label centroid
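To make the attribute list above concrete, here is a small Python sketch of how one such record could be carried around and turned into a GeoJSON point feature. The class, the property keys, and the sample place are hypothetical; they are not the actual Karmashapes or Who's On First property names. Population vintage is assumed here to mean the year the count refers to.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PlaceRecord:
    """Illustrative container for the attributes listed above (names are hypothetical)."""
    name: str                             # Latin-based name
    population: int                       # population count
    population_vintage: int               # assumed: year the count refers to
    label_centroid: Tuple[float, float]   # (longitude, latitude)
    name_variants: Dict[str, List[str]] = field(default_factory=dict)

    def to_feature(self):
        """Return a GeoJSON Feature whose geometry is the label centroid."""
        lon, lat = self.label_centroid
        return {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {
                "name": self.name,
                "population": self.population,
                "population_vintage": self.population_vintage,
                "name_variants": self.name_variants,
            },
        }


# Invented example place, not a real record.
example = PlaceRecord(
    name="Exampur",
    population=1234,
    population_vintage=2011,
    label_centroid=(77.8, 30.4),
    name_variants={"eng": ["Exampur", "Exampoor"]},
)
feature = example.to_feature()
print(feature["properties"]["name"], feature["geometry"]["coordinates"])
```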
Collaborating with data developers and contributors to each of these projects was very helpful. The insight and knowledge they shared helped increase Karmashapes’ awareness of the local administrative structure, naming conventions and regional name spellings/localizations.
It took dozens of conversations with people in the #datameet group, colleagues from @openstreetmap (OSM), and friendly strangers on Twitter (you know who you are 💗) to help answer detailed questions about rural areas and the various megacities of India. Thank you!

- Browse WOF records in the Spelunker

https://spelunker.whosonfirst.org/placetypes/locality/?&iso=in

- Download Who’s On First data in Shapefile and SQLite formats, via Geocode.Earth:

https://geocode.earth/data/whosonfirst/

- Learn more about downloading Who’s On First data in various formats

https://whosonfirst.org/blog/2023/05/31/shapefiles/
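If you grab one of the SQLite downloads, a quick first check is to count records per placetype. The Python sketch below shows the idea against a tiny in-memory database. The `spr` table and its `placetype` and `country` columns are assumptions modeled on Who's On First SQLite distributions, so inspect the schema of the file you actually download before relying on them; the rows are invented.

```python
import sqlite3


def count_by_placetype(conn, country="IN"):
    """Return {placetype: record count} for one country code.

    Assumes a table named `spr` with `placetype` and `country` columns.
    """
    rows = conn.execute(
        "SELECT placetype, COUNT(*) FROM spr "
        "WHERE country = ? GROUP BY placetype ORDER BY placetype",
        (country,),
    ).fetchall()
    return dict(rows)


def build_demo_db():
    """Create a tiny in-memory stand-in for a downloaded database (invented rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spr (id INTEGER PRIMARY KEY, name TEXT, placetype TEXT, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO spr (id, name, placetype, country) VALUES (?, ?, ?, ?)",
        [
            (1, "Exampur", "locality", "IN"),
            (2, "Samplenagar", "locality", "IN"),
            (3, "Demo Colony", "neighbourhood", "IN"),
            (4, "Elsewhere", "locality", "XX"),
        ],
    )
    return conn


demo = build_demo_db()
print(count_by_placetype(demo))
```

For a real download, replace `build_demo_db()` with `sqlite3.connect(...)` pointed at the path of the file you saved.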

Want the raw Karmashapes files? Full Karmashapes downloads are available on Justin’s blog.

https://karmashapes.wordpress.com/2023/03/05/karmashapes-v0/

karmashapes v0

Points Download: Polygons Download: Sources: Ministry of Rural Development, 2022. PMGSY Rural Connectivity Datasets,  Published under India’s Government Open Data License:  India Village-…

karmashapes

🇮🇳 There are more details (and a few next steps listed) in the full blog post:

https://whosonfirst.org/blog/2023/06/19/introducting-karmashapes/

We're excited to see what you build with this new #opendata for #india, cheers.
