University of North Carolina: UNC-Chapel Hill study shows AI can dramatically speed up digitizing natural history collections. “A new study from UNC-Chapel Hill researchers shows that advanced artificial intelligence tools, specifically large language models (LLMs), can accurately determine the locations where plant specimens were originally collected, a process known as georeferencing. This […]

https://rbfirehose.com/2026/02/05/university-of-north-carolina-unc-chapel-hill-study-shows-ai-can-dramatically-speed-up-digitizing-natural-history-collections-2/
ResearchBuzz: Firehose
TIL: #openstreetmap and #googlemaps are just for visualization; you can't rely on them for serious analysis.
A Coordinate Reference System (CRS) is inappropriate when it introduces unacceptable distortions, misaligns data, or leads to incorrect measurements. The "best" CRS depends entirely on the spatial extent of the data (local vs. global) and the analysis being performed (distance vs. area vs. visualization).
#georeferencing
https://epsg.io/
https://www.youtube.com/watch?v=Nm4UbPLEgK4&list=PL6L1mY6cDuDByQXxp1Z80u_raec49nX_P
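To make the CRS point above concrete, here is a minimal sketch (not taken from the post) of how much an east-west distance measured in Web Mercator, the projection behind most web map tiles (EPSG:3857), can differ from a great-circle estimate. The function names and the spherical Earth radius used for the great-circle estimate are assumptions chosen for illustration.

```python
import math

WEB_MERCATOR_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857
MEAN_EARTH_RADIUS_M = 6371008.8    # assumed mean radius for the great-circle estimate


def to_web_mercator(lon_deg, lat_deg):
    """Project lon/lat in degrees to spherical Web Mercator metres."""
    x = WEB_MERCATOR_RADIUS_M * math.radians(lon_deg)
    y = WEB_MERCATOR_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


def great_circle_distance_m(lon1, lat1, lon2, lat2):
    """Haversine distance on a sphere, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * MEAN_EARTH_RADIUS_M * math.asin(math.sqrt(h))


def mercator_inflation(lat_deg, dlon_deg=1.0):
    """Ratio of planar Web Mercator length to great-circle length
    for an east-west segment starting at longitude 0."""
    x1, y1 = to_web_mercator(0.0, lat_deg)
    x2, y2 = to_web_mercator(dlon_deg, lat_deg)
    planar = math.hypot(x2 - x1, y2 - y1)
    return planar / great_circle_distance_m(0.0, lat_deg, dlon_deg, lat_deg)


if __name__ == "__main__":
    for lat in (0, 30, 60):
        ratio = mercator_inflation(lat)
        print(f"latitude {lat:>2}: planar / great-circle = {ratio:.3f}")
```

Near the equator the two measurements nearly agree, while at 60 degrees latitude the Web Mercator length comes out roughly twice as long. For distance or area work, a projected CRS suited to the study area (looked up on EPSG.io, for instance) is usually the safer choice.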
EPSG.io: Coordinate Systems Worldwide

EPSG.io: Coordinate systems worldwide (EPSG/ESRI), preview location on a map, get transformation, WKT, OGC GML, Proj.4. https://EPSG.io/ made by @klokantech

Georeference a Scanned Map in QGIS

PeerTube

UNC-Chapel Hill study shows AI can dramatically speed up digitizing natural history collections – EurekAlert!

News Release 5-Dec-2025

Image: UNC research team check a plant specimen at the UNC Herbarium. Credit: Shanna Oberreiter

UNC-Chapel Hill study shows AI can dramatically speed up digitizing natural history collections, University of North Carolina at Chapel Hill

A new study from UNC-Chapel Hill researchers shows that advanced artificial intelligence tools, specifically large language models (LLMs), can accurately determine the locations where plant specimens were originally collected, a process known as georeferencing. This task has traditionally been slow, expensive and dependent on significant manual effort. The team found that LLMs can complete this work with near-human accuracy while being significantly faster and more cost-effective. 

“Our study explores how large language models can take on one of the biggest bottlenecks in digitizing plant collections,” said Yuyang Xie, first author and postdoctoral researcher in the department of biology at UNC. “We are pioneering the use of these tools for georeferencing, a breakthrough that will accelerate the digitization of plant specimens and unlock new possibilities for ecological research.”

The research set out to answer a central question: Can AI automate one of the most time-consuming steps in digitizing natural history collections? The Carolina team found out that yes, it can. LLMs not only performed georeferencing with an error margin of less than 10 kilometers, outperforming traditional methods, but also completed the task at a fraction of the time and cost. 

“Recent advances in LLMs can potentially transform the georeferencing process, making it faster and more accurate,” said Xiao Feng, corresponding author and assistant professor in the department of biology at UNC. “This gives researchers unprecedented opportunities to advance our understanding of global biodiversity distributions.”

The implications are significant. An estimated 2–3 billion herbarium specimens exist worldwide, but only a small fraction have been digitized. Without digital records and spatial data, researchers face major limitations in tracking biodiversity loss, understanding species movement under climate change and analyzing ecosystem shifts. By deploying AI-powered georeferencing, scientists may soon be able to rapidly digitize vast natural history collections that have remained largely inaccessible.

“This technology allows us to unlock millions of records that are currently sitting in cabinets,” said Xie. “With the power of LLMs, we can rapidly digitize plant specimen data that will be critical for addressing global environmental challenges.”

Traditional approaches to georeferencing rely on manual interpretation, specialized software, or multiple rounds of expert review. The UNC study is among the first to apply LLMs to this task and to show they can outperform existing methods in accuracy, efficiency, and scalability. This new approach opens the door to digitizing natural history collections at a speed never before possible. 

The research paper is available online in Nature Plants at: https://www.nature.com/articles/s41477-025-02162-y  

Continue/Read Original Article Here: UNC-Chapel Hill study shows AI can dramatically speed up digitizing natural history collections | EurekAlert!

#AI #artificialIntelligence #BiologyDepartment #CarolinaTeam #Collections #DigitizeContent #EurekAlert #Georeferencing #LargeLanguageModelsLLM #LLMs #NaturalHistory #Nature #UNCChapelHill #XiaoFeng #YuyangXie

University of North Carolina: UNC-Chapel Hill study shows AI can dramatically speed up digitizing natural history collections. “A new study from UNC-Chapel Hill researchers shows that advanced artificial intelligence tools, specifically large language models (LLMs), can accurately determine the locations where plant specimens were originally collected, a process known as georeferencing. This […]

https://rbfirehose.com/2025/12/06/university-of-north-carolina-unc-chapel-hill-study-shows-ai-can-dramatically-speed-up-digitizing-natural-history-collections/

Glancing through the latest activity on OldInsuranceMaps.net, a rare but good example (from Worcester, Mass.) of where Thin Plate Spline is a good transformation for #georeferencing #sanborn maps. It's these curved roads out in space that are sometimes drawn without good scale and need this extra level of distortion: https://oldinsurancemaps.net/layer/72727
#Georeferencing #historical #maps by their coordinates is not without its pitfalls. A location in LAT / LON can only be interpreted with spatial context. While there are some historical proj-strings out there (e.g. DHDN, EPSG:4314), one needs to calculate them for other historical spatial reference systems (SRS) using identical coordinates. Thanks to @jjimenezshaw there is now an accessible solution in #python 🤩 👇
https://github.com/jjimenezshaw/helmert-calc/ @fidkarten @historicum_net @DHd @oldmapgallery
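As a rough illustration of what "calculating parameters from identical coordinates" can look like, the sketch below fits a simplified 2D four-parameter Helmert (similarity) transformation to matching point pairs by least squares. This is an assumption-laden toy: it is not the helmert-calc API, and datum shifts between real reference systems are typically handled with a 3D seven-parameter Helmert on geocentric coordinates. The function names `fit_helmert_2d` and `apply_helmert_2d` are hypothetical.

```python
import math


def fit_helmert_2d(source, target):
    """Least-squares fit of X = a*x - b*y + tx, Y = b*x + a*y + ty
    from matching (x, y) -> (X, Y) point pairs."""
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("need at least two matching point pairs")
    n = len(source)
    # Centroids of both point sets; centring decouples a, b from tx, ty.
    xc = sum(p[0] for p in source) / n
    yc = sum(p[1] for p in source) / n
    Xc = sum(p[0] for p in target) / n
    Yc = sum(p[1] for p in target) / n

    num_a = num_b = denom = 0.0
    for (x, y), (X, Y) in zip(source, target):
        dx, dy, dX, dY = x - xc, y - yc, X - Xc, Y - Yc
        num_a += dx * dX + dy * dY
        num_b += dx * dY - dy * dX
        denom += dx * dx + dy * dy
    if denom == 0.0:
        raise ValueError("source points must not all coincide")

    a, b = num_a / denom, num_b / denom
    return {
        "a": a,
        "b": b,
        "tx": Xc - a * xc + b * yc,
        "ty": Yc - b * xc - a * yc,
        "scale": math.hypot(a, b),
        "rotation_rad": math.atan2(b, a),
    }


def apply_helmert_2d(params, point):
    """Transform one (x, y) point with fitted parameters."""
    x, y = point
    a, b = params["a"], params["b"]
    return (a * x - b * y + params["tx"], b * x + a * y + params["ty"])
```

With at least two identical points known in both systems, `fit_helmert_2d(old_points, new_points)` returns the shift, scale, and rotation, and `apply_helmert_2d` carries any further point from the old system into the new one. More point pairs average out digitizing error.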
GitHub - jjimenezshaw/helmert-calc: Calculator for Helmert parameters

Calculator for Helmert parameters. Contribute to jjimenezshaw/helmert-calc development by creating an account on GitHub.

GitHub
Crowdsourcing project

ETH Library
Finally wrote up a blog post about the georeference-a-thon we held at UIUC last November for GIS day: https://healthyregions.org/2025/01/31/gis-day-2024-community-georeferencing-of-sanborn-fire-insurance-maps/. We had a great turnout, especially from different parts of campus outside the geog dept. #sanbornmaps #georeferencing #crowdsourcing #OldInsuranceMaps
GIS Day 2024 – Community Georeferencing of Sanborn Fire Insurance Maps

This past GIS Day, November 20th, 2024, we at the Healthy Regions & Policies Lab hosted a “georeference-a-thon,” and created historical map mosaics of Champaign-Urbana in 1915.

Healthy Regions & Policies Lab