UNC-Chapel Hill study shows AI can dramatically speed up digitizing natural history collections – EurekAlert!

News Release 5-Dec-2025

Image: A UNC research team checks a plant specimen at the UNC Herbarium. Credit: Shanna Oberreiter

University of North Carolina at Chapel Hill

A new study from UNC-Chapel Hill researchers shows that advanced artificial intelligence tools, specifically large language models (LLMs), can accurately determine the locations where plant specimens were originally collected, a process known as georeferencing. This task has traditionally been slow, expensive and dependent on significant manual effort. The team found that LLMs can complete this work with near-human accuracy while being significantly faster and more cost-effective. 

“Our study explores how large language models can take on one of the biggest bottlenecks in digitizing plant collections,” said Yuyang Xie, first author and postdoctoral researcher in the department of biology at UNC. “We are pioneering the use of these tools for georeferencing, a breakthrough that will accelerate the digitization of plant specimens and unlock new possibilities for ecological research.” 

The research set out to answer a central question: Can AI automate one of the most time-consuming steps in digitizing natural history collections? The Carolina team found that it can. LLMs performed georeferencing with an error margin of less than 10 kilometers, outperforming traditional methods, and completed the task at a fraction of the time and cost.

“Recent advances in LLMs can potentially transform the georeferencing process, making it faster and more accurate,” said Xiao Feng, corresponding author and assistant professor in the department of biology at UNC. “This gives researchers unprecedented opportunities to advance our understanding of global biodiversity distributions.” 

The implications are significant. An estimated 2–3 billion herbarium specimens exist worldwide, but only a small fraction have been digitized. Without digital records and spatial data, researchers face major limitations in tracking biodiversity loss, understanding species movement under climate change and analyzing ecosystem shifts. By deploying AI-powered georeferencing, scientists may soon be able to rapidly digitize vast natural history collections that have remained largely inaccessible. 

“This technology allows us to unlock millions of records that are currently sitting in cabinets,” said Xie. “With the power of LLMs, we can rapidly digitize plant specimen data that will be critical for addressing global environmental challenges.” 

Traditional approaches to georeferencing rely on manual interpretation, specialized software, or multiple rounds of expert review. The UNC study is among the first to apply LLMs to this task and to show they can outperform existing methods in accuracy, efficiency, and scalability. This new approach opens the door to digitizing natural history collections at a speed never before possible. 

The research paper is available online in Nature Plants at: https://www.nature.com/articles/s41477-025-02162-y  


#AI #artificialIntelligence #BiologyDepartment #CarolinaTeam #Collections #DigitizeContent #EurekAlert #Georeferencing #LargeLanguageModelsLLM #LLMs #NaturalHistory #Nature #UNCChapelHill #XiaoFeng #YuyangXie

A prominent computer scientist who has spent 20 years publishing academic papers on cryptography, privacy, and cybersecurity
💥 has gone incommunicado, had his professor profile, email account, and phone number removed by his employer, Indiana University, and had his homes raided by the FBI.
⚠️ No one knows why.
#Xiaofeng #Wang has a long list of prestigious titles.
He was the associate dean for research at Indiana University's Luddy School of Informatics, Computing and Engineering,
a fellow at the Institute of Electrical and Electronics Engineers and the American Association for the Advancement of Science,
and a tenured professor at Indiana University at Bloomington.
According to his employer, he has served as principal investigator on research projects totaling nearly $23 million over his 21 years there.

He has also co-authored scores of academic papers across a diverse range of research fields,
among them cryptography, systems security, and data privacy, including the protection of human genomic data.
In recent weeks, Wang's email account, phone number, and profile page at the Luddy School were quietly erased by his employer.
Over the same period, Indiana University also removed a profile for his wife, #Nianli #Ma, who was listed as a Lead Systems Analyst and Programmer at the university's Library Technologies division.

🔥 “None of this is in any way normal,” Matthew Green,
a professor specializing in cryptography at Johns Hopkins University, wrote on Mastodon.

He continued:
"Has anyone been in contact? I hear he’s been missing for two weeks and his students can’t reach him.
How does this not get noticed for two weeks???"

In the same thread, Matt Blaze,
a McDevitt professor of computer science and law at Georgetown University, said:
"It's hard to imagine what reason there could be for the university to scrub its website as if he never worked there.
And while there's a process for removing tenured faculty, it takes more than an afternoon to do it."

Local news outlets reported that FBI agents spent several hours moving boxes in and out of the residences.

https://arstechnica.com/security/2025/03/computer-scientist-goes-silent-after-fbi-raid-and-purging-from-university-website/