#Geoparsing is about identifying and locating place references in texts. Does it matter which language is being geoparsed? In a new @ijgis article, we examine that question by building a geoparser for Finnish, a morphologically complex language. Co-authored with my wonderful supervisors @tuuli & @tuomo

The article: https://doi.org/10.1080/13658816.2024.2369539
🧵

Gotta give it up to the wicked smart @tadusko on his fresh article on #geoparsing, out now in #IJGIS. Not only is it a highly interesting piece on how to make geoparsing better, both in general and specifically for morphologically complex languages like Finnish, but it is also well written. A pleasure to read.

Check it out here https://doi.org/10.1080/13658816.2024.2369539

How can we find and locate place names from texts? How can we do it for Finnish?

Results from my #PhD and the @digigeolab #MOBICON project are starting to trickle out! A paper is coming soon, but the open-source #Python tool for Finnish "geoparsing" is available here: https://github.com/Tadusko/fi-geoparser

Now to the fun part: writing documentation...

#geoparsing

GitHub - Tadusko/fi-geoparser: Geoparser for extracting locations from Finnish texts


Next: Harri Kiiskinen, Asko Nivala, Jasmine Westerlund, and Juhana Saarelainen (2023). “Extracting Geographical References from Finnish Literature. Fully Automated Processing of Plain-Text Corpora”. In: Journal of Computational Literary Studies 2 (1). doi: https://doi.org/10.48694/jcls.3584.

Keywords: named entity recognition, geographic information system, #geoparsing, linked open data, literary geography, #Finland

#JCLS #CLS #LOD #NER


In the Atlas of Finnish Literature 1870-1940 project, we extract geographical information from a Finnish-language corpus of literary texts published between 1870 and 1940. The texts are transformed from plain texts to TEI/XML, and further processed with named entity recognition and linking tools. The results are presented in a web-based environment. This article describes the technical structure of the analysis chain, the tools used and the metaprocesses used to manage the research dataset.


#30DayMapChallenge

Day 30 - "My favorite... lyricist!"

The beloved Finnish singer-songwriter Juice Leskinen takes us on a world tour with words. We mapped the geography of Juice by locating the place names in his lyrics with #geoparsing.

By: @tadusko and @matabatchi