Mastodawn

The Swedish National Archives are now talking about their journey toward making the archives more accessible using #htr and #ml #lion

STOPDISINFORMATION 3d ago

Objective: Be able to use at least one #HTR platform (e.g. #Transkribus or #eScriptorium): upload documents, run recognition, and review and correct the results.
Understand the basics of model training: training/validation sets, overfitting, error rates, and quality metrics (#métricasdecalidad) such as CER/WER.
Foster interdisciplinary work among specialists in the humanities, computer science, and archives/libraries (#archivo / #biblioteca), defining roles and tasks in collaborative projects.
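
For readers unfamiliar with the CER/WER metrics mentioned above, here is a minimal sketch of how they are commonly computed: the Levenshtein edit distance between a reference transcription and the model output, divided by the reference length (in characters for CER, in words for WER). The function names `edit_distance`, `cer`, and `wer` are illustrative and not taken from any HTR platform; real tools may apply text normalization (case, punctuation, whitespace) first, which this sketch does not.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

With the reference "the old archive" and the model output "the odd archive", `cer` gives 1/15 (one wrong character out of 15) and `wer` gives 1/3 (one wrong word out of three), which shows why WER is usually the harsher of the two figures.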

STOPDISINFORMATION 3d ago

Speakers
José Manuel Fradejas Rueda
Full Professor of Spanish Language, #UniversidaddeValladolid
Francisco Gago Jover
Professor of Spanish, College of the Holy Cross (USA)
Objectives
Understand what #HTR is, how it differs from #OCR, and in which contexts it is useful (heritage, administrative archives (#archivosadministrativos), research, etc.)
Learn the complete HTR workflow: obtaining images of the manuscript, exporting transcriptions, and the different types of reusable structured formats.

STOPDISINFORMATION 3d ago

#HTR opens the door to digital editing projects, historical corpora, and large-scale linguistic analysis that were previously unfeasible because of the time cost of manual transcription. For business archives (#archivosdeempresas) and public administrations, automating the reading of handwritten documents notably increases productivity and reduces manual data-entry errors. Understand the conceptual and technical differences between traditional OCR and neural HTR.

d'aïeux et d'ailleurs 5d ago

#photomay2026 #photomai2026

27 - Writing

Automatic handwritten text recognition (HTR)...
I'm speechless.

#viedarchiviste #htr #HandwrittenTextRecognition #archives #workinprogress

David Flood May 22

I just updated my Greek Minuscule LLM HTR Benchmark and added the latest frontier models in addition to some open weight and other models.

Some spoilers: Gemini is still far ahead for all metrics. Grok and Llama are about as bad as it gets.

https://d-flood.github.io/paleo-bench/

#DigitalHumanities #htr

Paleo Bench | HTR Model Leaderboard

Compare handwritten Greek text recognition performance across LLM vision models with ranking, CER/WER quality, latency, and cost metrics.

Annette von Stockhausen Apr 22

Bookmarked: eScriptorium https://escriptorium.eu/blog/2026-04-15-a-new-chapter-for-escriptorium/ #DH #HTR #Handschriften A Digital Text Production Pipeline for Print and Handwritten Texts using machine learning techniques.

A New Chapter for eScriptorium

In the next few months you will likely be seeing a new version of eScriptorium come to your instance. If you’re using INRIA, you may have already seen the changes. This year has brought a new milestone with major additions and improvements to eScriptorium. These updates were so significant that we had even started naming this release ‘version 1.0’, (and they are even so significant that starting version 1.0, we’ll change our naming scheme, see below).

eScriptorium

Bibelexegese Apr 22

eScriptorium became an important tool in our workflow while preparing critical editions ... #htr #atr #manuscripts #DigitalClassics https://escriptorium.eu/blog/2026-04-15-a-new-chapter-for-escriptorium/

David Haskiya Apr 22

Proud that we did so well in ICDAR's Competition on Multilingual Medieval Handwriting Recognition! 🥈🥉🥉

https://cmmhwr26.inria.fr/results/

Kudos especially to Viktoria Löfgren in our team for her work on this during our internal hackdays.

We hope to do more substantial work on HTR and NER for medieval Swedish, and possibly Latin, in the coming years.

#HTR

Results

Of the 26 registered teams, 12 submitted results, with all 12 participating in Task 1, 9 in Task 2, and 9 in Task 3. Three teams (Qianfan-OCR, STUDIUM.AI, and nampfiev1995) submitted only to Task 1. Over 300 individual submissions were recorded across the three tasks. Participants were permitted to use proprietary methods as well as additional public or non-public data beyond the competition training set. The organizers’ baseline was obtained using the kraken OCR engine with the CATMuS Medieval 1.6 model, a general-purpose recognition model for medieval Latin-script manuscripts. No further adaptation or optimization was applied for any of the tasks.

ICDAR 2026 Competition on Multilingual Medieval Handwriting Recognition