Mastodawn

Very nice presentation on automatic metadata extraction from a correspondence corpus by Sabrina Strutz (Graz).

Careful work and evaluation, with a critical review of the ground-truth (GT) data (which information in it comes from the print edition but cannot be derived from the letter itself?) and a breakdown of result quality by task (author vs. place recognition) and by phase (candidate generation vs. selection of the final suggestion).

Qwen3-14B-Q6 as a local model performs worse than Sonnet 4.6 (which delivers very good results but is also the most expensive) and GPT 5.2, but its results are not bad at all. (And it does better with reasoning switched off!)

All models struggle to infer the place of writing from the text when it is not named in the date line.

#DHd2026 #InformationExtraction #LLM
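
The breakdown by task and phase described in the post above could be computed roughly as follows. This is only a minimal sketch, not the presenter's actual evaluation code: the record layout, the field names (`task`, `gold`, `candidates`, `final`) and the sample values are all made up for illustration. Records whose ground truth cannot be derived from the letter itself are marked with `gold=None` and skipped.

```python
from collections import defaultdict

# Hypothetical records, one per letter and task (all values are invented).
# "gold" is None when the ground truth stems from the print edition and
# cannot be read off the letter itself.
records = [
    {"task": "author", "gold": "Person A", "candidates": ["Person A", "Person B"], "final": "Person A"},
    {"task": "author", "gold": "Person C", "candidates": ["Person C", "Person D"], "final": "Person D"},
    {"task": "place", "gold": "Graz", "candidates": ["Wien"], "final": "Wien"},
    {"task": "place", "gold": None, "candidates": ["Graz"], "final": "Graz"},
]


def breakdown(records):
    """Return per-task scores for both phases.

    candidate_recall: share of records whose gold value is among the candidates.
    final_accuracy:   share of records whose final suggestion equals the gold value.
    """
    counts = defaultdict(lambda: {"n": 0, "candidate_hit": 0, "final_hit": 0})
    for r in records:
        if r["gold"] is None:
            continue  # not derivable from the letter, so not scored
        c = counts[r["task"]]
        c["n"] += 1
        c["candidate_hit"] += r["gold"] in r["candidates"]
        c["final_hit"] += r["final"] == r["gold"]
    return {
        task: {
            "n": c["n"],
            "candidate_recall": c["candidate_hit"] / c["n"],
            "final_accuracy": c["final_hit"] / c["n"],
        }
        for task, c in counts.items()
    }


scores = breakdown(records)
for task, s in scores.items():
    print(task, s)
```

Keeping the two phases apart shows whether an error arises because the correct value was never proposed or because it was proposed and then not selected.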

sayzard Feb 1

Python Trending (@pythontrending)

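GLiNER2 announcement: titled 'Unified Schema-Based Information Extraction', GLiNER2 appears to be a research project and tool that proposes a unified, schema-based approach to information extraction. It is a new NLP/IE research release aimed at maintaining schema consistency and producing structured extractions.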

https://x.com/pythontrending/status/2017605023694197069

#gliner2 #informationextraction #nlp #schema #ie

Python Trending 🇺🇦 (@pythontrending) on X

GLiNER2 - Unified Schema-Based Information Extraction https://t.co/RbXW066WM0

X (formerly Twitter)

sayzard Jan 15

Python Trending (@pythontrending)

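AI Powered Knowledge Graph Generator (ai-knowledge-graph) is an open-source, AI-based tool that automatically generates knowledge graphs from text or from structured and unstructured data. It can be used for structuring information, linking entities, improving search, and similar tasks.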

https://x.com/pythontrending/status/2011476842717270448

#knowledgegraph #nlp #informationextraction #graph

Python Trending 🇺🇦 (@pythontrending) on X

ai-knowledge-graph - AI Powered Knowledge Graph Generator https://t.co/ZKxPLVZPao

X (formerly Twitter)

N-gated Hacker News Dec 21

🤖🔧 Apparently, structured outputs are the latest "sliced bread" of #AI, but turns out they're just fancy-shmancy wrappers that make your LLM dumber than a bag of hammers 🤦‍♂️. Who knew that squeezing responses into neat little boxes could actually lead to a train-wreck of information extraction? 🚂💥
https://boundaryml.com/blog/structured-outputs-create-false-confidence #Innovation #AI #Limitations #LLMs #InformationExtraction #TechHumor #StructuredOutputs #HackerNews #ngated

Structured Outputs Create False Confidence

Constrained decoding seems like the greatest thing since sliced bread, but it forces models to prioritize output conformance over output quality.

BAML
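
The point made in the linked article's summary, that conformance to an output format says nothing about output quality, can be illustrated with a small check. This is a sketch under stated assumptions, not code from the article: the schema, the sample letter, and the helpers `conforms` and `grounded` are hypothetical, and the literal substring test is only a crude stand-in for a real quality check.

```python
import json

# Hypothetical schema: required field names and their Python types.
SCHEMA = {"sender": str, "place": str}


def conforms(raw: str) -> bool:
    """True if raw is a JSON object with exactly the schema's fields and types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(data, dict)
        and data.keys() == SCHEMA.keys()
        and all(isinstance(data[key], typ) for key, typ in SCHEMA.items())
    )


def grounded(raw: str, source: str) -> bool:
    """True if every extracted value occurs literally in the source text."""
    return all(value in source for value in json.loads(raw).values())


# Invented example letter and two invented model outputs.
source = "Graz, 3 May. Dear friend, thank you for your letter. Yours, Anna"
wrong_output = '{"sender": "Anna", "place": "Vienna"}'
right_output = '{"sender": "Anna", "place": "Graz"}'

for output in (wrong_output, right_output):
    print(output, "conforms:", conforms(output), "grounded:", grounded(output, source))
```

Both outputs pass the format check, but only the second one is supported by the source text, so a schema check alone cannot tell them apart.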

Reddit Tech VN Bot Nov 25

GLiNER2: a unified, schema-based information extraction system. Its performance is competitive with large language models, and it runs efficiently on CPU. #GLiNER2 #TríchXuấtThôngTin #HệThốngThốngNhất #SchemaBased #InformationExtraction #AI #TríTuệNhânTạo

https://www.reddit.com/r/LocalLLaMA/comments/1p69bea/gliner2_unified_schemabased_information_extraction/

NFDI x Computer Science Oct 29

🔍 New on the NFDIxCS blog:
How Large Language Models can support scientific literature research and publication workflows.
The post explores how LLM-based information extraction helps structure and reuse research knowledge more effectively — in line with FAIR principles.
👉 Read more: https://nfdixcs.org/meldung/llm-based-information-extraction-to-support-scientific-literature-research-and-publication-workflows

#AI #LLM #OpenScience #FAIRdata #Research #InformationExtraction #ScientificPublishing #NFDIxCS

LLM-Based Information Extraction to Support Scientific Literature Research and Publication Workflows

How can large language models make scientific research more FAIR and discoverable? Our latest work within NFDIxCS explores how AI can automatically extract key concepts from research papers—linking publications, data, and software more effectively to support transparent and efficient scientific workflows.

Harald Sack Oct 14

Today, I'm at the Bundesarchiv in Koblenz for the Strategy & Planning meeting of our project "Wiedergutmachung". Our task in this project is to develop efficient information extraction from historical case files of the German postwar compensation process for National Socialist injustice.

@fiz_karlsruhe @fizise @LandesarchivBW #bundesarchiv @bmf #knowledgegraphs #llms #AI #informationextraction #archives #project @ddbkultur @archivportal @MahsaVafaie @fschwic

Harald Sack Oct 10

Open PhD/Junior Researcher position in Neurosymbolic AI and Information Extraction on historical documents at FIZ Karlsruhe, in the Knowledge-driven AI research group (formerly the ISE research group), starting Jan 1, 2026.
Application Deadline: Oct 31, 2025
https://www.fiz-karlsruhe.de/en/stellenanzeigen/phdjunior-researcher-wmx-0

#jobadvertisement #phd #AI #neurosymbolicAI #informationextraction #machinelearning #knowledgegraphs #ontologies @fiz_karlsruhe @fizise #dh #culturalheritage @nfdi4culture @MahsaVafaie @tabea @sourisnumerique @enorouzi

Christian Boulanger Jun 11, 2025

The presentation "Extracting Citation Data Using LLMs" by @anwagnerdreas, David Carreto Fidalgo & me shows how to extract structured reference information from footnote-heavy scholarship using LLMs:
https://www.youtube.com/watch?v=bgpsRrSeyIw&list=PL5rAX6ywmP7O_nT99Osd74uino78BJMVT

#LLM #InformationExtraction #Footnotes

Extracting Citation Data Using LLMs | C. Boulanger, D. Carreto Fidalgo & A. Wagner

YouTube

isws Jun 9, 2025

Was Michelangelo really a turtle?
This question will be tackled by the research task force of @lysander07, who is presenting this LLM question answering and information extraction task, which analyses how reliable LLMs are when applied to historical documents.

#isws2025 #dh #digitalhumanities #semanticweb #semweb #knowledgegraphs #llms #informationextraction #renaissance #michelangelo #TMNT