Katja Hose

@katjahose
296 Followers
112 Following
118 Posts

Professor of Computer Science, TU Wien, Austria; Web & Data Science, in particular databases, managing & querying knowledge & graph data

#databases #knowledgegraphs #datascience #graphData

Homepage: katja-hose.de
Twitter: twitter.com/HoseKatja

This morning, I had the pleasure of giving a keynote at EGC 2026 on
"Reliable Knowledge in the Age of Generative AI: From Noisy Data to Trustworthy Agents"

The talk reflected on the role of knowledge graphs and data quality in the age of generative AI, and on why reliable, structured knowledge remains essential for hybrid AI systems. It also touched on hallucinations and on how these foundations relate to agentic systems.

Many thanks to Fatiha Saïs, the organizing team, and the audience.

🎬 DMKI Lab Highlights 2025

A snapshot of a year of research, collaborations, and community at the DMKI Lab (Data Management & Knowledge-Driven AI) at TU Wien.

Thanks to the team, collaborators, and partners — more to come in 2026!

#DMKILab #TUWien #KnowledgeGraphs #AI

2 #Postdoc positions at TU Wien (Vienna, Austria)

1️⃣ Data Warehousing & System Performance
https://jobs.tuwien.ac.at/Job/259680

2️⃣ Natural Language Interfaces, LLMs, Exploratory Data Access
https://jobs.tuwien.ac.at/Job/259683

Application Deadline: Nov 13, 2025
Details: https://dmki-tuwien.github.io/jobs.html

University Assistant Post-Doc (all genders)

🎓 PhD Position @ TU Wien (Vienna, Austria) — start Jan 2026

Join us to do research on knowledge graph embeddings.

Application deadline: November 6, 2025

🔗 Apply here: https://jobs.tuwien.ac.at/Job/259110

ℹ️ More details: https://dmki-tuwien.github.io/jobs.html

#PhD #AI #KnowledgeGraphs #MachineLearning #Vienna

University Assistant Prae-Doc (all genders)

Maxime Jakubowski presenting our work on "RDFGraphGen: An #RDF Graph Generator based on #SHACL Shapes" at #IJCKG25.

Authors:
Milos Jovanovik, Marija Vecovska, Maxime Jakubowski, Katja Hose

GitHub:
https://github.com/etnc/RDFGraphGen

@ymilosy
@katjahose
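As a rough illustration of the idea behind a SHACL-driven graph generator (this is not RDFGraphGen's actual API — the shape, namespaces, and function below are hypothetical), the core step can be sketched in plain Python: a parsed SHACL NodeShape's target class and property constraints drive the emission of conforming synthetic triples in N-Triples form.

```python
import random

# Hypothetical stand-in for a parsed SHACL NodeShape: a target class plus
# property constraints (property path, sh:minCount, and sample values).
PERSON_SHAPE = {
    "target_class": "http://example.org/Person",
    "properties": [
        {"path": "http://xmlns.com/foaf/0.1/name",
         "min_count": 1,
         "values": ['"Alice"', '"Bob"', '"Carol"']},
        {"path": "http://xmlns.com/foaf/0.1/age",
         "min_count": 1,
         "values": ['"29"', '"41"', '"35"']},
    ],
}

def generate_instances(shape, n, seed=0):
    """Generate n subjects conforming to the shape, as N-Triples lines."""
    rng = random.Random(seed)
    triples = []
    for i in range(n):
        subj = f"<http://example.org/person{i}>"
        # Every generated subject is typed with the shape's target class.
        triples.append(
            f"{subj} <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
            f"<{shape['target_class']}> ."
        )
        for prop in shape["properties"]:
            # Emit at least min_count values for each constrained property.
            for _ in range(prop["min_count"]):
                triples.append(
                    f"{subj} <{prop['path']}> {rng.choice(prop['values'])} ."
                )
    return triples

print(len(generate_instances(PERSON_SHAPE, 3)))  # prints 9 (3 type + 6 property triples)
```

A real generator would of course parse the shapes from a Turtle file and honor the full constraint vocabulary (datatypes, sh:maxCount, value ranges); this sketch only shows the shape-to-instances direction of the pipeline.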

This morning I gave the keynote at IJCKG 2025 in Heraklion:
Sharp Edges in a Fuzzy World — Knowledge Graphs and LLMs

How can we combine the fuzzy nature of LLMs with the structured precision of knowledge graphs to build AI that we can rely on?

A wonderful discussion with an inspiring community - thank you to everyone who joined!

#IJCKG2025 #KnowledgeGraphs #LLMs

Towards Computer-Using Personal Agents

Piero A. Bonatti, John Domingue, Anna Lisa Gentile, Andreas Harth, Olaf Hartig, Aidan Hogan, Katja Hose, Ernesto Jimenez-Ruiz, Deborah L. McGuinness, Chang Sun, Ruben Verborgh, Jesse Wright
https://arxiv.org/abs/2503.15515 https://arxiv.org/pdf/2503.15515 https://arxiv.org/html/2503.15515

arXiv:2503.15515v1 Announce Type: new
Abstract: Computer-Using Agents (CUA) enable users to automate increasingly complex tasks using graphical interfaces such as browsers. As many potential tasks require personal data, we propose Computer-Using Personal Agents (CUPAs) that have access to an external repository of the user's personal data. Compared with CUAs, CUPAs offer users better control of their personal data, the potential to automate more tasks involving personal data, better interoperability with external sources of data, and better capabilities to coordinate with other CUPAs in order to solve collaborative tasks involving the personal data of multiple users.


MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations

Ernests Lavrinovics, Russa Biswas, Katja Hose, Johannes Bjerva
https://arxiv.org/abs/2505.14101 https://arxiv.org/pdf/2505.14101 https://arxiv.org/html/2505.14101

arXiv:2505.14101v1 Announce Type: new
Abstract: Large Language Models (LLMs) have inherent limitations of faithfulness and factuality, commonly referred to as hallucinations. Several benchmarks have been developed that provide a test bed for factuality evaluation within the context of English-centric datasets, while relying on supplementary informative context like web links or text passages but ignoring the available structured factual resources. To this end, Knowledge Graphs (KGs) have been identified as a useful aid for hallucination mitigation, as they provide a structured way to represent the facts about entities and their relations with minimal linguistic overhead. We bridge the lack of KG paths and multilinguality for factual language modeling within the existing hallucination evaluation benchmarks and propose a KG-based multilingual, multihop benchmark called MultiHal framed for generative text evaluation. As part of our data collection pipeline, we mined 140k KG-paths from open-domain KGs, from which we pruned noisy KG-paths, curating a high-quality subset of 25.9k. Our baseline evaluation shows an absolute scale increase by approximately 0.12 to 0.36 points for the semantic similarity score in KG-RAG over vanilla QA across multiple languages and multiple models, demonstrating the potential of KG integration. We anticipate MultiHal will foster future research towards several graph-based hallucination mitigation and fact-checking tasks.


Vol. 18, No. 6 → PlanRGCN: Predicting SPARQL Query Performance
👥 Authors: Abiram Mohanaraj, Matteo Lissandrini, Katja Hose
📄 PDF: https://www.vldb.org/pvldb/vol18/p1621-mohanaraj.pdf

Vol. 18, No. 8 → The Limits of Graph Samplers for Training Inductive Recommender Systems
👥 Authors: Theis Jendal, Matteo Lissandrini, Peter Dolog, Katja Hose
📄 PDF: https://www.vldb.org/pvldb/vol18/p2496-jendal.pdf