My master's student Lukáš Eigler just defended his thesis (co-supervised with David Hurych from Valeo.ai). Congrats!
#NLP metric validation needs expensive human judgment data. Our fix: generate synthetic data for metric validation instead. Tested on MT, QA, and summarization.
To appear at #ACL2026 Student Research Workshop:
https://arxiv.org/abs/2603.09403

Validating evaluation metrics for NLG typically relies on expensive and time-consuming human annotations, which predominantly exist only for English datasets. We propose LLM as a Meta-Judge, a scalable framework that utilizes LLMs to generate synthetic evaluation datasets via controlled semantic degradation of real data, replacing human judgment. We validate our approach using meta-correlation, measuring the alignment between metric rankings derived from synthetic data and those from standard human benchmarks. Experiments across Machine Translation, Question Answering, and Summarization demonstrate that synthetic validation serves as a reliable proxy for human judgment, achieving meta-correlations exceeding 0.9 in multilingual QA, and that it is a viable alternative where human judgments are unavailable or too expensive to obtain. Our code and data are publicly available at https://github.com/eiglerl/meta-judge.
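A minimal sketch of the meta-correlation idea described in the abstract: score each metric once against synthetic data and once against a human benchmark, then correlate the two resulting rankings. The choice of Spearman rank correlation, the function names, and all numbers below are illustrative assumptions, not taken from the paper or its repository.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied group
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def meta_correlation(synthetic_scores, human_scores):
    """Spearman correlation between two per-metric quality tables.

    Each argument maps a metric name to how well that metric agrees
    with the synthetic data or with the human benchmark.
    """
    metrics = sorted(set(synthetic_scores) & set(human_scores))
    syn = [synthetic_scores[m] for m in metrics]
    hum = [human_scores[m] for m in metrics]
    return pearson(ranks(syn), ranks(hum))


# Hypothetical numbers, for illustration only.
synthetic = {"BLEU": 0.41, "chrF": 0.48, "BERTScore": 0.62, "COMET": 0.71}
human = {"BLEU": 0.35, "chrF": 0.44, "BERTScore": 0.40, "COMET": 0.66}
print(meta_correlation(synthetic, human))
```

A value close to 1 would mean the synthetic data orders the metrics almost the same way the human benchmark does.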
Proud to share that the RCLN team of the LIPN laboratory will be presenting two papers at premier IR and NLP conferences this July!
1️⃣ SIGIR 2026: Voronoi Token Pruning in Late-Interaction Models
By Yash Kankanampati, Yuxuan Zong, Nadi Tomeh, Benjamin Piwowarski, and Joseph Le Roux.
https://arxiv.org/abs/2603.09933
2️⃣ ACL 2026: Emergent Mention Detection in LLMs (accepted at #ACL2026)
By Victor Morand, Nadi Tomeh, Josiane Mothe, and Benjamin Piwowarski.
https://arxiv.org/abs/2510.19410
We look forward to an exciting month of July sharing these advancements!

Late-interaction models such as ColBERT offer competitive performance across various retrieval tasks but require storing a dense embedding for each document token, leading to a substantial index storage overhead. Past works address this by attempting to prune low-importance token embeddings based on statistical and empirical measures, but they often either lack formal grounding or are ineffective. To address these shortcomings, we introduce a framework grounded in hyperspace geometry and cast token pruning as a Voronoi cell estimation problem in the embedding space. By interpreting each token's influence as a measure of its Voronoi region, our approach enables principled pruning that retains retrieval quality while reducing index size. Through our experiments, we demonstrate that this approach serves not only as a competitive pruning strategy but also as a valuable tool for improving and interpreting token-level behavior within dense retrieval systems.
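A rough sketch of what "token pruning as Voronoi cell estimation" could look like. It assumes unit-normalised token embeddings, query directions drawn uniformly from the unit sphere, and a plain Monte Carlo estimator; the paper may use a different query distribution and estimator, and every name below is hypothetical.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly on the unit sphere (normalised Gaussians)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def estimate_cell_shares(token_embs, n_samples=5000, seed=0):
    """Monte Carlo estimate of the share of query space each token wins.

    A sampled query falls in the Voronoi cell of the token with the
    highest dot product, i.e. the token a max-similarity lookup would pick.
    """
    rng = random.Random(seed)
    dim = len(token_embs[0])
    wins = [0] * len(token_embs)
    for _ in range(n_samples):
        q = random_unit_vector(dim, rng)
        scores = [sum(a * b for a, b in zip(q, t)) for t in token_embs]
        wins[scores.index(max(scores))] += 1
    return [w / n_samples for w in wins]


def prune_tokens(token_embs, keep):
    """Return the indices of the `keep` tokens with the largest estimated cells."""
    shares = estimate_cell_shares(token_embs)
    ranked = sorted(range(len(token_embs)), key=lambda i: shares[i], reverse=True)
    return sorted(ranked[:keep])


# Toy 2-D document: the token at 10 degrees is squeezed between its
# neighbours at 0 and 20 degrees, so its cell is small.
angles = (0, 10, 20, 180)
doc = [[math.cos(math.radians(a)), math.sin(math.radians(a))] for a in angles]
print(estimate_cell_shares(doc))
print(prune_tokens(doc, keep=3))
```

In this toy case the squeezed token rarely wins a query, so it is the first candidate for removal from the index.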
What a week!
Our group is celebrating a "triple win" with three papers accepted at #ACL2026, #CVPR2026 Workshop, and #FG2026!
ACL
ProToM: Promoting Prosocial Behaviour via Theory of Mind-Informed Feedback
CVPR Workshop
UP-FacE: User-predictable Fine-grained Face Shape Editing
FG
DiffEyeSyn: User-specific Subtle Eye Movement Synthesis Using Diffusion Models
Congratulations to the team!
For preprints and updates, feel free to visit our website: https://www.collaborative-ai.org/
TrustHLT goes to #ACL2026! Our paper, with Arda Yüksel as lead author, was accepted to the main conference.
Check the cool viz and the full paper package: https://trusthlt.github.io/Moneta/
Thrilled to announce that the paper "PUMA: Projected Universal Multilingual ASR for Low-Resource Settings" has been accepted to Findings of #ACL2026!
Congratulations to Ilyes Oukid, Bilal Faye, Hanane Azzag, Mustapha Lebbah and Said Yacine Boulahia, all proud members of the Laboratoire d'Informatique de Paris-Nord (LIPN).
This work was carried out in collaboration with the DAVID Laboratory at University of Versailles Saint-Quentin-en-Yvelines / Université Paris-Saclay.
See you in San Diego, USA!