Planning to attend #ACL2026? Consider joining our virtual workshop #CHum2026 on computational humour! We have a great lineup of 8 paper presentations plus invited talks by Gil Greengross, Brad Templeton, Joe Toplyn, and Laura E. Little. https://chumweb.org/

My master's student Lukáš Eigler just defended his thesis (co-supervised with David Hurych from Valeo.ai) 🎉 Congrats!

#NLP metric validation needs 🐌💰 human judgment data. Our fix: generate synthetic data for metric validation instead. ✅ Tested on MT, QA, summarization.

To appear at #ACL2026 Student Research Workshop:
https://arxiv.org/abs/2603.09403

#NLProc #MachineLearning

LLM as a Meta-Judge: Synthetic Data for NLP Evaluation Metric Validation

Validating evaluation metrics for NLG typically relies on expensive and time-consuming human annotations, which predominantly exist only for English datasets. We propose LLM as a Meta-Judge, a scalable framework that utilizes LLMs to generate synthetic evaluation datasets via controlled semantic degradation of real data, replacing human judgment. We validate our approach using meta-correlation, measuring the alignment between metric rankings derived from synthetic data and those from standard human benchmarks. Experiments across Machine Translation, Question Answering, and Summarization demonstrate that synthetic validation serves as a reliable proxy for human judgment, achieving meta-correlations exceeding 0.9 in multilingual QA, and proves to be a viable alternative where human judgments are unavailable or too expensive to obtain. Our code and data are publicly available at https://github.com/eiglerl/meta-judge.
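For readers who want a feel for what "meta-correlation" could look like in practice, here is a minimal sketch. It assumes the meta-correlation is a Spearman rank correlation between per-metric scores obtained on synthetic data and on human benchmarks; the abstract does not spell out the exact statistic, so that choice, the function names, and all numbers below are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def meta_correlation(synthetic_scores, human_scores):
    """Spearman correlation between two per-metric score tables (assumed definition)."""
    metrics = sorted(synthetic_scores)
    s = [synthetic_scores[m] for m in metrics]
    h = [human_scores[m] for m in metrics]
    return pearson(ranks(s), ranks(h))

# Hypothetical numbers: how well each metric tracks synthetic vs. human labels.
synthetic = {"BLEU": 0.41, "chrF": 0.52, "BERTScore": 0.66, "COMET": 0.78}
human = {"BLEU": 0.35, "chrF": 0.49, "BERTScore": 0.71, "COMET": 0.69}
print(round(meta_correlation(synthetic, human), 3))
```

A value near 1 would mean the synthetic data ranks the metrics in almost the same order as the human benchmark does. With SciPy installed, `scipy.stats.spearmanr` computes the same statistic.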

Proud to share that the RCLN team of the LIPN laboratory will be presenting two papers at premier IR and NLP conferences this July! 🏆

1️⃣ SIGIR 2026: Voronoi Token Pruning in Late-Interaction Models

By Yash Kankanampati, Yuxuan Zong, Nadi Tomeh, Benjamin Piwowarski, and Joseph Le Roux.

🔗 https://arxiv.org/abs/2603.09933

2️⃣ ACL 2026: Emergent Mention Detection in LLMs, accepted at #ACL2026

By Victor Morand, Nadi Tomeh, Josiane Mothe, and Benjamin Piwowarski.

🔗 https://arxiv.org/abs/2510.19410

We look forward to an exciting month of July sharing these advancements!

A Voronoi Cell Formulation for Principled Token Pruning in Late-Interaction Retrieval Models

Late-interaction models such as ColBERT offer competitive performance across various retrieval tasks but require storing a dense embedding for each document token, leading to a substantial index storage overhead. Past works address this by attempting to prune low-importance token embeddings based on statistical and empirical measures, but they often either lack formal grounding or are ineffective. To address these shortcomings, we introduce a framework grounded in hyperspace geometry and cast token pruning as a Voronoi cell estimation problem in the embedding space. By interpreting each token's influence as a measure of its Voronoi region, our approach enables principled pruning that retains retrieval quality while reducing index size. Through our experiments, we demonstrate that this approach serves not only as a competitive pruning strategy but also as a valuable tool for improving and interpreting token-level behavior within dense retrieval systems.
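To make the idea of "a token's influence as a measure of its Voronoi region" more concrete, here is a rough sketch under strong assumptions: unit-norm embeddings, dot-product similarity, and a plain Monte Carlo estimate of how much of the unit sphere each token "wins". The paper's actual estimator and pruning rule are not described in this post, so every function name and parameter below is hypothetical.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Sample a direction uniformly on the unit sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def estimate_cell_shares(token_embeddings, num_samples=2000, seed=0):
    """Estimate the fraction of random queries for which each token is the best match."""
    rng = random.Random(seed)
    dim = len(token_embeddings[0])
    counts = [0] * len(token_embeddings)
    for _ in range(num_samples):
        q = random_unit_vector(dim, rng)
        # The token with the highest dot product owns this sample point.
        best = max(
            range(len(token_embeddings)),
            key=lambda i: sum(a * b for a, b in zip(q, token_embeddings[i])),
        )
        counts[best] += 1
    return [c / num_samples for c in counts]

def prune(token_embeddings, keep, **kwargs):
    """Return the indices of the `keep` tokens with the largest estimated cells."""
    shares = estimate_cell_shares(token_embeddings, **kwargs)
    ranked = sorted(range(len(shares)), key=lambda i: shares[i], reverse=True)
    return sorted(ranked[:keep])

# Toy 2-D example: the token at angle 0.05 is squeezed between two close neighbours.
angles = [0.0, 0.05, 0.1, 2.0, 4.0]
tokens = [[math.cos(a), math.sin(a)] for a in angles]
shares = estimate_cell_shares(tokens)
kept = prune(tokens, keep=4)
print(shares, kept)
```

In this toy setup the squeezed token owns only a sliver of the circle, so it is the one dropped when keeping four of the five embeddings. Real query distributions are not uniform, which is one reason this should be read as an illustration rather than a usable pruning method.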

Well done. #ACL has removed ("desk rejected") from the #ACL2026 program and proceedings over 100 accepted papers that contained citations to non-existent literature:
"The inclusion of these non-existent references is a clear violation of the ACL Policy on Publication Ethics."
Some authors may have already made travel plans. Statement at: https://2026.aclweb.org/acl_statement/
ACL Statement on Desk Rejecting Papers with Hallucinated References

Official website for the 64th Annual Meeting of the Association for Computational Linguistics

🔎🤖 How does generative AI change web search?
At #ACL2026 🇺🇸, AISOC presents research on:
📄 generative vs. traditional search
🌐 source selection & diversity
⚖️ transparency & trust
💡 Key point: AI doesn’t just retrieve information – it reshapes what users see.
https://rc-trust.ai/news/news-detail/how-generative-ai-is-changing-web-search
#TrustworthyAI #NLP #MachineLearning #AIResearch #DigitalSociety
What can AI learn about companies beyond text? 🤔
At #ACL2026, Ivan Habernal & team present MONETA
→ combining language, satellite data 🛰️ & maps 🗺️
A step toward more trustworthy AI 🌍
https://rc-trust.ai/news/news-detail/verifying-data-about-companies-goes-multi-modal
Photo credit: ACL 2026 conference website – https://2026.aclweb.org/

What a week! 🌟

Our group is celebrating a "triple win" with three papers accepted at #ACL2026, #CVPR2026 Workshop, and #FG2026!

ACL
ProToM: Promoting Prosocial Behaviour via Theory of Mind-Informed Feedback

CVPR Workshop
UP-FacE: User-predictable Fine-grained Face Shape Editing

FG
DiffEyeSyn: User-specific Subtle Eye Movement Synthesis Using Diffusion Models

Congratulations to the team!
For preprints and updates, feel free to visit our website: https://www.collaborative-ai.org/

Collaborative Artificial Intelligence

Our group conducts fundamental research towards collaborative artificial intelligence (CAI) at the intersection of multimodal machine learning, computational cognitive modelling, computer vision, and human-machine interaction.

TrustHLT goes to #ACL2026! Our paper, with Arda Yüksel as lead author, was accepted to the main conference.

Check the cool viz and the full paper package: https://trusthlt.github.io/Moneta/

🎉 Thrilled to announce that the paper "PUMA: Projected Universal Multilingual ASR for Low-Resource Settings" has been accepted at the #ACL2026 Findings Conference!
Congratulations to Ilyes Oukid, Bilal Faye, Hanane Azzag, Mustapha Lebbah and Said Yacine Boulahia, all proud members of the Laboratoire d'Informatique de Paris-Nord (LIPN).
This work was carried out in collaboration with the DAVID Laboratory at University Versailles Saint-Quentin-en-Yvelines / Université Paris-Saclay.

See you in San Diego, USA!

Thanks to a bit of luck, I will be at #ACL2026 to present our article "Overcoming Copyright Barriers in Corpus Distribution Through Non-Reversible Hashing" (joint work with @VincentLabatut, Xavier Bost and Hen-Hsen Huang). More details and preprint coming soon!