
Google’s new FACTS benchmark reveals a 70% factuality ceiling across four rigorous tests, spanning grounding, multimodal, and search scenarios. Even Gemini 3 Pro struggles to break that barrier, highlighting the limits of large language models on Kaggle-style tasks. Dive into the data and see what this means for open-source AI research. #FACTSbenchmark #Gemini3Pro #GroundingBenchmark #Factuality