MathArena.ai

MathArena: Evaluating LLMs on Uncontaminated Math Benchmarks

🤖📉 "AI #struggles to make it past the #math playground, aiming for the Olympiad podium but barely earning a participation ribbon. 🎖️ Attempting to turn equations into entertainment, MathArena's latest brainwave is evaluating bots on math tests most humans cringe at. Maybe next time, they'll try teaching #AI to count its own errors first. 😂"
https://matharena.ai/imo/ #MathOlympiad #MathArena #TechHumor #ParticipationRibbon #HackerNews #ngated

Gemini 2.5 gets 24.4% on MathArena USAMO, beating the previous top score of 4.7%

https://matharena.ai/

#HackerNews #Gemini2.5 #MathArena #USAMO #Score #TechNews #AI