Meta's new LLM fails the only benchmark that matters