Discover how CRITICBENCH tests AI by sampling “convincing wrong answers” to reveal subtle flaws in model reasoning and accuracy.