Mastodawn

How We Broke Top AI Agent Benchmarks: And What Comes Next

If only the blog itself weren't written by AI.

>No reasoning. No capability. Just exploitation of how the score is computed.

shudder

I wonder what college freshman-level writing classes are teaching about writing voice and AI. The tell-tale patterns are pretty frustrating to read.

Whatever classes these guys took, they skipped the one on scientific misconduct.