StepFun 3.5 Flash is #1 cost-effective model for OpenClaw tasks (300 battles)

https://app.uniclaw.ai/arena?tab=costEffectiveness&via=hn

OpenClaw Arena | UniClaw

A public benchmark for evaluating whether AI agents can complete real workflows. Compare model performance and cost-effectiveness on real agent tasks.

According to openrouter.ai, StepFun 3.5 Flash looks like the most popular model at 3.5T tokens, vs GLM 5 Turbo at 2.5T tokens. Claude Sonnet is in 5th place with 1.05T tokens. That isn't super surprising, as StepFun is about 5% the price of Sonnet.

https://openrouter.ai/apps?url=https%3A%2F%2Fopenclaw.ai%2F

OpenClaw | OpenRouter

The AI that actually does things. OpenClaw uses OpenRouter to access hundreds of AI models.

> the most popular model

It was free for a long time, and that usually skews the statistics. It was the same with grok-code-fast1.

Exactly. When I read the headline I thought: "Ofc it is, it's free."
I should have clarified I didn't use the free version...