#ArtificialAnalysis published a #benchmark comparing the performance of #OpenAI’s gpt-oss-120b across different hosted providers. The results showed significant variance. This highlights a challenge for customers of open weight models: #performance can vary depending on the #provider and its implementation. https://simonwillison.net/2025/Aug/15/inconsistent-performance/?eicker.news #tech #media #news
Open weight LLMs exhibit inconsistent performance across providers

Artificial Analysis published a new benchmark the other day, this time focusing on how an individual model—OpenAI’s gpt-oss-120b—performs across different hosted providers. The results showed some surprising differences. Here’s the …

Simon Willison’s Weblog