📊 How Superhuman and Databricks built a 200K QPS inference platform together
From analytics partners to real-time inference partners
Superhuman, the productivity...
📰 Source: Databricks
🔗 Link: https://www.databricks.com/blog/how-superhuman-and-databricks-built-200k-qps-inference-platform-together

Superhuman began serving its grammar correction model via Databricks’ Foundation Model API, handling more than 200K QPS with p99 latency under 1s. Through a close engineering partnership, Superhuman and Databricks optimized runtime performance to deliver a 60% throughput gain while maintaining four nines (99.99%) of availability.
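To put those headline numbers in operational terms, here is a rough back-of-the-envelope sketch. It assumes the quoted figures can be treated as exact (200K QPS as a lower bound, availability of exactly 99.99%). The function names are illustrative and are not part of any Databricks or Superhuman API.

```python
# Back-of-the-envelope arithmetic for the figures quoted in the summary.
# Assumption: the published numbers are treated as exact values.

QPS = 200_000            # "more than 200K QPS", used here as a lower bound
AVAILABILITY = 0.9999    # "four nines"
P99_FRACTION = 0.99      # share of requests that fall within the p99 bound


def downtime_budget_minutes(availability: float, days: float) -> float:
    """Minutes of unavailability permitted over `days` at this availability."""
    return (1.0 - availability) * days * 24 * 60


def tail_requests_per_second(qps: float, percentile: float) -> float:
    """Requests per second that are allowed to exceed the percentile bound."""
    return qps * (1.0 - percentile)


if __name__ == "__main__":
    print(f"{downtime_budget_minutes(AVAILABILITY, 365):.1f} min/year")
    print(f"{downtime_budget_minutes(AVAILABILITY, 30):.2f} min/30 days")
    print(f"{tail_requests_per_second(QPS, P99_FRACTION):.0f} req/s may exceed the p99 bound")
```

Under these assumptions, four nines leaves roughly 52.6 minutes of downtime per year (about 4.32 minutes per 30 days), and a p99 target at 200K QPS still permits around 2,000 requests per second to take longer than 1s.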