📊 How Superhuman and Databricks built a 200K QPS inference platform together

From analytics partners to real-time inference partners: Superhuman, the productivity...

📰 Source: Databricks
🔗 Link: https://www.databricks.com/blog/how-superhuman-and-databricks-built-200k-qps-inference-platform-together

#DataScience

Superhuman began serving its grammar correction model via Databricks’ Foundation Model API, handling more than 200K QPS with p99 latency under 1 s. Through a close engineering partnership, the two teams optimized runtime performance to deliver a 60% throughput gain while maintaining four nines (99.99%) of availability.