We had a Python service using FastAPI and OpenAI that was fast in development but slowed down a lot under load 🤔
We used load testing to find the bottleneck, then profilers and call graph visualization to understand the issue. My new blog post describes the debugging process, the tools and techniques we used, and how we ended up making the service twice as fast 🎉
Check it out if you're into Python performance and debugging.