@iamtherockstar I helped move Roblox to a containerized architecture, and I wrote the training program that helped employees spin up new services and deploy them.
Roblox is actually big enough to justify k8s, but we didn't roll it out. Mostly based on the SRE pushback and capacity. But the devs really wanted it.
When I drilled into what the devs really wanted, it was less about k8s itself and more about specific features supported by the k8s tooling. Not just orchestration, but things like Istio as well.
Istio and Service Mesh would have made devs' lives far easier. And the Platform team's as well. But it was also "free" to them because managing it becomes an "ops problem".
In our case it was less about architecture and more about moving faster by offloading work.
Ultimately, as you noted, that cost is real. And most orgs really can't staff that. So they just end up handing it off to a cloud.
@gatesvp Also, I can see Roblox actually having a scale issue where k8s is helpful, e.g. on-demand autoscaling, resource management, etc. I suspect that your architecture ended up being more purpose-built for your application(s), which probably meant you spent more time in design than the folks who just take k8s off the shelf and run with it.
If a business evaluates things seriously and goes with k8s, cool. It's the k8s-as-default-infrastructure that is an issue.
@iamtherockstar Oh, we absolutely had the scale to justify such a thing. But we notably made it that far without actually having such a thing.
My research on that front made it pretty clear that k8s was powerful, but had a high barrier to entry. And this research was literally part of my job.
This was back in the 2020-2022 range, and I concluded back then that the vast majority of applications were best served by some form of hosted platform, like Heroku, Azure Serverless, AWS Serverless, etc. There were already so many tools that could run your services for you while offering really high-availability databases/queues/caches.
If I were at a start-up tomorrow, I would be leveraging those as much as possible. Even hosted k8s would be way down on my priority list. There are so many steps on that staircase before k8s.