AI at the edge is an infrastructure puzzle. Red Hat is helping solve it by contributing llm-d to the #CNCF, establishing "well-lit paths" for AI-RAN orchestration with SoftBank. 🐧
This is about optimization: making inference a first-class citizen alongside traditional containers.
Proud to see Red Hat continuing our legacy of open-source leadership, from #Kubernetes and #etcd to #KEDA and now #llmd.
Red Hat is contributing llm-d to the #CNCF, turning fragmented AI into modular, interoperable microservices. 🐧
The goal? Make AI inference a first-class citizen in the same cloud-native environment as your traditional apps.
I love how Red Hat continues to fuel the #OpenSource ecosystem. From our roots in #Kubernetes and #etcd to newer projects like #KEDA and #CRI-O, we’re committed to building "well-lit paths" for everyone.

Red Hat is contributing llm-d to the Cloud Native Computing Foundation (CNCF) as a Sandbox project to standardize high-performance, distributed AI inference serving within the cloud-native stack. This contribution aims to bridge the capabilities gap between AI experimentation and production by providing a specialized data-plane orchestration layer that maximizes infrastructure efficiency and enables flexible deployment on any choice of hardware.
Learn the critical failure points when running LLM inference on Kubernetes, including resource constraints, operator compatibility, security, scalability, and monitoring best practices for production workloads.
#Kubernetes #LLM Inference #Dynatrace #GPU Resource Allocation #Service Mesh #Network Policies #KEDA #Triton Inference Server #Redis #Prometheus
https://dasroot.net/posts/2026/02/running-llm-inference-on-kubernetes-what-breaks-first/

Blog - I had a new bit of learning with Kubernetes Event Driven Autoscaling today
Changing Piefed Worker Scaling to be Based on Queue Size in Kubernetes with KEDA
I recently caused myself a bit of a minor issue by installing some updates on the Keyboard Vagabond cluster. It wasn’t a big deal, just s…
A real-world case of configuring Pod Autoscaling in k8s from a developer's perspective
2026 is just around the corner, and I want to share my journey of moving an application onto Kubernetes infrastructure. The hardest and most interesting part is precisely the autoscaling setup. Is the topic overdone? I don't think so, because I'll be telling the story from the standpoint of an application developer, not a DevOps engineer. Lucky me, I have no idea how all of this gets configured. I'll be explaining how it all works. Expect a minimum of Kubernetes configs and a maximum of reasoning and deep dives into metrics. There's a TL;DR at the end. Shall we?
https://habr.com/ru/articles/973936/
#kubernetes #hpa #horizontal_pod_autoscaler #keda #ec2 #cadvisor #k8s
From event-driven architectures to autoscaling, from #cloudnative #microservices to agentic AI, from corporate to #opensource and startups - the latest episode of OpenObservability Talks has it all!
I invited the co-creator of #Dapr & #KEDA, @yaronschneider, to give us the grand tour:
https://medium.com/p/eb2f4013d9a1
Autoscaling Kubernetes cluster nodes. Part 2
Hi everyone! This is Ilya Smirnov again, a solutions architect from