AI at the edge is an infrastructure puzzle. Red Hat is helping solve it by contributing llm-d to the #CNCF, establishing "well-lit paths" for AI-RAN orchestration with SoftBank. 🐧

This is about optimization—making inference a first-class citizen alongside traditional containers.

Proud to see Red Hat continuing our legacy of open-source leadership, from #Kubernetes and #etcd to #KEDA and now #llmd.

Read more: https://www.redhat.com/en/blog/how-llm-d-brings-critical-resource-optimization-softbanks-ai-ran-orchestrator

#RedHat #AI #OpenSource #KubeCon #CloudNative

How llm-d brings critical resource optimization with SoftBank’s AI-RAN orchestrator

In Red Hat’s latest collaboration with SoftBank Corp., we have integrated llm-d into SoftBank’s AI-RAN orchestrator, AITRAS.

Red Hat is contributing llm-d to the #CNCF, turning fragmented AI into modular, interoperable microservices. 🐧

The goal? Make AI inference a first-class citizen in the same cloud-native environment as your traditional apps.

I love how Red Hat continues to fuel the #OpenSource ecosystem. From our roots in #Kubernetes and #etcd to newer projects like #KEDA and #CRI-O, we’re committed to building "well-lit paths" for everyone.

#RedHat #KubeCon #CloudNativeCon #AI #llmd

https://www.redhat.com/en/blog/why-were-contributing-llm-d-cncf-standardizing-future-ai?sc_cid=701f2000000txokAAA&utm_source=bambu&utm_medium=organic_social

Why we’re contributing llm-d to the CNCF: Standardizing the future of AI

Red Hat is contributing llm-d to the Cloud Native Computing Foundation (CNCF) as a Sandbox project to standardize high-performance, distributed AI inference serving within the cloud-native stack. This contribution aims to bridge the capabilities gap between AI experimentation and production by providing a specialized data-plane orchestration layer that maximizes infrastructure efficiency and enables flexible deployment on any choice of hardware.

#etcd is #k8s's key-value store where all cluster state lives, including secrets. By default, secrets are only base64-encoded in etcd, not encrypted. If someone gets etcd access (a backup file, a snapshot, direct port access), they get all your secrets in plaintext. You can enable encryption at rest for etcd, but most people don't set it up, and even then the key material sits inside your #cluster
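A quick way to convince yourself that base64 is reversible encoding, not encryption (the secret value here is made up):

```shell
# base64 reverses with no key at all -- anyone holding the bytes can decode them.
encoded=$(printf '%s' 'hunter2' | base64)
printf '%s\n' "$encoded"                      # aHVudGVyMg==
printf '%s' "$encoded" | base64 -d; echo      # hunter2
```

The same applies to the values kube-apiserver writes under `/registry/secrets/...` in etcd when encryption at rest is off.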
#agenix decrypts the .age file → feeds the #secretbox key to kube-apiserver → apiserver uses it for etcd. The failure happened at the agenix layer (wrong key in the .age file), not in secretbox itself.
RBAC defeats: A compromised pod, a stolen kubeconfig, a rogue user — anyone who tries to read secrets through the Kubernetes API without sufficient permissions. They hit the apiserver, RBAC says no, they get a 403.
secretbox defeats: Someone who bypasses the API entirely — steals the etcd data directory, takes an etcd snapshot from a backup, reads etcd directly over its client port without going through kube-apiserver. RBAC never runs in this scenario because the attacker never talked to kube-apiserver.
The critical insight: secretbox does nothing if the attacker has API access, and RBAC does nothing if the attacker has disk access. They cover completely non-overlapping attack surfaces.
The problem hit here would have been identical with #SQLite: the encryption layer lives in kube-apiserver, not in the storage backend. But SQLite's operational simplicity would have made recovery easier, since inspecting and backing up the database is much more straightforward than #etcd snapshot management.
#kubernetes
troubles of secret #provisioning with #agenix and age
with #etcd secretbox
and why #sops matters
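The encryption-at-rest setup this thread describes boils down to an EncryptionConfiguration handed to kube-apiserver; a minimal sketch, with the key name and value as placeholders:

```yaml
# EncryptionConfiguration for kube-apiserver (--encryption-provider-config)
# Key name and secret value are illustrative, not from the thread.
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - secretbox:
          keys:
            - name: key1
              secret: <base64-encoded 32-byte key>  # e.g. from: head -c 32 /dev/urandom | base64
      - identity: {}  # fallback so not-yet-encrypted data stays readable
```

In the agenix setup above, the `.age` file is what holds (and failed to hold) that `secret:` value; secretbox itself never saw a valid key.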

etcd operator 0.2 has been released!

https://etcd.io/blog/2026/announcing-etcd-operator-v0.2.0/

This now makes the operator useful for production, or at least staging, use cases, and brings it up to the functionality of the old operator -- plus better handling of TLS.

Take a look!

#kubernetes #etcd #operators

Announcing etcd-operator v0.2.0

Today, we are excited to announce the release of etcd-operator v0.2.0! This release brings important new features and improvements that enhance security, reliability, and operability for managing etcd clusters. Version 0.2.0 introduces built-in certificate management to secure all TLS communication: between etcd members (inter-member communication), and between clients and etcd members. TLS is only configured when explicitly enabled by the user; once enabled, etcd-operator automatically provisions and manages certificates based on the selected provider.

etcd

⚠️ NEW: Kubernetes Swap & etcd Stability!

Prevent control plane hangs with proper swap configuration. etcd performance tuning & swapfile best practices for production K8s.

📖 Read: https://devopstales.github.io/kubernetes/k8s-swap-etcd-stability/?utm_source=twitter&utm_medium=social

#Kubernetes #etcd #Performance #SRE #K8s

Kubernetes Swap and etcd Stability: Preventing Control Plane Hangs

When enabling swap on Kubernetes nodes, you might encounter a critical issue where misbehaving containers don’t get killed automatically. When this affects etcd, the API server ends up generating excessive load and consuming all available resources. This post explains the problem and provides two solutions.

DevOpsTales
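For reference, node swap is controlled through the kubelet configuration; a minimal sketch of the relevant KubeletConfiguration fields, assuming a Kubernetes version where NodeSwap is still behind a feature gate:

```yaml
# KubeletConfiguration fragment enabling swap on a node.
# LimitedSwap restricts swap use rather than allowing unbounded swapping.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false          # let the kubelet start on a node with swap enabled
featureGates:
  NodeSwap: true
memorySwap:
  swapBehavior: LimitedSwap
```

It is exactly this combination, swap available but containers unbounded, that lets a misbehaving workload hang instead of being OOM-killed.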

Swap on K8s nodes? Containers hang instead of OOM-killing—etcd suffers, control plane cascades. 2 fixes: resource limits + etcd HAProxy LB. Protect your cluster! 👇

https://devopstales.github.io/kubernetes/k8s-swap-etcd-stability/

#Kubernetes #etcd #SRE #DevOps #CloudNative #Stability

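The first fix (resource limits) can be sketched like this; the names and sizes are illustrative, the point being that a hard memory limit lets the kernel OOM-kill a runaway container instead of letting it swap indefinitely:

```yaml
# Pod with explicit memory limits; names, image, and sizes are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest
      resources:
        requests:
          memory: "256Mi"
          cpu: "250m"
        limits:
          memory: "512Mi"  # hard ceiling: exceeding it triggers an OOM kill,
          cpu: "500m"      # not unbounded swapping that starves etcd
```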

What reason would there be to separate etcd out of the Kubernetes manifests on the control plane nodes, but keep it as a native service installed on the same machines that run the control plane?

There's nothing to gain in terms of high availability there.

You still have X amount of control plane nodes that also run etcd as cluster nodes.

I'm trying to figure out what my predecessor thought while building this Kubernetes environment.

The two etcd topologies mentioned in official K8S docs are:

- integrated (stacked) etcd (etcd as a Kubernetes manifest, started as a container together with coredns, kube-apiserver, and so on)

- separated etcd nodes (X amount of machines that host etcd as a native service on the OS, with the control plane configured to use them)

#kubernetes #devops #k8s #etcd
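For comparison, the "separated etcd nodes" topology is what kubeadm calls external etcd; a minimal ClusterConfiguration sketch with placeholder endpoints and certificate paths:

```yaml
# kubeadm ClusterConfiguration pointing the control plane at external etcd.
# Endpoints and file paths are placeholders.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
etcd:
  external:
    endpoints:
      - https://10.0.0.11:2379
      - https://10.0.0.12:2379
      - https://10.0.0.13:2379
    caFile: /etc/etcd/ca.crt
    certFile: /etc/etcd/apiserver-etcd-client.crt
    keyFile: /etc/etcd/apiserver-etcd-client.key
```

A setup with etcd as a native systemd service on the same control-plane machines is effectively this configuration with the endpoints pointing at localhost, which, as the post says, buys no extra availability.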

In my (short) dad time this morning, I tried to install mgmt [1] to run a distributed hello world on my main machine running Ubuntu LTS. The prebuilt binaries depend on augeas, which was easy to fix, but also on libvirt, which is surprisingly old on Ubuntu compared to Debian (latest). I tried to build it myself but couldn't install nex (the lexer), so I ended up building the binary using Docker thanks to the quick start guide.

I first ran mgmt in standalone mode. It's nice to see etcd embedded in the binary (at least for testing). Then I tried to deploy multiple mgmt nodes with a standalone etcd using docker-compose, and lost a lot of time trying to override the command because I didn't remember the expected syntax.

I was trying to make etcd listen to all interfaces so mgmt could connect when my daughter showed up.

[1] https://github.com/purpleidea/mgmt (@purpleidea)

#mgmt #homelab #selfhosting #etcd #docker #libvirt #ubuntu #debian
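The listen-on-all-interfaces part usually comes down to etcd's client-URL flags; a docker-compose sketch (service name, image tag, and advertised address are assumptions, not from the post):

```yaml
# docker-compose service overriding etcd's command so it listens on all
# interfaces; other containers reach it via the advertised URL.
services:
  etcd:
    image: gcr.io/etcd-development/etcd:v3.5.17
    command:
      - etcd
      - --name=etcd0
      - --listen-client-urls=http://0.0.0.0:2379
      - --advertise-client-urls=http://etcd:2379
    ports:
      - "2379:2379"
```

Note that etcd requires --advertise-client-urls to be set whenever --listen-client-urls is overridden.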

Commits · purpleidea/mgmt

Next generation distributed, event-driven, parallel config management!

GitHub