etcd operator 0.2 has been released!
https://etcd.io/blog/2026/announcing-etcd-operator-v0.2.0/
This now makes the operator useful for production, or at least staging, use-cases, and brings it up to the functionality of the old operator -- plus better handling of TLS.
Take a look!
Introduction
Today, we are excited to announce the release of etcd-operator v0.2.0! This release brings important new features and improvements that enhance security, reliability, and operability for managing etcd clusters.
New Features
Certificate Management
Version 0.2.0 introduces built-in certificate management to secure all TLS communication:
- Between etcd members (inter-member communication)
- Between clients and etcd members
TLS is only configured when explicitly enabled by the user. Once enabled, etcd-operator automatically provisions and manages certificates based on the selected provider.
⚠️ NEW: Kubernetes Swap & etcd Stability!
Prevent control plane hangs with proper swap configuration. etcd performance tuning & swapfile best practices for production K8s.

When enabling swap on Kubernetes nodes, you might encounter a critical issue where misbehaving containers don’t get killed automatically. When this affects etcd, the API server ends up generating excessive load and consuming all available resources. This post explains the problem and provides two solutions.
Swap on K8s nodes? Containers hang instead of OOM-killing—etcd suffers, control plane cascades. 2 fixes: resource limits + etcd HAProxy LB. Protect your cluster! 👇
https://devopstales.github.io/kubernetes/k8s-swap-etcd-stability/

What reason would there be to separate etcd out of the Kubernetes manifests of the control plane nodes but keep it as a native service installed on the same machines running the control plane?
There's nothing to gain in terms of high availability there.
You still have X amount of control plane nodes that also run etcd as cluster nodes.
I'm trying to figure out what my predecessor thought while building this Kubernetes environment.
The two etcd topologies mentioned in official K8S docs are:
- integrated etcd (etcd as a Kubernetes Manifest, started as containers together with coredns, kube-apiserver and so on)
- separate etcd nodes (X amount of machines that host etcd as a native service on the OS, with the control plane configured to use them)
In my (short) dad time this morning, I tried to install mgmt [1] to run a distributed hello world on my main machine, which runs Ubuntu LTS. The prebuilt binaries depend on augeas, which was easy to fix. They also depend on libvirt, which is surprisingly old on Ubuntu compared to Debian (latest). I tried to build it myself but I couldn't install nex (the lexer). I then built the binary using Docker thanks to the quick start guide.
I first ran mgmt in standalone mode. It's nice to see etcd embedded in the binary (at least for testing). Then I tried to deploy multiple mgmt nodes with a standalone etcd using docker-compose. I lost a lot of time trying to override the command because I didn't remember the expected syntax.
I was trying to make etcd listen on all interfaces so mgmt could connect when my daughter showed up.
[1] https://github.com/purpleidea/mgmt (@purpleidea)
#mgmt #homelab #selfhosting #etcd #docker #libvirt #ubuntu #debian
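For anyone hitting the same wall, here is a rough sketch of the flags involved. It assumes a plain-HTTP, test-only setup; the helper name `etcd_listen_args` and the service hostname `etcd` are made up for illustration. `--listen-client-urls` and `--advertise-client-urls` are real etcd flags.

```python
from __future__ import annotations

import json


def etcd_listen_args(advertise_host: str, client_port: int = 2379) -> list[str]:
    """Build an etcd command line that accepts clients on every interface.

    Hypothetical helper for illustration; plain HTTP, so test setups only.
    """
    return [
        "etcd",
        # 0.0.0.0 binds the client listener to all IPv4 interfaces.
        f"--listen-client-urls=http://0.0.0.0:{client_port}",
        # The advertised URL must be an address clients can actually reach,
        # so it uses a hostname rather than 0.0.0.0.
        f"--advertise-client-urls=http://{advertise_host}:{client_port}",
    ]


if __name__ == "__main__":
    # Prints a JSON array, which is also valid YAML for the list form of
    # `command:` in a docker-compose service.
    print(json.dumps(etcd_listen_args("etcd")))
```

The list form of `command:` avoids shell quoting issues, which is often where the override syntax goes wrong.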
The evolution of flow statistics collection at Yandex: architecture, pitfalls, and optimizations
Hi, Habr! This is Sasha Lopintsev, an SRE in the network infrastructure and monitoring development group at Yandex Infrastructure. I really love monitoring, and when it comes to visibility into network traffic, we can't do without analyzing flow data. Today I'll explain how and why we moved from an outdated flow collector to GoFlow2, implemented writing to a database, and used etcd to solve our problems with templates. The new system processes 85 thousand statistics packets per second, provides fault tolerance, and helps build reports. If you'd like to learn a bit more about the architecture, experiments, mistakes, and solutions useful for infrastructure monitoring in production, read on.
etcd and Consul limit value sizes to avoid congestion, but modern data (AI vectors, huge JSON documents) often exceeds those limits. UnisonDB proposes a WAL structured as a backward-linked graph (linked list) that allows large writes without degrading the performance of replication, heartbeats, and leader election. #KV #Database #UnisonDB #etcd #Consul #AI #CôngNghệ
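To give a feel for the limits in question, here is a small sketch of a pre-write size check. It assumes the stock defaults (etcd's `--max-request-bytes` of 1.5 MiB and Consul's 512 KiB KV value limit), both of which operators can change. The function name is invented for illustration, and the check ignores key length and protocol overhead, so it is only an approximation.

```python
# Stock defaults, assumed here; both are configurable by the operator.
ETCD_DEFAULT_MAX_REQUEST_BYTES = 1_572_864   # 1.5 MiB (--max-request-bytes)
CONSUL_DEFAULT_MAX_VALUE_BYTES = 512 * 1024  # 512 KiB KV value limit


def value_fits(value: bytes, limit: int) -> bool:
    """Return True if the value alone is within the given byte limit.

    Approximation: key bytes and request framing are not counted.
    """
    return len(value) <= limit


if __name__ == "__main__":
    embedding_batch = bytes(1024 * 1024)  # a 1 MiB payload as a stand-in
    print("etcd:", value_fits(embedding_batch, ETCD_DEFAULT_MAX_REQUEST_BYTES))
    print("consul:", value_fits(embedding_batch, CONSUL_DEFAULT_MAX_VALUE_BYTES))
```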
Finally completed the upgrade of all five @midgaard #Kubernetes nodes to what I colloquially refer to as midgaard-v3.
Same standard #Hetzner nodes with a couple of #NVMe sticks and around 10TB of spinning metal for bulk storage. Much simplified partition layout. NVMes used for caching. Scheduled backups of #etcd. Continuous SMART disk testing and reporting.
The four other nodes have been running smoothly for about a month so I don't expect any surprises at this point. Distro is #Debian Trixie. Kubernetes is version 1.33.
All is well.
So far.
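For the curious, a scheduled etcd backup can be as small as the sketch below. It is not the actual midgaard setup: the helper name, backup directory, and endpoint are assumptions for illustration. `etcdctl snapshot save` and `--endpoints` are real; a TLS-enabled cluster would also need `--cacert`, `--cert`, and `--key`.

```python
from __future__ import annotations

from datetime import datetime, timezone


def snapshot_command(
    backup_dir: str,
    endpoint: str = "https://127.0.0.1:2379",  # assumed local member
    now: datetime | None = None,
) -> list[str]:
    """Build an `etcdctl snapshot save` command with a UTC-timestamped file name.

    Hypothetical helper; TLS flags are omitted and would be required on a
    cluster that enforces client certificates.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return [
        "etcdctl",
        f"--endpoints={endpoint}",
        "snapshot",
        "save",
        f"{backup_dir}/etcd-{stamp}.db",
    ]


if __name__ == "__main__":
    # Print the command; a cron job or systemd timer could run it on a schedule.
    print(" ".join(snapshot_command("/var/backups")))
```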