#agenix decrypts the .age file → feeds the #secretbox key to kube-apiserver → apiserver uses it for etcd. The failure happened at the agenix layer (wrong key in the .age file), not in secretbox itself.
RBAC defeats: A compromised pod, a stolen kubeconfig, a rogue user — anyone who tries to read secrets through the Kubernetes API without sufficient permissions. They hit the apiserver, RBAC says no, they get a 403.
secretbox defeats: Someone who bypasses the API entirely — steals the etcd data directory, takes an etcd snapshot from a backup, reads etcd directly over its client port without going through kube-apiserver. RBAC never runs in this scenario because the attacker never talked to kube-apiserver.
The critical insight: secretbox does nothing if the attacker has API access, and RBAC does nothing if the attacker has disk access. They cover completely non-overlapping attack surfaces.
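To make the pipeline concrete: the secretbox key lands in kube-apiserver's EncryptionConfiguration. A minimal sketch of what that file might look like (file path, key name, and placeholder secret are illustrative, not taken from the setup described above):

```yaml
# /etc/kubernetes/encryption-config.yaml (illustrative path)
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      # secretbox encrypts Secrets before kube-apiserver writes them to etcd
      - secretbox:
          keys:
            - name: key1                    # placeholder name
              secret: <base64 32-byte key>  # the value agenix decrypts and feeds in
      # identity lets the apiserver still read entries written before encryption was enabled
      - identity: {}
```

kube-apiserver picks this up via `--encryption-provider-config`. If the wrong key lands in that `secret` field, existing Secrets in etcd can no longer be decrypted, even though secretbox itself works fine.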
The problem hit here would have been identical with #SQLite — the encryption layer is in kube-apiserver, not in the storage backend. But the operational simplicity of SQLite would have made recovery easier, since inspecting and backing up the database is much more straightforward than #etcd snapshot management.
#kubernetes
troubles of secret #provisioning with #agenix and age
with #etcd secretbox
and why #sops matters

etcd operator 0.2 has been released!

https://etcd.io/blog/2026/announcing-etcd-operator-v0.2.0/

This now makes the operator useful for production, or at least staging, use-cases, and brings it up to the functionality of the old operator -- plus better handling of TLS.

Take a look!

#kubernetes #etcd #operators

Announcing etcd-operator v0.2.0

Introduction: Today, we are excited to announce the release of etcd-operator v0.2.0! This release brings important new features and improvements that enhance security, reliability, and operability for managing etcd clusters.

New Features — Certificate Management: Version 0.2.0 introduces built-in certificate management to secure all TLS communication:

- between etcd members (inter-member communication)

- between clients and etcd members

TLS is only configured when explicitly enabled by the user. Once enabled, etcd-operator automatically provisions and manages certificates based on the selected provider.

etcd

⚠️ NEW: Kubernetes Swap & etcd Stability!

Prevent control plane hangs with proper swap configuration. etcd performance tuning & swapfile best practices for production K8s.

📖 Read: https://devopstales.github.io/kubernetes/k8s-swap-etcd-stability/?utm_source=twitter&utm_medium=social

#Kubernetes #etcd #Performance #SRE #K8s

Kubernetes Swap and etcd Stability: Preventing Control Plane Hangs

When enabling swap on Kubernetes nodes, you might encounter a critical issue where misbehaving containers don’t get killed automatically. When this affects etcd, the API server ends up generating excessive load and consuming all available resources. This post explains the problem and provides two solutions.

DevOpsTales
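One of the fixes the article points at is resource limits on etcd. A minimal sketch of what a memory limit on the etcd static pod could look like — the values here are hypothetical, not taken from the post:

```yaml
# /etc/kubernetes/manifests/etcd.yaml (excerpt; values are hypothetical)
spec:
  containers:
    - name: etcd
      resources:
        requests:
          cpu: "100m"
          memory: "512Mi"
        limits:
          memory: "2Gi"  # cap memory so a misbehaving etcd is killed instead of swapping and hanging
```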

Swap on K8s nodes? Containers hang instead of OOM-killing—etcd suffers, control plane cascades. 2 fixes: resource limits + etcd HAProxy LB. Protect your cluster! 👇

https://devopstales.github.io/kubernetes/k8s-swap-etcd-stability/

#Kubernetes #etcd #SRE #DevOps #CloudNative #Stability


What reason would there be to separate etcd out of the Kubernetes manifests of the control plane nodes, but keep it as a native service installed on the same machines running the control plane?

There's nothing to gain in terms of high availability there.

You still have X amount of control plane nodes that also run etcd as cluster nodes.

I'm trying to figure out what my predecessor thought while building this Kubernetes environment.

The two etcd topologies mentioned in official K8S docs are:

- integrated etcd (etcd as a Kubernetes manifest, started as containers together with coredns, kube-apiserver and so on)

- separated etcd nodes (X amount of machines that host etcd as a native service on the OS, with the control plane configured to use them)
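For reference, the second topology is what kubeadm's external etcd configuration expresses — a minimal sketch, with placeholder endpoints and the standard kubeadm cert paths:

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
etcd:
  external:
    endpoints:                  # placeholder addresses of the external etcd nodes
      - https://10.0.0.10:2379
      - https://10.0.0.11:2379
      - https://10.0.0.12:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
```

In the hybrid setup described above, the endpoints would simply point back at the control plane machines themselves, which is why it buys no extra availability.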

#kubernetes #devop #k8s #etcd

In my (short) dad time this morning, I tried to install mgmt [1] to run a distributed hello world on my main machine running Ubuntu LTS. The built-in binaries depend on augeas, which was easy to fix, but also on libvirt, which is surprisingly old on Ubuntu compared to Debian (latest). I tried to build it myself but couldn't install nex (the lexer). I then built the binary using Docker, thanks to the quick start guide.

I first ran mgmt in standalone mode. It's nice to see etcd embedded in the binary (at least for testing). Then I tried to deploy multiple mgmt nodes with a standalone etcd using docker-compose. I lost a lot of time trying to override the command because I didn't remember the expected syntax.

I was trying to make etcd listen to all interfaces so mgmt could connect when my daughter showed up.
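For anyone hitting the same wall, a minimal docker-compose sketch of that command override, making etcd listen on all interfaces (image tag and service names are illustrative):

```yaml
services:
  etcd:
    image: quay.io/coreos/etcd:v3.5.0  # illustrative tag
    command:
      - etcd
      - --name=etcd0
      - --listen-client-urls=http://0.0.0.0:2379   # listen on all interfaces
      - --advertise-client-urls=http://etcd:2379   # address other containers should use
    ports:
      - "2379:2379"
```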

[1] https://github.com/purpleidea/mgmt (@purpleidea)

#mgmt #homelab #selfhosting #etcd #docker #libvirt #ubuntu #debian

Commits · purpleidea/mgmt

Next generation distributed, event-driven, parallel config management! - Commits · purpleidea/mgmt

GitHub

The evolution of flow-statistics collection at Yandex: architecture, pitfalls, and optimizations

Hi, Habr! This is Sasha Lopintsev, an SRE in the network infrastructure and monitoring development group at Yandex Infrastructure. I really love monitoring, and when it comes to network traffic visibility, there is no getting around analyzing flow data. Today I'll explain how and why we migrated from an outdated flow collector to GoFlow2, implemented database writes, and solved our template problems with etcd. The new system processes 85 thousand statistics packets per second, provides fault tolerance, and helps generate reports. If you'd like to learn a bit more about the architecture, experiments, mistakes, and solutions useful for infrastructure monitoring in a production environment, read on.

https://habr.com/ru/companies/yandex/articles/1000520/

#flowметрики #goflow #goflow2 #etcd #ipfix #sflow #netflow

The evolution of flow-statistics collection at Yandex: architecture, pitfalls, and optimizations

Hi, Habr! This is Sasha Lopintsev, an SRE in the network infrastructure and monitoring development group at Yandex Infrastructure. I really love monitoring, and when it comes to network traffic visibility,...

Habr

etcd and Consul limit value sizes to avoid congestion, but modern data (AI vectors, huge JSON blobs) often exceeds those limits. UnisonDB proposes a backward-linked, linked-list-style WAL that allows large writes without degrading replication, heartbeats, or leader election. #KV #Database #UnisonDB #etcd #Consul #AI #CôngNghệ

https://www.reddit.com/r/programming/comments/1qkz5d0/breaking_keyvalue_size_limits_linked_list_wals/

Finally completed the upgrade of all of the five @midgaard #Kubernetes nodes to what I colloquially refer to as midgaard-v3.

Same standard #Hetzner nodes with a couple of #NVMe sticks and around 10TB of spinning metal for bulk storage. Much simplified partition layout. NVMes used for caching. Scheduled backups of #etcd. Continuous SMART disk testing and reporting.

The four other nodes have been running smoothly for about a month so I don't expect any surprises at this point. Distro is #Debian Trixie. Kubernetes is version 1.33.

All is well.

So far.