TechOps Examples

Understanding Kubernetes etcd

Govardhana M K
September 02, 2025

Hey — It's Govardhana MK 👋

Welcome to another technical edition.

Every Tuesday – You’ll receive a free edition with a byte-size use case, remote job opportunities, top news, tools, and articles.

Every Thursday and Saturday – You’ll receive a special edition with a deep dive use case, remote job opportunities and articles.

IN TODAY'S EDITION

🧠 Use Case

Understanding Kubernetes etcd

👀 Remote Jobs

Lendable is hiring a Platform Engineer
Remote Location: Worldwide
Effectual is hiring a Senior Cloud Architect
Remote Location: Worldwide

📚️ Resources

Shift-Left Testing with Testcontainers

Advanced Kubernetes Pod Concepts Every DevOps Engineer Should Know

Designing a multi-tenant GKE platform for Yahoo Mail's migration journey

🛠️ TOOL OF THE DAY

chartdb - Database diagrams editor that allows you to visualize and design your DB with a single query.

🧠 USE CASE

Understanding Kubernetes etcd

In Kubernetes, etcd is a distributed, consistent key-value store that holds the entire state of the cluster. Every Kubernetes component relies on etcd, through the API server, to know what should be running and what is actually running.

What etcd Stores

Pod specifications, Deployments, Services, ConfigMaps, Secrets, and other Kubernetes resources.
Status information such as pod conditions, node health, and the running state of workloads.
Policies and cluster-wide metadata, including RBAC roles, quotas, and namespaces.
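
To make the list above concrete, here is a minimal sketch of how such objects are commonly keyed inside etcd. It assumes the API server's default /registry prefix and the common /registry/<resource>/<namespace>/<name> layout. The helper registry_key and the object names are hypothetical, and some resources deviate from this pattern, so treat it as an illustration rather than a specification.

```python
from typing import Optional


def registry_key(resource: str, name: str, namespace: Optional[str] = None,
                 prefix: str = "/registry") -> str:
    """Build an etcd key for a Kubernetes object (illustrative layout only)."""
    parts = [prefix.rstrip("/"), resource]
    if namespace is not None:
        # Namespaced objects carry the namespace as a path segment;
        # cluster-scoped objects (for example namespaces) do not.
        parts.append(namespace)
    parts.append(name)
    return "/".join(parts)


# Hypothetical objects, used only to show the resulting key shape.
print(registry_key("pods", "web-0", namespace="default"))
print(registry_key("namespaces", "default"))
```

Because related objects share a key prefix, everything of one kind in one namespace can be read with a single prefix query.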

Role in Pod Lifecycle

When you create a Pod using kubectl, the API server validates the request and writes the Pod specification into etcd. The scheduler then assigns a node and updates etcd with this decision.

The kubelet on the chosen node reads the assigned Pod and reports its status back through the API server, which again updates etcd.

Desired state and observed state are continuously reconciled against the data in etcd.
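
The write path described above can be sketched as a toy simulation. This is not how Kubernetes is implemented: FakeEtcd, FakeAPIServer, schedule, and kubelet_report are hypothetical stand-ins, the store is a plain dictionary, and values are JSON purely for readability. The point it illustrates is that the scheduler and the kubelet never touch the store themselves; every read and write goes through the API server object.

```python
import json


class FakeEtcd:
    """In-memory stand-in for etcd: a key-value map plus a revision counter."""

    def __init__(self):
        self.data = {}
        self.revision = 0

    def put(self, key, value):
        self.revision += 1                  # every write bumps the revision
        self.data[key] = json.dumps(value)  # values are stored as opaque text

    def get(self, key):
        return json.loads(self.data[key])


class FakeAPIServer:
    """The only class that talks to the store, mirroring the API server's role."""

    def __init__(self, store):
        self.store = store

    @staticmethod
    def _key(namespace, name):
        return f"/registry/pods/{namespace}/{name}"

    def create_pod(self, namespace, name, spec):
        if "containers" not in spec:        # stand-in for request validation
            raise ValueError("pod spec needs containers")
        pod = {"spec": dict(spec), "status": {"phase": "Pending"}}
        self.store.put(self._key(namespace, name), pod)

    def get_pod(self, namespace, name):
        return self.store.get(self._key(namespace, name))

    def update_pod(self, namespace, name, pod):
        self.store.put(self._key(namespace, name), pod)


def schedule(api, namespace, name, node):
    """Scheduler stand-in: record the node assignment via the API server."""
    pod = api.get_pod(namespace, name)
    pod["spec"]["nodeName"] = node
    api.update_pod(namespace, name, pod)


def kubelet_report(api, namespace, name, phase):
    """Kubelet stand-in: report observed status via the API server."""
    pod = api.get_pod(namespace, name)
    pod["status"]["phase"] = phase
    api.update_pod(namespace, name, pod)


etcd = FakeEtcd()
api = FakeAPIServer(etcd)
api.create_pod("default", "web-0", {"containers": [{"name": "web", "image": "nginx"}]})
schedule(api, "default", "web-0", "node-a")
kubelet_report(api, "default", "web-0", "Running")

final = api.get_pod("default", "web-0")
print(final["spec"]["nodeName"], final["status"]["phase"], etcd.revision)
```

Each of the three steps (create, schedule, status report) ends up as one write to the store, which is why the revision counter reads 3 at the end.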

❝

Every change in the cluster is written to etcd through the API server. Controllers and kubelets do not talk to each other directly; they interact only with the API server.

High Availability and etcd

In production, etcd is run as a cluster to ensure fault tolerance. There are two common setups:

Stacked etcd cluster - etcd instances run on the same nodes as the Kubernetes control plane components. This setup is simple but offers less resilience, because losing one node takes out both a control plane instance and an etcd member.

This is generally suitable for smaller environments or development clusters where ease of setup and management is prioritized over high availability.

External etcd cluster - etcd runs on dedicated nodes separate from the control plane.

This setup enhances resilience and fault tolerance, as failures in the control plane do not directly impact etcd, and vice versa.

It provides a higher level of availability, making it the preferred choice for production environments where maintaining cluster stability is crucial.
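
Fault tolerance in either setup comes down to simple arithmetic. etcd uses Raft consensus, so the cluster can accept writes only while a majority (quorum) of its members is reachable. The sketch below works that out for a few cluster sizes; the function names quorum and failure_tolerance are hypothetical helpers, not part of any etcd tooling.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def failure_tolerance(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


# Columns: cluster size, quorum, tolerated failures
for n in (1, 3, 4, 5):
    print(n, quorum(n), failure_tolerance(n))
```

A four-member cluster tolerates one failure, the same as a three-member cluster, which is why odd sizes such as 3 or 5 are the usual choice.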

Why etcd is Critical

If etcd is lost, the cluster loses its memory. Control plane components cannot function without it. This is why backups of etcd are essential for disaster recovery.

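etcd ships with the etcdctl command-line tool for this: etcdctl snapshot save takes a backup of the data, and etcdctl snapshot restore (etcdutl snapshot restore in newer etcd releases) restores it.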

Check out the etcd-backup-restore GitHub repo, a collection of components to back up and restore the etcd of a Kubernetes cluster.
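
As a rough sketch of what a scripted backup could look like, the snippet below builds an etcdctl snapshot save command and, optionally, runs it. The endpoint, the certificate paths (typical of a kubeadm-built control plane node), and the backup directory are all assumptions, and snapshot_command and take_snapshot are hypothetical helpers; adjust everything to your own cluster before relying on it.

```python
import os
import subprocess
from datetime import datetime, timezone


def snapshot_command(snapshot_path, endpoint, cacert, cert, key):
    """Return the etcdctl argument list for saving a snapshot."""
    return [
        "etcdctl",
        f"--endpoints={endpoint}",
        f"--cacert={cacert}",   # CA that signed the etcd server certificate
        f"--cert={cert}",       # client certificate
        f"--key={key}",         # client private key
        "snapshot", "save", snapshot_path,
    ]


def take_snapshot(cmd):
    """Run the command; raises CalledProcessError if etcdctl fails."""
    env = dict(os.environ, ETCDCTL_API="3")  # use the v3 API explicitly
    subprocess.run(cmd, env=env, check=True)


# Timestamped file name so successive backups do not overwrite each other.
stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
cmd = snapshot_command(
    snapshot_path=f"/var/backups/etcd-{stamp}.db",   # assumed backup location
    endpoint="https://127.0.0.1:2379",               # assumed local etcd endpoint
    cacert="/etc/kubernetes/pki/etcd/ca.crt",        # assumed kubeadm-style paths
    cert="/etc/kubernetes/pki/etcd/server.crt",
    key="/etc/kubernetes/pki/etcd/server.key",
)
print(" ".join(cmd))

# Uncomment on a host that has etcdctl installed and can reach etcd:
# take_snapshot(cmd)
```

The command is built separately from its execution so it can be printed and reviewed first. A backup is only useful if restoring from it has been tested.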

Key Takeaways

etcd is the single source of truth for Kubernetes.
Every cluster action goes through the API server and is persisted into etcd.
Running etcd in HA mode and backing it up regularly is mandatory for production clusters.
