How to Back Up and Restore Amazon EKS Cluster Resources Using Velero

TechOps Examples

Hey — It's Govardhana MK 👋

Welcome to another technical edition.

🧠 DEEP DIVE USE CASE

Production Kubernetes clusters hold two categories of things that are hard to rebuild. The first is configuration: every Deployment, Service, ConfigMap, Secret, RBAC binding, and Ingress rule your team has written over months of iteration. The second is data: the persistent volumes attached to your stateful workloads, the databases, the file stores, the queues. When a namespace gets accidentally deleted, when a cluster upgrade goes wrong, or when you need to migrate workloads to a new cluster in a different region, you need both back. Velero gives you both.

What is Velero?

Velero is an open source backup and restore tool for Kubernetes clusters. It protects two things simultaneously: the Kubernetes object definitions stored in the API server (Deployments, Services, ConfigMaps, RBAC rules, PersistentVolumeClaims), and the actual data inside your persistent volumes (your database files, application uploads, log archives).

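What makes Velero different from traditional backup tools is that it works through the Kubernetes API. It does not need to SSH into nodes or be pointed at individual disks. It runs as a Deployment inside your cluster, watches for Backup and Restore custom resources, and uses the API server to discover and capture everything. The only external systems it talks to are the object store that holds the backups and the cloud provider's snapshot API. This Kubernetes-native approach makes the backed-up resource definitions portable across Kubernetes distributions. A backup from an EKS cluster can be restored to a GKE cluster or an on-premises cluster running the same or a compatible Kubernetes version, provided the target cluster serves the API versions used by the backed-up resources. Volume snapshots are the exception: an EBS snapshot is tied to AWS, so moving volume data to another provider requires Velero's file system backup instead of native snapshots.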

How Velero works

Every Velero operation, whether a backup, a restore, or a scheduled backup, is a Kubernetes Custom Resource stored in etcd. Velero ships with controllers that watch for these custom resources and act on them. This is the same pattern used by the Kubernetes Deployment controller or the ReplicaSet controller. The architecture is native Kubernetes, not a separate system bolted on.
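
To make the custom resource idea concrete, here is a minimal Python sketch that builds the kind of Backup object the controllers watch for. The helper name build_backup_manifest is hypothetical, and the velero namespace and the 720h TTL are assumptions based on a default installation. The field names (includedNamespaces, ttl) come from the velero.io/v1 Backup spec.

```python
import json


def build_backup_manifest(name, namespaces, ttl="720h0m0s", velero_namespace="velero"):
    """Return a Backup custom resource as a plain dict.

    Assumes Velero is installed in the `velero` namespace and that a
    30-day retention (720h) is acceptable; adjust both for your setup.
    """
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Backup",
        "metadata": {"name": name, "namespace": velero_namespace},
        "spec": {
            # Limit the backup scope to the listed namespaces.
            "includedNamespaces": list(namespaces),
            # How long Velero keeps the backup before garbage-collecting it.
            "ttl": ttl,
        },
    }


manifest = build_backup_manifest("backup-myprimary", ["myprimary"])
print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it with kubectl should have the same effect as creating the backup through the Velero CLI, since both paths end in a Backup object in the API server.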

The Backup workflow

When you run velero backup create backup-myprimary --include-namespaces myprimary, five things happen in sequence. (Without the --include-namespaces flag, the backup would cover every namespace in the cluster.) The Velero CLI makes an HTTP POST to the Kubernetes API server, creating a Backup custom resource object. The BackupController, a reconciliation loop running inside the Velero deployment pod, notices the new Backup object and validates its spec. It then queries the API server for all resources matching the backup scope, in this case everything in the myprimary namespace. It serializes those resources as JSON files into a compressed tarball and uploads it to the configured Amazon S3 bucket. During the same run, it calls the CSI snapshot API to create EBS snapshots of every PersistentVolume in scope. When both operations complete, the Backup object's phase is updated to Completed in etcd.
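
The serialization step can be illustrated with a small Python sketch that packs resource definitions into a gzipped tarball in memory. This is not Velero's code: the function names, the sample resources, and the naive pluralization of the kind are all assumptions for illustration, and the path layout only approximates how Velero organizes its archive.

```python
import io
import json
import tarfile


def pack_resources(resources):
    """Serialize each resource to JSON and pack it into a gzipped tarball."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for res in resources:
            meta = res["metadata"]
            # Simplified path; real plural resource names come from API discovery.
            path = (
                f"resources/{res['kind'].lower()}s/"
                f"namespaces/{meta['namespace']}/{meta['name']}.json"
            )
            payload = json.dumps(res).encode("utf-8")
            info = tarfile.TarInfo(name=path)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_members(archive_bytes):
    """Return the file names stored in a gzipped tarball."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        return tar.getnames()


# Hypothetical resources standing in for what the API server would return.
sample = [
    {"apiVersion": "apps/v1", "kind": "Deployment",
     "metadata": {"name": "web", "namespace": "myprimary"}},
    {"apiVersion": "v1", "kind": "ConfigMap",
     "metadata": {"name": "web-config", "namespace": "myprimary"}},
]

archive = pack_resources(sample)
names = list_members(archive)
print(names)
```

The resulting bytes are what would be uploaded to the bucket as a single object, which is why one backup can be moved or copied between clusters as a unit.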

One important characteristic of Velero backups is that they are not strictly atomic. If a resource is being created or modified at the exact moment the backup controller queries the API server, it may not be captured. For stateful databases that write continuously, Velero supports backup hooks: a pre-backup hook runs a command inside the workload's container, for example to instruct the database to flush in-memory buffers to disk before the snapshot is taken.
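
One way to declare a pre-backup hook is through pod annotations. The sketch below builds that annotation set in Python. The annotation keys are the ones Velero documents for pre-backup hooks, while the helper name, the postgres container name, and the CHECKPOINT command are hypothetical placeholders to be replaced with whatever flush or freeze command your database needs.

```python
import json


def pre_backup_hook_annotations(container, command, on_error="Fail", timeout="30s"):
    """Build the pod annotations that declare a Velero pre-backup hook.

    `command` is a list of arguments; Velero expects it encoded as a
    JSON array string in the annotation value.
    """
    if on_error not in ("Fail", "Continue"):
        raise ValueError("on_error must be 'Fail' or 'Continue'")
    return {
        "pre.hook.backup.velero.io/container": container,
        "pre.hook.backup.velero.io/command": json.dumps(command),
        "pre.hook.backup.velero.io/on-error": on_error,
        "pre.hook.backup.velero.io/timeout": timeout,
    }


# Hypothetical example: ask a PostgreSQL container to checkpoint before the snapshot.
hook_command = ["/bin/sh", "-c", "psql -U postgres -c CHECKPOINT"]
annotations = pre_backup_hook_annotations("postgres", hook_command)
print(json.dumps(annotations, indent=2))
```

These annotations go under the pod template's metadata, so every pod created by the workload carries the hook. Setting on_error to Fail makes the backup report an error if the hook command does not succeed, which is usually the safer choice for databases.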

S3 is the source of truth. Velero continuously reconciles the Backup custom resources in etcd against the objects it finds in the S3 bucket. If a Backup resource exists in Kubernetes but the corresponding tarball is gone from S3, Velero deletes the resource. If a backup tarball exists in S3 but no Backup resource exists (such as after migrating to a new cluster), Velero creates the resource automatically. This reconciliation is what makes cross-cluster restore work: once Velero on the new cluster is configured against the same bucket, the existing backups appear there without anyone recreating them by hand.
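
The reconciliation rule reduces to two set differences. The toy Python sketch below shows the decision only, not Velero's implementation: plan_sync and the backup names are made up for illustration, and it ignores details such as backup phase that the real controller also considers.

```python
def plan_sync(cluster_backups, bucket_backups):
    """Decide which Backup resources to create or delete in the cluster.

    The bucket is treated as the source of truth: anything only in the
    bucket gets a resource created, anything only in the cluster is removed.
    """
    cluster = set(cluster_backups)
    bucket = set(bucket_backups)
    return {
        "create": sorted(bucket - cluster),
        "delete": sorted(cluster - bucket),
    }


# Hypothetical state: a fresh cluster that knows one stale backup,
# pointed at a bucket holding two backups from the old cluster.
plan = plan_sync(
    cluster_backups=["backup-stale"],
    bucket_backups=["backup-myprimary", "backup-nightly"],
)
print(plan)
```

Run against an empty cluster, the delete list is empty and every backup in the bucket lands in the create list, which matches the migration case described above.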
