Kubernetes Autoscaling - HPA, Cluster Autoscaler
Hey — It's Govardhana MK 👋
Welcome to another technical edition.
Every Tuesday – You’ll receive a free edition with a byte-size use case, remote job opportunities, top news, tools, and articles.
Every Thursday and Saturday – You’ll receive a special edition with a deep dive use case, remote job opportunities and articles.
Kubernetes itself still confuses half the world.
Leaders are pushing AI Agents on Kubernetes. Seriously?
Many leadership teams have the ambition but give their teams no hands-on opportunities to learn.
The workshop covers:
Kagent fundamentals and architecture
Full deployment workflow from install to production
ModelConfig setup for AI models
MCP server integration for external connections
CLI mastery for faster workflows
Event driven execution with khook
IN TODAY'S EDITION
🧠 Use Case
Kubernetes Autoscaling - HPA, Cluster Autoscaler
👀 Remote Jobs
Buzz Solutions is hiring an MLOps Engineer
Remote Location: Worldwide
Canonical is hiring a Cloud Engineering Manager
Remote Location: Worldwide
📚️ Resources
🧠 USE CASE
Kubernetes Autoscaling - HPA, Cluster Autoscaler
Applications rarely receive a constant workload. One hour they handle thousands of requests, the next hour only a few. Kubernetes solves this by autoscaling, automatically adjusting compute resources based on demand.
Autoscaling in Kubernetes happens at multiple levels:
Pod level scaling: Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA)
Node level scaling: Cluster Autoscaler (CA)
Today we focus on the two most commonly used in production, HPA and Cluster Autoscaler, and how they work together.
Horizontal Pod Autoscaler (HPA)
HPA adjusts the number of Pod replicas in a workload, such as a Deployment or StatefulSet, based on observed load metrics.

The Metrics Server collects CPU and memory usage from running Pods. Custom and external metrics come from a separate metrics adapter, such as Prometheus Adapter.
HPA periodically checks these values against the defined targets (every 15 seconds by default).
When the average metric exceeds the target, HPA increases the replica count.
The Deployment controller creates new Pods to match the desired state.
When load decreases, HPA scales down by reducing replicas, after a stabilization window (5 minutes by default) that prevents rapid back-and-forth scaling.
HPA ensures the application scales horizontally, creating more Pods for more work and fewer Pods when idle. It only manages Pods, not cluster capacity. If no space is left to schedule new Pods, they enter a Pending state.
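The scale-up and scale-down decisions above follow the replica formula given in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). Here is a minimal Python sketch of that calculation. The function name and the min/max defaults are illustrative assumptions; the 10% tolerance matches the documented default, and the real controller additionally handles missing metrics, unready Pods, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the replica count HPA aims for (illustrative helper, not a real API)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Within the tolerance band around the target, HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * current_metric / target_metric)
    # The result is always clamped to the configured minReplicas/maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    print(desired_replicas(4, 90, 60))  # average CPU at 90% against a 60% target
    print(desired_replicas(4, 62, 60))  # close to target, inside the tolerance band
    print(desired_replicas(4, 15, 60))  # mostly idle
```

With 4 replicas averaging 90% CPU against a 60% target, the formula asks for 6 replicas. At 62% nothing changes, and at 15% the count drops to the minimum.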
Cluster Autoscaler (CA)
Cluster Autoscaler monitors the scheduling status of Pods. When Pods remain Pending due to insufficient cluster resources, CA requests new nodes from the infrastructure provider.

A Pod cannot be scheduled because there is not enough CPU or memory on any node.
CA detects this condition and increases the node count in the node group, such as AWS Auto Scaling Group, Azure VMSS, or GCP MIG.
The cloud provider provisions a new node.
Kubernetes schedules the Pending Pods on the new node.
When nodes stay underutilized for a defined period (10 minutes by default) and their Pods can be rescheduled elsewhere, CA removes them.
Cluster Autoscaler ensures there is always enough capacity for Pods requested by workload controllers. Together, HPA and CA keep workloads responsive and infrastructure cost efficient.
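To make the scale-up step concrete, here is a simplified Python sketch that estimates how many nodes a node group would need to add for a set of Pending Pods. It is only a first-fit approximation under stated assumptions: a single node group with identical nodes, and only CPU and memory requests considered. The `Resources` class and `nodes_to_add` function are hypothetical names, and the real Cluster Autoscaler runs a full scheduler simulation that also accounts for taints, affinity, and multiple node groups.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpu_millicores: int
    memory_mib: int


def fits(pod, free):
    """True if the Pod's requests fit into the given free resources."""
    return (pod.cpu_millicores <= free.cpu_millicores
            and pod.memory_mib <= free.memory_mib)


def nodes_to_add(pending_pods, node_allocatable, max_new_nodes):
    """First-fit packing of Pending Pods onto simulated new nodes."""
    new_nodes = []  # free resources remaining on each simulated new node
    for pod in pending_pods:
        if not fits(pod, node_allocatable):
            continue  # larger than an empty node: adding nodes of this type cannot help
        for free in new_nodes:
            if fits(pod, free):
                free.cpu_millicores -= pod.cpu_millicores
                free.memory_mib -= pod.memory_mib
                break
        else:
            if len(new_nodes) >= max_new_nodes:
                continue  # node group is at its maximum size; the Pod stays Pending
            new_nodes.append(Resources(
                node_allocatable.cpu_millicores - pod.cpu_millicores,
                node_allocatable.memory_mib - pod.memory_mib,
            ))
    return len(new_nodes)


if __name__ == "__main__":
    pending = [Resources(1500, 2048) for _ in range(3)]
    print(nodes_to_add(pending, Resources(4000, 8192), max_new_nodes=5))
```

In this example two Pods fit on the first new node, and the third needs a second node because only 1000 millicores remain on the first.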

Production Notes
Metrics Server is mandatory for HPA scaling on CPU or memory. Without it, HPA has no resource metrics and makes no scaling decisions.
HPA responds faster than CA. Node provisioning can take a few minutes depending on the cloud provider.
Set resource requests accurately. HPA computes utilization as a percentage of each Pod's requests, not its limits, so inaccurate requests skew every scaling decision.
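HPA measures CPU utilization relative to the Pod's CPU request, which is why the request value matters so much. The short Python sketch below shows the effect, assuming all replicas share one request, as Pods of a single Deployment do. The function name and the numbers are illustrative only.

```python
def average_cpu_utilization(usage_millicores, request_millicores):
    """Average CPU usage across Pods as a percentage of the per-Pod CPU request."""
    if not usage_millicores:
        raise ValueError("at least one Pod is required")
    if request_millicores <= 0:
        raise ValueError("Pods without a CPU request have no defined utilization")
    average_usage = sum(usage_millicores) / len(usage_millicores)
    return 100 * average_usage / request_millicores


if __name__ == "__main__":
    usage = [300, 300, 300]  # actual usage per Pod, in millicores
    print(average_cpu_utilization(usage, 500))  # request sized with headroom
    print(average_cpu_utilization(usage, 250))  # request set too low
```

The same three Pods report 60% utilization with a 500m request and 120% with a 250m request. Against a 60% target, the undersized request would trigger a scale-up even though the actual load is identical.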


