Kubernetes Autoscaling - HPA, Cluster Autoscaler

TechOps Examples

Hey — It's Govardhana MK 👋

Welcome to another technical edition.

Every Tuesday – You’ll receive a free edition with a byte-size use case, remote job opportunities, top news, tools, and articles.

Every Thursday and Saturday – You’ll receive a special edition with a deep dive use case, remote job opportunities and articles.

IN TODAY'S EDITION

🧠 Use Case
  • Kubernetes Autoscaling - HPA, Cluster Autoscaler

👀 Remote Jobs

📚️ Resources

🧠 USE CASE

Kubernetes Autoscaling - HPA, Cluster Autoscaler

Applications rarely receive a constant workload. One hour they handle thousands of requests, the next hour only a few. Kubernetes addresses this with autoscaling, which automatically adjusts compute resources based on demand.

Autoscaling in Kubernetes happens at multiple levels:

  • Pod level scaling: Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA)

  • Node level scaling: Cluster Autoscaler (CA)

Today we focus on the two most commonly used in production, HPA and Cluster Autoscaler, and how they work together.

Horizontal Pod Autoscaler (HPA)

HPA adjusts the number of Pod replicas in a scalable workload, such as a Deployment or StatefulSet, based on observed load metrics.

  1. The Metrics Server collects CPU and memory usage from running Pods. Custom or external metrics come from a separate metrics adapter, such as Prometheus Adapter.

  2. HPA periodically (every 15 seconds by default) compares these values against the configured target.

  3. When the average metric exceeds the target, HPA increases the replica count.

  4. The Deployment controller creates new Pods to match the desired state.

  5. When load decreases, HPA reduces the replica count, after a stabilization window (5 minutes by default) that prevents rapid back-and-forth scaling.
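
The replica calculation behind steps 2 to 5 can be sketched in a few lines. This is a minimal illustration of the formula given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with the default 10% tolerance. The function name, parameter names, and default bounds are assumptions made for this example, and the real controller also handles unready Pods, missing metrics, and stabilization.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the HPA replica formula (names are illustrative)."""
    ratio = current_value / target_value
    # Inside the tolerance band the controller leaves the count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * current_value / target_value)
    # Clamp to the minReplicas / maxReplicas bounds set on the HPA object.
    return max(min_replicas, min(max_replicas, desired))


# 4 replicas averaging 90% CPU against a 60% target
print(desired_replicas(4, 90, 60))   # 6
# 10 replicas averaging 30% CPU against a 60% target
print(desired_replicas(10, 30, 60))  # 5
```

Because the result is rounded up, HPA tends to err on the side of one extra Pod rather than one too few.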

HPA scales the application horizontally: more Pods under load, fewer when idle. It manages only the replica count, not cluster capacity. If no node has room for the new Pods, they stay in the Pending state.

Cluster Autoscaler (CA)

Cluster Autoscaler monitors the scheduling status of Pods. When Pods remain Pending due to insufficient cluster resources, CA requests new nodes from the infrastructure provider.

  1. A Pod cannot be scheduled because no node has enough unreserved CPU or memory to satisfy its resource requests.

  2. CA detects this condition and increases the node count in the node group, such as AWS Auto Scaling Group, Azure VMSS, or GCP MIG.

  3. The cloud provider provisions a new node.

  4. Kubernetes schedules the Pending Pods on the new node.

  5. When a node stays underutilized for a defined period (10 minutes by default) and its Pods can be rescheduled elsewhere, CA drains and removes it.
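
To make the scale-up decision in steps 1 and 2 concrete, here is a toy sketch. It is not Cluster Autoscaler's actual algorithm, which simulates the scheduler against node group templates. It only assumes a simple first-fit packing over CPU and memory requests, and every name in it (`Resources`, `fits`, `nodes_to_add`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpu_m: int   # CPU in millicores
    mem_mi: int  # memory in MiB


def fits(pod, free):
    """True if the pod's requests fit into the free capacity."""
    return pod.cpu_m <= free.cpu_m and pod.mem_mi <= free.mem_mi


def nodes_to_add(pending_pods, free_per_node, new_node_capacity):
    """Count new nodes needed by a first-fit packing (toy model)."""
    # Copy so the caller's view of free capacity is not mutated.
    free = [Resources(n.cpu_m, n.mem_mi) for n in free_per_node]
    added = 0
    for pod in pending_pods:
        target = next((n for n in free if fits(pod, n)), None)
        if target is None:
            if not fits(pod, new_node_capacity):
                continue  # adding nodes of this type would not help
            target = Resources(new_node_capacity.cpu_m,
                               new_node_capacity.mem_mi)
            free.append(target)
            added += 1
        target.cpu_m -= pod.cpu_m
        target.mem_mi -= pod.mem_mi
    return added


pending = [Resources(1000, 2048) for _ in range(3)]
existing_free = [Resources(500, 1024)]
print(nodes_to_add(pending, existing_free, Resources(2000, 4096)))  # 2
```

The sketch works on requests, not live usage, which mirrors how Pending Pods arise: the scheduler places Pods by their requests, so a node can look idle and still be "full".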

Cluster Autoscaler works to provide enough capacity for the Pods that workload controllers request, within the node group's configured size limits. Together, HPA and CA keep workloads responsive and infrastructure cost efficient.

Production Notes

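  • Metrics Server is required for HPA on CPU or memory. Without it, HPA cannot read resource metrics and makes no scaling decisions. Custom and external metrics need their own adapter.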

  • HPA responds faster than CA. Node provisioning can take a few minutes depending on the cloud provider.

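  • Set resource requests accurately. HPA computes utilization as a percentage of each Pod's requests, not its limits, so inaccurate requests distort scaling decisions. Cluster Autoscaler also works from requests when deciding whether Pods fit.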
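
A quick calculation shows why requests matter so much. HPA utilization targets are measured against the container's requested CPU, so the same real usage can read as high or low depending on the request. The helper below is a hypothetical illustration, and the 60% target mentioned in the comments is an assumed value.

```python
def cpu_utilization_percent(usage_m, request_m):
    """Utilization as HPA sees it: usage relative to the request."""
    return 100 * usage_m / request_m


# Same Pod, same 400m of real CPU usage, two different requests.
print(cpu_utilization_percent(400, 500))   # 80.0 -> above a 60% target, scales up
print(cpu_utilization_percent(400, 1000))  # 40.0 -> below a 60% target, no scale up
```

An oversized request hides real pressure from HPA, while an undersized one triggers scaling too early.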
