Kubernetes Autoscaling - HPA, Cluster Autoscaler

TechOps Examples

Hey — It's Govardhana MK 👋

Welcome to another technical edition.

Every Tuesday – You’ll receive a free edition with a byte-size use case, remote job opportunities, top news, tools, and articles.

Every Thursday and Saturday – You’ll receive a special edition with a deep dive use case, remote job opportunities and articles.

Kubernetes itself still confuses half the world.

Leaders are pushing AI Agents on Kubernetes. Seriously?

Many leadership teams have the ambition but do not give their teams hands-on opportunities to learn.

Luckily, the Kagent community (Kagent is a CNCF project) created a hands-on workshop.

The workshop covers:

Kagent fundamentals and architecture
Full deployment workflow from install to production
ModelConfig setup for AI models
MCP server integration for external connections
CLI mastery for faster workflows
Event driven execution with khook

Learn how to build production ready AI agents on Kubernetes

IN TODAY'S EDITION

🧠 Use Case

Kubernetes Autoscaling - HPA, Cluster Autoscaler

👀 Remote Jobs

Buzz Solutions is hiring an MLOps Engineer
Remote Location: Worldwide
Canonical is hiring a Cloud Engineering Manager
Remote Location: Worldwide

📚 Resources

The 14 hour AWS us-east-1 outage Explained

Manage Multi-Cluster Deployments with ArgoCD

Introducing Merkle Tree Certificates - Keeping the Internet fast and secure

If you’re not a subscriber, here’s what you missed last week.

How Rancher Simplifies Multi Cloud GitOps Kubernetes Management

Understanding Message Based Architectures in AWS

🧠 USE CASE

Kubernetes Autoscaling - HPA, Cluster Autoscaler

Applications rarely receive a constant workload. One hour they handle thousands of requests, the next hour only a few. Kubernetes handles this with autoscaling, which automatically adjusts compute resources based on demand.

Autoscaling in Kubernetes happens at multiple levels:

Pod level scaling: Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA)
Node level scaling: Cluster Autoscaler (CA)

Today we focus on the two most commonly used in production, HPA and Cluster Autoscaler, and how they work together.

Horizontal Pod Autoscaler (HPA)

HPA adjusts the number of Pod replicas in a Deployment (or another scalable workload, such as a StatefulSet) based on observed load metrics.

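The Metrics Server collects CPU and memory usage from running Pods and exposes it through the metrics API. Custom or external metrics need a separate metrics adapter, such as Prometheus Adapter.
The HPA controller checks these values against the configured targets at a regular interval (every 15 seconds by default).
When the average metric exceeds the target, HPA raises the desired replica count in proportion to the overshoot.
The Deployment controller creates new Pods to match the desired state.
When load decreases, HPA scales down by reducing replicas, after a stabilization window (5 minutes by default) that prevents rapid flapping.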
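
The scaling decision in the steps above follows the formula in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). Here is a minimal Python sketch of that calculation. The function name and the example numbers are made up for illustration, and the sketch leaves out details the real controller handles, such as stabilization windows and Pods that are not yet ready.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float,
                     min_replicas: int = 1, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Replica count suggested by the documented HPA formula (simplified sketch)."""
    ratio = current_value / target_value
    # Inside the tolerance band (10% by default) HPA leaves the count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * current_value / target_value)
    # The result is always clamped to the configured min/max replicas.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, 90, 60))   # 90% CPU against a 60% target -> 6
print(desired_replicas(4, 63, 60))   # within tolerance -> stays at 4
print(desired_replicas(6, 30, 60))   # load halved -> 3
```

Note how the count grows in proportion to the overshoot: 50% above target means 50% more replicas.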

HPA ensures the application scales horizontally, creating more Pods for more work and fewer Pods when idle. It only manages Pods, not cluster capacity. If no node has enough free capacity for the new Pods, they stay in the Pending state.
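
To make this concrete, here is a hedged sketch that builds an `autoscaling/v2` HorizontalPodAutoscaler manifest as a Python dictionary and prints it as JSON, which `kubectl` accepts alongside YAML. The Deployment name `web` and the replica and CPU numbers are assumptions chosen for the example, not recommendations.

```python
import json


def hpa_manifest(name: str, min_replicas: int, max_replicas: int,
                 cpu_target_percent: int) -> dict:
    """Build an autoscaling/v2 HPA that targets a Deployment of the same name."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose replica count HPA will manage.
            "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": name},
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    # Average CPU usage as a percentage of the Pods' CPU requests.
                    "target": {"type": "Utilization", "averageUtilization": cpu_target_percent},
                },
            }],
        },
    }


# "web" is a hypothetical Deployment name used only for this example.
manifest = hpa_manifest("web", min_replicas=2, max_replicas=10, cpu_target_percent=60)
print(json.dumps(manifest, indent=2))
```

If you save the script as `hpa.py` (a hypothetical filename), piping its output into `kubectl apply -f -` would create the HPA, assuming a Deployment named `web` already exists in the current namespace.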

Cluster Autoscaler (CA)

Cluster Autoscaler monitors the scheduling status of Pods. When Pods remain Pending due to insufficient cluster resources, CA requests new nodes from the infrastructure provider.

A Pod cannot be scheduled because there is not enough CPU or memory on any node.
CA detects this condition and increases the node count in the node group, such as AWS Auto Scaling Group, Azure VMSS, or GCP MIG.
The cloud provider provisions a new node.
Kubernetes schedules the Pending Pods on the new node.
When a node stays underutilized for a defined period (10 minutes by default) and its Pods can be rescheduled elsewhere, CA drains and removes it.
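
The scale-up side of these steps can be sketched as a simple first-fit simulation: try each Pending Pod on the existing nodes, and count how many new nodes would be needed for the rest. This is a deliberately simplified model. All names and numbers are hypothetical, and the real Cluster Autoscaler also accounts for taints, affinity rules, and multiple node groups.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_free_m: int    # unrequested CPU in millicores
    mem_free_mi: int   # unrequested memory in MiB


@dataclass
class Pod:
    name: str
    cpu_request_m: int
    mem_request_mi: int


def fits(pod: Pod, node: Node) -> bool:
    """A Pod fits when its requests are covered by the node's free capacity."""
    return pod.cpu_request_m <= node.cpu_free_m and pod.mem_request_mi <= node.mem_free_mi


def nodes_to_add(pending, nodes, template_cpu_m, template_mem_mi, max_new_nodes):
    """First-fit estimate of how many new template-sized nodes the Pending Pods need."""
    pool = [Node(n.name, n.cpu_free_m, n.mem_free_mi) for n in nodes]  # work on copies
    added = 0
    for pod in pending:
        target = next((n for n in pool if fits(pod, n)), None)
        if target is None:
            new_node = Node(f"new-{added + 1}", template_cpu_m, template_mem_mi)
            # Respect the node group limit; skip Pods too large for any new node.
            if added >= max_new_nodes or not fits(pod, new_node):
                continue  # this Pod stays Pending
            pool.append(new_node)
            added += 1
            target = new_node
        target.cpu_free_m -= pod.cpu_request_m
        target.mem_free_mi -= pod.mem_request_mi
    return added


existing = [Node("node-a", cpu_free_m=500, mem_free_mi=1024)]
pending = [Pod(f"web-{i}", cpu_request_m=1000, mem_request_mi=2048) for i in range(3)]
print(nodes_to_add(pending, existing, template_cpu_m=2000, template_mem_mi=8192, max_new_nodes=5))  # 2
```

Two Pods share the first new node and the third needs a second one, so the sketch asks for two nodes.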

Cluster Autoscaler adds capacity when Pods requested by workload controllers cannot be scheduled, up to the limits of the node group. Together, HPA and CA keep workloads responsive and infrastructure cost efficient.

Production Notes

Metrics Server is mandatory for HPA on CPU or memory. Without it (or a metrics adapter, if you scale on custom metrics), no scaling decisions are made.
HPA responds faster than CA. Node provisioning can take a few minutes depending on the cloud provider.
Set resource requests accurately. HPA computes utilization as a percentage of each Pod's requests, and CA decides whether a Pod fits on a node from its requests, so wrong requests skew both.
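
A quick sketch shows why requests matter so much for HPA. The usage figures and request sizes below are invented for illustration, and the helper assumes every Pod has the same CPU request.

```python
from typing import List


def cpu_utilization_percent(usage_m: List[int], request_m: int) -> float:
    """Average CPU usage across Pods as a percentage of the per-Pod request."""
    return 100 * sum(usage_m) / (request_m * len(usage_m))


usage = [300, 350, 250]  # measured CPU per Pod, in millicores

print(cpu_utilization_percent(usage, request_m=500))  # 60.0
print(cpu_utilization_percent(usage, request_m=250))  # 120.0
```

The same real load reads as 60% with a 500m request and 120% with a 250m request. With a 60% target, the first case is exactly on target while the second would trigger a scale-up.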
