How Kubernetes Predictive AutoScaling Works

TechOps Examples

Hey — It's Govardhana MK 👋

Welcome to another technical edition.

Every Tuesday – You’ll receive a free edition with a byte-size use case, remote job opportunities, top news, tools, and articles.

Every Thursday and Saturday – You’ll receive a special edition with a deep dive use case, remote job opportunities and articles.

👋 👋 A big thank you to today's sponsor PERFECTSCALE

Cloud native apps need to scale up and down quickly.

But how do you do this with modern Java?

 You’ll learn how to:

→ Optimize Java apps for Docker and Kubernetes

→ Cut image size, startup time, and deployment lag

→ Fine-tune performance and resource allocation on K8s

IN TODAY'S EDITION

🧠 Use Case
  • How Kubernetes Predictive AutoScaling Works

👀 Remote Jobs

📚️ Resources

If you’re not a subscriber, here’s what you missed last week.

To receive all the full articles and support TechOps Examples, consider subscribing:

One-time 25% OFF on all annual plans of memberships. Closes Soon.

🧠 USE CASE

How Kubernetes Predictive AutoScaling Works

Traditional autoscaling reacts to what has already happened: a surge in requests, CPU usage, or queue length triggers scaling only after the spike. This reactive model is simple, but it falls short when the metrics driving the scaling decision follow strong seasonal patterns.

Kedify’s new predictive scaler adds foresight to the process. By applying time series forecasting models like Prophet, Kedify can anticipate future workload changes and adjust scaling before the load hits your infrastructure. The result: smoother performance, fewer cold starts, and smarter resource utilization.

From Reactive to Predictive Scaling

In the beginning there was the HPA, which reacts to CPU or memory utilization. With traditional KEDA scalers, you can plug in almost any metric (HTTP requests, message queue length, etc.) and react faster than with the classical HPA. That works well for most workloads, but some patterns repeat predictably (daily peaks, weekly cycles, monthly cron jobs). Instead of chasing them, Kedify can learn them.

Once enough historical samples are collected, Kedify trains a forecasting model to predict metric values in the near future. These predictions feed directly into autoscaling decisions.

Common forecasting methods considered include:

  • Moving average and exponential smoothing

  • ARIMA

  • Facebook Prophet

  • LSTM neural networks

  • Holt-Winters’ method

  • etc.

Among them, Prophet stands out for its robustness, interpretability, and ability to handle seasonality without heavy tuning.
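
To make the idea concrete, here is a minimal, illustrative Prophet sketch (this is not Kedify's internal code; the file name, sampling interval, and horizon are assumptions):

# Fit Prophet on historical metric samples and forecast the near future.
# Assumes a CSV with Prophet's expected columns: ds (timestamp) and y (metric value).
import pandas as pd
from prophet import Prophet

history = pd.read_csv("queue_depth.csv")  # hypothetical per-minute export of the metric

model = Prophet(
    daily_seasonality=True,
    weekly_seasonality=True,
    changepoint_prior_scale=0.05,  # the same knob the MetricPredictor exposes below
)
model.fit(history)

# Predict the next 5 minutes, one sample per minute (the "defaultHorizon" idea).
future = model.make_future_dataframe(periods=5, freq="min")
forecast = model.predict(future)

# yhat is the point forecast; yhat_lower / yhat_upper are the uncertainty bounds.
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(5))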

How the Predictive Model Works

The predictive system continuously collects metric data through Kedify’s integration with KEDA. Once enough data accumulates, the model enters a periodic training and evaluation cycle to maintain accuracy as patterns evolve.

During retraining:

  1. The dataset is split into a train set (90%) and a test set (10%).

  2. The model is trained on the train set and evaluated on the test set using the Mean Absolute Percentage Error (MAPE) metric.

  3. The acceptable MAPE threshold can be configured in the kedify-predictive trigger via modelMapeThreshold.

  4. If the model’s error exceeds the threshold, Kedify automatically returns a default value (also defined in the trigger) instead of an unreliable prediction.

This ensures the scaler remains stable and trustworthy even if patterns change or the data becomes noisy. Scaling modifiers can be used to ignore the scaler when it returns the default value. Don't expect the impossible, though: not all data exhibits seasonal patterns, and the general ML rule of thumb "garbage in, garbage out" applies here as well.
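
A rough sketch of that train/evaluate/fall-back loop, with hypothetical helper names (this mirrors the logic described above, not Kedify's actual implementation):

import numpy as np

def mape(actual, predicted):
    # Mean Absolute Percentage Error, in percent (assumes non-zero actuals).
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

def predict_or_default(samples, train_and_predict, mape_threshold=40.0, default_value=1.0):
    # 90% of the history trains the model, the last 10% scores it.
    split = int(len(samples) * 0.9)
    train, test = samples[:split], samples[split:]
    predictions = train_and_predict(train, horizon=len(test))  # hypothetical callable
    if mape(test, predictions) > mape_threshold:    # mirrors modelMapeThreshold
        return default_value                        # mirrors highMapeDefaultReturnValue
    return train_and_predict(samples, horizon=1)[0] # otherwise use the near-future forecast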

In the visualization below, the light-blue shadow represents the model’s uncertainty bounds. When data collection was interrupted for several hours, the Prophet model still maintained a plausible fit. The widening confidence interval signals increasing uncertainty.

Explainability and Transparency

Although forecasting models can feel like black boxes, Kedify keeps them interpretable. Each model can be decomposed into seasonal and trend components (weekly, monthly, or custom), allowing visual inspection of whether the detected patterns make sense.

This decomposition helps operators trust (or challenge) predictions and fine-tune seasonality parameters as needed.
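
With Prophet, this kind of inspection is a one-liner. As an illustration, reusing the model and forecast objects from the earlier sketch:

# Plot the learned trend plus weekly and daily seasonality panels for a sanity check.
fig = model.plot_components(forecast)
fig.savefig("components.png")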

Introducing the MetricPredictor CRD

Kedify introduces a new Custom Resource Definition: MetricPredictor. It represents the forecasting model and its lifecycle.

apiVersion: kedify.io/v1alpha1
kind: MetricPredictor
metadata:
  name: rabbit
spec:
  source:
    active: true
    keda:
      kind: scaledobject
      name: my-so
      triggerName: rabbit
    retention: 6mo 
  model:
    type: Prophet
    defaultHorizon: 5m 
    retrainInterval: 6h 
    prophetConfig:
      holidays:
        countryCode: US 
        strength: 10 
      seasonality:
        yearly: "true"
        weekly: "true"
        daily: "true"
        custom:
          name: twoHours
          period: 2h
          fourierOrder: 10
      changepointPriorScale: 0.05

Key points:

  • source defines where metrics come from — either a live ScaledObject or a one-shot CSV for testing or bootstrapping the model with existing data.

  • model defines the forecasting method (Prophet by default) and training cadence.

  • Kedify retrains models automatically on schedule.

Using Predictions in ScaledObjects

Once a MetricPredictor is created and trained, its predictions can be referenced directly in a ScaledObject trigger. This allows Kedify to scale based on forecasted metrics alongside real-time ones.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-so
spec:
  triggers:
    - name: rabbit
      type: rabbitmq
      metadata:
        queueName: tasks
        value: "1"
    - name: rabbitFromNearFuture
      type: kedify-predictive
      metadata:
        modelName: default*rabbit
        modelMapeThreshold: "40"
        highMapeDefaultReturnValue: "1"
      targetValue: "1"
  advanced:
    scalingModifiers:
      formula: "(rabbit + rabbitFromNearFuture)/2"
      target: "1"
      metricType: "AverageValue"
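
A quick back-of-the-envelope example of what that formula does, with assumed metric values:

import math

rabbit = 10                   # live queue length from the rabbitmq trigger
rabbit_from_near_future = 30  # forecast returned by the kedify-predictive trigger
target = 1                    # target from the scalingModifiers block (AverageValue)

composite = (rabbit + rabbit_from_near_future) / 2  # the formula above -> 20.0
desired_replicas = math.ceil(composite / target)    # roughly how HPA sizes the workload -> 20

So instead of sizing for the 10 messages currently queued, the deployment is pushed toward the 30 that are expected shortly, landing at about 20 replicas before the spike arrives.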

Benefits of Predictive Scaling

  • Reduced reaction lag — the cluster scales before spikes hit.

  • Improved UX — fewer cold starts and latency dips.

  • Explainable models — operators can inspect trends and patterns.

  • Robustness — automatic fallback to defaults when model confidence drops.

  • Continuous learning — retraining keeps models aligned with current traffic.

  • Stability - modelMapeThreshold and highMapeDefaultReturnValue provide safety nets against poor model performance.

Get Started

🔴 Get my DevOps & Kubernetes ebooks! (free for Premium Club and Personal Tier newsletter subscribers)

Looking to promote your company, product, service, or event to 57,000+ DevOps and Cloud Professionals? Let's work together.