In partnership with

TechOps Examples

Hey — It's Govardhana MK 👋

Welcome to another technical edition.

Every Tuesday – You’ll receive a free edition with a byte-size use case, remote job opportunities, top news, tools, and articles.

Every Thursday and Saturday – You’ll receive a special edition with a deep dive use case, remote job opportunities, and articles.

👋 👋 A big thank you to today's sponsor WISPR FLOW

Write docs 4x faster. Without hating every second.

Nobody became a developer to write documentation. But the docs still need to get written — PRDs, README updates, architecture decisions, onboarding guides.

Wispr Flow lets you talk through it instead. Speak naturally about what the code does, how it works, and why you built it that way. Flow formats everything into clean, professional text you can paste into Notion, Confluence, or GitHub.

Used by engineering teams at OpenAI, Vercel, and Clay. 89% of messages sent with zero edits. Works system-wide on Mac, Windows, and iPhone.

Looking to promote your company, product, service, or event to 56,000+ Cloud Native Professionals? Let's work together. Advertise With Us

🧠 DEEP DIVE USE CASE

How Kubernetes Manages CPU and Memory Resources

Every engineer who has run Kubernetes in production has hit the same wall. A pod gets OOMKilled. A node goes NotReady because something consumed all its CPU. A deployment that worked fine in staging causes cascading failures in production because nobody set resource requests. These are not edge cases. They are the default outcome when you do not understand how Kubernetes actually thinks about resources.

Kubernetes does not manage CPU and memory the way an operating system does. It manages them through a layered system of declarations, scheduling decisions, enforcement mechanisms, and eviction policies that interact in ways that are non-obvious until you have been burned by them. Today we walk through that system from the ground up.

How a Pod Lands on a Node (The Scheduling Decision)

Before a pod runs anywhere, the scheduler must decide where it belongs. That decision is based entirely on resource requests, not limits, not actual usage.
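
As a minimal sketch (pod name and image are placeholders), this is the part of a spec the scheduler actually reads, the requests block, while the limits block is ignored until runtime:

    apiVersion: v1
    kind: Pod
    metadata:
      name: api-server                        # illustrative name
    spec:
      containers:
      - name: app
        image: registry.example.com/app:1.0   # placeholder image
        resources:
          requests:                 # the scheduler bin-packs against these
            cpu: 250m
            memory: 256Mi
          limits:                   # enforced at runtime, ignored for placement
            cpu: 500m
            memory: 512Mi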

The critical insight here is the gap between capacity and allocatable. A node with 4 CPU cores does not give 4000m to your pods. After the kubelet's reserved overhead (kube-reserved), system-reserved space for OS processes, and the eviction threshold buffer that Kubernetes keeps to ensure it can evict pods before the node fully collapses, you might have 3500m of allocatable CPU. This is why pods can go Pending on a node that looks half empty in monitoring dashboards that show actual utilization.
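
You can see that gap directly with kubectl describe node or kubectl get node <name> -o yaml; the figures below are illustrative, not taken from a real cluster:

    # Node status excerpt, trimmed to the relevant fields
    status:
      capacity:
        cpu: "4"              # what the machine has
        memory: 16Gi
      allocatable:
        cpu: 3500m            # what pods may actually request
        memory: 14Gi          # after kube-reserved, system-reserved, eviction thresholds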

The scheduler is also completely blind to what pods are actually consuming at the moment it makes its decision. If you have a node with 3500m requested but all those pods are idle at 50m actual usage, the scheduler still sees a nearly full node and routes new pods elsewhere. This is the requests vs actual divergence that causes mysterious pending pods on clusters that look like they have plenty of headroom.

CPU Throttling (What Limits Actually Do at Runtime)

Once a pod is scheduled and running, the limits become the runtime enforcement layer. CPU and memory limits behave in fundamentally different ways, and conflating them is one of the most common sources of production incidents.

This is the most important asymmetry in Kubernetes resource management: CPU throttling and memory OOM killing are completely different mechanisms with completely different outcomes.

CPU limits enforce a quota in the Linux CFS scheduler. Every 100ms period, a container gets a proportional slice of CPU time. A container with cpu: 500m gets 50ms of CPU quota per 100ms period. When it exhausts its quota for a period, it gets throttled: it cannot run until the next period starts, even if the node has abundant idle CPU sitting unused. The container stays alive. Jobs take longer. Latency climbs. The process does not know it is being throttled.
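
In cgroup terms, that 500m limit turns into a CFS quota of 50,000µs per 100,000µs period; a rough sketch of the mapping:

    resources:
      limits:
        cpu: 500m     # -> CFS quota of 50ms of CPU time per 100ms period
                      #    (cgroup v2: cpu.max = "50000 100000")
                      #    once the quota is spent, the container waits for the
                      #    next period even if the node has idle cores

Throttling shows up in cAdvisor's container_cpu_cfs_throttled_periods_total metric, which is usually the first place to look when latency climbs with no errors in the logs.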

Memory limits are binary. When a container's memory usage (in practice, mostly its RSS) reaches its memory limit, the kernel OOM killer terminates it immediately. No warning, no graceful shutdown, exit code 137. Kubernetes restarts it. If the memory leak is in the application, it will hit the limit again. This is CrashLoopBackOff.
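
The evidence lives in the container status; a sketch of what kubectl get pod <name> -o yaml typically shows after an OOM kill (values illustrative):

    containerStatuses:
    - name: app
      restartCount: 7          # climbing with every kill
      lastState:
        terminated:
          reason: OOMKilled
          exitCode: 137        # 128 + SIGKILL (9)
      state:
        waiting:
          reason: CrashLoopBackOff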

The QoS class that Kubernetes assigns to your pod determines eviction priority when a node comes under memory pressure. This is derived purely from how you configure requests and limits, not from any explicit QoS setting you specify. Setting requests == limits on both CPU and memory gives you Guaranteed class, which means the node will evict every Burstable and BestEffort pod before touching yours.
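
For example, a spec along these lines (name and image are placeholders) gets the Guaranteed class, which you can confirm with kubectl get pod <name> -o jsonpath='{.status.qosClass}':

    apiVersion: v1
    kind: Pod
    metadata:
      name: critical-worker              # illustrative name
    spec:
      containers:
      - name: app
        image: registry.example.com/app:1.0
        resources:
          requests:
            cpu: "1"
            memory: 1Gi
          limits:                        # identical to requests -> Guaranteed QoS
            cpu: "1"
            memory: 1Gi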

🔴 Get my DevOps & Kubernetes ebooks! (free for Premium Club and Personal Tier newsletter subscribers)


Upgrade to Paid to read the rest.

Become a paying subscriber to get access to this post and other subscriber-only content.


Paid subscriptions get you:

  • Access to archive of 250+ use cases
  • Deep Dive use case editions (Thursdays and Saturdays)
  • Access to Private Discord Community
  • Invitations to monthly Zoom calls for use case discussions and industry leaders meetups
  • Quarterly 1:1 'Ask Me Anything' power session
