How Kubernetes Observability Works Across Layers
TechOps Examples
Hey, it's Govardhana MK
Along with a use case deep dive, we cover remote job opportunities, top news, tools, and articles from the TechOps industry.
Before we begin... a big thank you to today's sponsor, HUBSPOT
Want to get the most out of ChatGPT?
ChatGPT is a superpower if you know how to use it correctly.
Discover how HubSpot's guide to AI can elevate both your productivity and creativity to get more things done.
Learn to automate tasks, enhance decision-making, and foster innovation with the power of AI.
Happy to bring you a trusted source that separates noise from knowledge.
Looking for unbiased, fact-based news? Join 1440 today.
Join over 4 million Americans who start their day with 1440, your daily digest for unbiased, fact-centric news. From politics to sports, we cover it all by analyzing over 100 sources. Our concise, 5-minute read lands in your inbox each morning at no cost. Experience news without the noise; let 1440 help you make up your own mind. Sign up now and invite your friends and family to be part of the informed.
IN TODAY'S EDITION
Use Case
How Kubernetes Observability Works Across Layers
Top News
Remote Jobs
Abhyaz is hiring a Cloud DevOps Engineer
Remote Location: Worldwide
Atlassian is hiring a Principal Data Platform Engineer
Remote Location: India
Resources
Reddit Threads
TOOL OF THE DAY
openinfraquote - Fast, open-source tool for estimating infrastructure costs from Terraform plans and state files.
USE CASE
How Kubernetes Observability Works Across Layers
In one of our production clusters, we had Prometheus, Grafana, and Fluentd, and still spent too long debugging incidents. The turning point wasn't more tools; it was wiring them together correctly and thinking in layers.
I've made this simple, self-explanatory illustration for anyone new to Kubernetes observability layers.

Download a high-resolution copy of this animated diagram here for future reference.
Coming back to the context, here's what made the difference.
1. Tie everything to workloads, not nodes
We tagged every log and metric with workload, namespace, and container.
That made it possible to trace issues end-to-end. Developers could see what failed, where, and why, without stepping into infra dashboards.
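If you want something concrete to start from, the relabeling below is a minimal sketch of that tagging on the metrics side, assuming Prometheus with Kubernetes pod discovery. The workload label is copied from an app.kubernetes.io/name pod label, which is an assumption about your labeling scheme; adjust to whatever convention your pods actually carry.

```yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Attach namespace, pod, and container so every series maps to a workload
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
      - source_labels: [__meta_kubernetes_pod_container_name]
        target_label: container
      # Assumed: pods carry an app.kubernetes.io/name label naming the workload
      - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
        target_label: workload
```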
2. Forward Kubernetes events, not just logs and metrics
We deployed kube-eventer and pushed events into Elasticsearch.
That surfaced OOMKills, CrashLoops, image pull errors, and pod evictions that metrics often miss and logs scatter. Events became our fastest source of early signals.
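For reference, a trimmed kube-eventer Deployment along those lines is below. The image tag, namespace, and Elasticsearch endpoint are placeholders, and the exact sink URL syntax is worth checking against the kube-eventer docs for the version you run; the service account needs RBAC to list and watch events.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-eventer
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-eventer
  template:
    metadata:
      labels:
        app: kube-eventer
    spec:
      serviceAccountName: kube-eventer      # needs RBAC to list/watch events
      containers:
        - name: kube-eventer
          image: registry.aliyuncs.com/acs/kube-eventer:v1.2.7   # placeholder tag
          command:
            - /kube-eventer
            - --source=kubernetes:https://kubernetes.default
            # Forward cluster events (OOMKilled, BackOff, FailedScheduling,
            # Evicted, image pull errors) into an Elasticsearch index
            - --sink=elasticsearch:http://elasticsearch.logging:9200?index=kube-events
```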
3. Alert routing based on ownership, not severity
We used Alertmanager matchers to route alerts by team.
Platform teams got node and network alerts. App teams got alerts scoped to their own workloads. This cut down alert fatigue and made on-call response faster and more focused.
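A simplified Alertmanager routing sketch in that spirit is below. It assumes alerts already carry a team label (set in your alert rules), and the team names and webhook receivers are made up for illustration.

```yaml
route:
  receiver: default
  group_by: ['alertname', 'namespace']
  routes:
    # Platform owns node, network, and control-plane alerts
    - matchers:
        - team = "platform"
      receiver: platform-oncall
    # App teams only see alerts carrying their own team label
    - matchers:
        - team = "payments"
      receiver: payments-oncall

receivers:
  - name: default
    webhook_configs:
      - url: 'http://alert-bridge.monitoring:8080/catchall'   # hypothetical bridge service
  - name: platform-oncall
    webhook_configs:
      - url: 'http://alert-bridge.monitoring:8080/platform'
  - name: payments-oncall
    webhook_configs:
      - url: 'http://alert-bridge.monitoring:8080/payments'
```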
4. Fluentd for structured forwarding
We used Fluentd with kubernetes_metadata_filter to enrich logs and forked them to both Loki and OpenSearch.
Why both?
Loki was used for quick, recent queries inside Grafana. Lightweight, fast, and tightly integrated with Kubernetes.
OpenSearch handled longer retention and full-text search. Perfect for audit logs, compliance, and historic analysis.
This combo gave us fast incident response and deep postmortem capability, without overloading a single system.
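Roughly, the pipeline looked like the ConfigMap below. It assumes the fluent-plugin-kubernetes_metadata_filter, fluent-plugin-grafana-loki, and fluent-plugin-opensearch plugins are installed in your Fluentd image; the endpoints, labels, and index prefix are illustrative.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-forwarding
  namespace: logging
data:
  fluent.conf: |
    # Enrich each record with namespace, pod, container, and pod labels
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>

    # Fork every enriched record to both backends
    <match kubernetes.**>
      @type copy
      <store>
        @type loki                        # fast, recent queries in Grafana
        url http://loki.monitoring:3100
        extra_labels {"cluster":"prod"}
      </store>
      <store>
        @type opensearch                  # long retention, full-text search
        host opensearch.logging
        port 9200
        logstash_format true
        logstash_prefix kubernetes
      </store>
    </match>
```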
5. Dashboards that match user context
We built scoped Grafana dashboards per team. Each team saw only their namespace, pods, and workloads.
This wasn't about hiding things; it was about clarity. Teams started using dashboards daily instead of once a week.
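One lightweight way to get there is folder-per-team dashboard provisioning, roughly as below. Folder names and paths are placeholders, and truly restricting data access still needs folder permissions or per-dashboard namespace variables on top of this.

```yaml
apiVersion: 1
providers:
  - name: team-platform
    folder: Platform            # node, network, cluster-level dashboards
    type: file
    options:
      path: /var/lib/grafana/dashboards/platform
  - name: team-payments
    folder: Payments            # dashboards pre-filtered to the payments namespace
    type: file
    options:
      path: /var/lib/grafana/dashboards/payments
```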
One change you can try today
Tag your logs and metrics consistently with workload and namespace.
That one step unlocked real observability for us and cut triage time nearly in half.
Kubernetes Port Forwarding Patterns
You're inside a dev cluster, access to the network is restricted, and the service you just deployed is acting up.
There's no LoadBalancer, no ingress controller in place, and opening up a port is not an option because the network team is…
– Govardhana Miriyala Kannaiah (@govardhana_mk)
3:09 PM • May 5, 2025