USMAN’S INSIGHTS
AI ARCHITECT
The Financial Lens: OpenCost and Kubecost Visibility
Previous Chapter: Right-Sizing with VPA
Next Chapter: FinOps Practices and Budget Alerts



© 2026 Muhammad Usman Akbar. All rights reserved.

Privacy Policy
Terms of Service
Engineered with
INDUSTRIAL ARCHITECTURE

OpenCost/Kubecost Visibility

Your Kubernetes cluster runs three namespaces: production (customer-facing APIs), staging (testing), and data-science (ML training jobs). At month-end, your cloud bill shows $15,000. Your CFO asks: "Which team is responsible for what portion of that cost?"

Without cost visibility, you're guessing. Maybe production is expensive because it runs 24/7. Maybe data-science is expensive because GPU nodes cost $3/hour. Maybe staging is wasting money running unused replicas. You could estimate based on node count, but that ignores the reality: pods share nodes, some pods are memory-heavy while others are CPU-heavy, and resource requests rarely match actual usage.

OpenCost solves this. It watches your cluster, tracks resource consumption per pod, and calculates cost allocation with precision. When your CFO asks "where's the money going?", you query the API: "Show me cost breakdown by namespace for the last 30 days." Seconds later: production ($9,200), data-science ($4,800), staging ($1,000). Now you can have a real conversation about whether that data-science spend is generating value.

This lesson teaches you to install OpenCost, understand its architecture, and query the allocation API to answer cost visibility questions.


OpenCost vs Kubecost: CNCF Open Source vs Commercial

Before diving into implementation, understand the landscape. Two tools dominate Kubernetes cost visibility, and they share the same core engine.

| Aspect | OpenCost | Kubecost |
| --- | --- | --- |
| License | Apache 2.0 (fully open source) | Freemium with paid tiers |
| CNCF Status | Incubating project | Commercial product |
| Pricing Accuracy | On-demand list prices | Includes discounts, credits, spot pricing |
| Multi-Cluster | Single cluster view | Unified multi-cluster view (paid) |
| Cost Allocation | Full namespace/label/pod support | Same, plus forecasting/anomaly detection |
| Best For | Learning, single clusters, budget-conscious teams | Enterprises needing discount reconciliation |

For this course: We use OpenCost because it's the CNCF standard and free to use. Everything you learn applies directly to Kubecost if your organization needs enterprise features.


OpenCost Architecture

OpenCost runs as a deployment in your cluster. It integrates with Prometheus to collect resource metrics and calculate costs.

text
┌─────────────────────────────────────────────────────────┐
│ Kubernetes Cluster                                      │
│                                                         │
│  ┌──────────────┐    metrics    ┌──────────────────┐    │
│  │  Prometheus  │ ◄──────────── │  Node Exporter   │    │
│  │              │               │  kube-state-     │    │
│  │              │               │  metrics         │    │
│  └──────┬───────┘               └──────────────────┘    │
│         │                                               │
│         │ PromQL queries                                │
│         ▼                                               │
│  ┌──────────────┐  /allocation  ┌──────────────┐        │
│  │   OpenCost   │ ◄──────────── │  Your Query  │        │
│  │              │    /assets    │  (curl, UI)  │        │
│  │  Port 9003   │               └──────────────┘        │
│  └──────────────┘                                       │
└─────────────────────────────────────────────────────────┘

How it works:

  1. Metrics collection: Prometheus scrapes resource metrics from your cluster (CPU, memory, network, storage usage per pod).
  2. Cost calculation: OpenCost queries Prometheus and applies pricing from cloud billing APIs using the formula: max(request, usage) * hourly_rate.
  3. Allocation API: You query OpenCost's HTTP API to retrieve cost data aggregated by namespace, label, controller, or pod.
  4. Prometheus integration: OpenCost exposes its own metrics, enabling Grafana dashboards and alerting.
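The pricing step above can be sketched in a few lines of Python. The rates below are made-up illustrations, not real cloud prices; in practice OpenCost pulls actual rates from cloud billing APIs.

```python
# Illustrative sketch of the allocation formula from step 2:
# cost = max(request, usage) * hourly_rate, summed per resource.
CPU_RATE = 0.031  # $ per vCPU-hour (illustrative, not a real price)
RAM_RATE = 0.004  # $ per GiB-hour (illustrative, not a real price)

def pod_hourly_cost(cpu_request, cpu_usage, ram_request_gib, ram_usage_gib):
    """Cost of one pod for one hour: billed on the larger of request and usage."""
    cpu_cost = max(cpu_request, cpu_usage) * CPU_RATE
    ram_cost = max(ram_request_gib, ram_usage_gib) * RAM_RATE
    return cpu_cost + ram_cost

# A pod requesting 2 vCPU but using only 0.5 still pays for 2:
cost = pod_hourly_cost(cpu_request=2.0, cpu_usage=0.5,
                       ram_request_gib=4.0, ram_usage_gib=1.0)
print(f"${cost:.4f}/hour")  # 2*0.031 + 4*0.004 = $0.0780/hour
```

Note the asymmetry: over-requesting costs you even when usage is low, and usage above the request is also billed, which is why right-sizing requests matters.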

Installing OpenCost

OpenCost requires Prometheus. If you followed Chapter 85, you already have kube-prometheus-stack installed.

Prerequisites

Verify Prometheus is running:

bash
kubectl get pods -n monitoring | grep prometheus

Install OpenCost with Helm

bash
# Add the OpenCost Helm repository
helm repo add opencost https://opencost.github.io/opencost-helm-chart
helm repo update

# Install OpenCost pointing to your Prometheus
helm install opencost opencost/opencost \
  --namespace opencost \
  --create-namespace \
  --set prometheus.internal.serviceName=prometheus-kube-prometheus-stack-prometheus \
  --set prometheus.internal.namespaceName=monitoring

Verify Installation

Check that OpenCost is running:

bash
kubectl get pods -n opencost

Port-forward to access the API:

bash
kubectl port-forward -n opencost svc/opencost 9003:9003

Test the API is responding:

bash
curl http://localhost:9003/allocation/compute \
  -G \
  -d window=1h \
  -d aggregate=namespace

Querying the Allocation API

The /allocation API is your primary interface for cost visibility. It answers: "How much did X cost during time window Y?"

Query Structure

bash
curl http://localhost:9003/allocation/compute \
  -G \
  -d window=<time-range> \
  -d aggregate=<grouping> \
  -d filter=<optional-filter>

| Parameter | Description | Examples |
| --- | --- | --- |
| window | Time range to query | 1h, 24h, 7d, 30d, lastweek |
| aggregate | How to group costs | namespace, label:team, controller, pod |
| filter | Limit to specific resources | namespace:production, label:app=task-api |
| shareIdle | Include idle costs | true, false |
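These parameters compose into an ordinary query string, so any HTTP client can build the request. A minimal Python sketch, assuming the port-forward from earlier; the endpoint and parameter names come from the table above:

```python
from urllib.parse import urlencode

OPENCOST = "http://localhost:9003"  # assumes kubectl port-forward is running

def allocation_url(window, aggregate, filter_=None, share_idle=None):
    """Build an /allocation/compute query URL from the documented parameters."""
    params = {"window": window, "aggregate": aggregate}
    if filter_ is not None:
        params["filter"] = filter_
    if share_idle is not None:
        params["shareIdle"] = "true" if share_idle else "false"
    return f"{OPENCOST}/allocation/compute?{urlencode(params)}"

print(allocation_url("7d", "namespace", share_idle=True))
# http://localhost:9003/allocation/compute?window=7d&aggregate=namespace&shareIdle=true
```

From here, fetching and decoding the JSON response (for example with `urllib.request` or `requests`) gives you the raw material for reports and dashboards.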

Query by Namespace

bash
curl http://localhost:9003/allocation/compute \
  -G \
  -d window=7d \
  -d aggregate=namespace \
  -d shareIdle=true

Query by Team Label

If your pods carry a team: product-team label, query by that label:

bash
curl http://localhost:9003/allocation/compute \
  -G \
  -d window=7d \
  -d aggregate=label:team

Filter to Specific Resources

bash
# Cost for the production namespace only
curl http://localhost:9003/allocation/compute \
  -G \
  -d window=7d \
  -d aggregate=pod \
  -d filter=namespace:production

Understanding Idle Cost

Idle cost is the money you're paying for resources that no one is using. It's the gap between what you're paying for and what workloads are actually consuming.

text
Idle Cost = Provisioned Cost - Allocated Cost

Why Idle Cost Matters

If you provision a 32GB RAM node but pods only request 16GB, you're paying for 16GB of idle memory. At $0.01/GB/hour, that's $115 wasted over a month.
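The arithmetic behind that figure, as a quick sketch (the $0.01/GB/hour rate is the illustrative one from the example above):

```python
# Idle memory cost: provisioned minus requested, priced per GB-hour.
provisioned_gb = 32
requested_gb = 16
rate_per_gb_hour = 0.01    # illustrative rate from the example above
hours_per_month = 24 * 30  # 720

idle_gb = provisioned_gb - requested_gb
monthly_idle_cost = idle_gb * rate_per_gb_hour * hours_per_month
print(f"${monthly_idle_cost:.2f}")  # $115.20 idle spend per month
```

Multiply this by every over-provisioned node in the cluster and idle cost quickly becomes the largest line item you can actually control.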

Querying Idle Cost

Use shareIdle=true to distribute idle costs across namespaces proportionally:

bash
# With idle cost distribution
curl http://localhost:9003/allocation/compute \
  -G \
  -d window=7d \
  -d aggregate=namespace \
  -d shareIdle=true

With shareIdle=true, each namespace's cost includes its proportional share of cluster-wide idle costs, giving an accurate picture of true cost.
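"Proportional" here means each namespace absorbs idle cost in ratio to its directly allocated cost. A minimal sketch of that distribution, using the namespace figures from the opening example plus a hypothetical $1,500 of idle cost (OpenCost performs this internally when shareIdle=true):

```python
def share_idle(allocated, idle_cost):
    """Distribute cluster-wide idle cost across namespaces in proportion
    to each namespace's directly allocated cost."""
    total = sum(allocated.values())
    return {ns: cost + idle_cost * (cost / total)
            for ns, cost in allocated.items()}

allocated = {"production": 9200.0, "data-science": 4800.0, "staging": 1000.0}
shared = share_idle(allocated, idle_cost=1500.0)
print(shared)
# production absorbs 9200/15000 of the $1,500 idle ($920), and so on.
```

The totals still sum to allocated plus idle spend, so nothing is hidden; the idle cost is simply attributed instead of left unowned.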

Reducing Idle Cost

  1. Right-size nodes: Use VPA recommendations to reduce over-provisioning.
  2. Use autoscaling: KEDA and HPA scale down during low-traffic periods.
  3. Bin-packing: Cluster autoscaler removes underutilized nodes.
  4. Review requests: If pods request 1GB but use 200MB, reduce the request.

Cost Attribution Labels

OpenCost can only report costs by dimensions it knows about. Labeling is essential for visibility.

Recommended Labels

Add these labels to all workloads:

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: task-api
  labels:
    team: product-team        # Which team owns this?
    app: task-api             # Which application is this?
    environment: production   # Production, staging, or development?
    cost-center: engineering  # Budget allocation
spec:
  template:
    metadata:
      labels:                 # Repeat on the pod template: cost allocation
        team: product-team    # is computed from pod labels, so labels on
        app: task-api         # the Deployment alone are not enough
        environment: production
        cost-center: engineering
| Label | Purpose | Query Example |
| --- | --- | --- |
| team | Team responsibility | aggregate=label:team |
| app | Application attribution | aggregate=label:app |
| environment | Environment separation | filter=label:environment=production |
| cost-center | Finance/billing | aggregate=label:cost-center |
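Since OpenCost can only slice by labels that exist, it helps to catch unlabeled workloads before they deploy. A hypothetical pre-deploy check (the required-label set mirrors the table above; real enforcement would more typically use an admission policy):

```python
# Required cost-attribution labels, mirroring the table above.
REQUIRED_LABELS = {"team", "app", "environment", "cost-center"}

def missing_cost_labels(manifest):
    """Return the required labels absent from a workload manifest dict
    (as parsed from YAML). Checks the pod template, where pods get them."""
    labels = (manifest.get("spec", {})
                      .get("template", {})
                      .get("metadata", {})
                      .get("labels", {}))
    return sorted(REQUIRED_LABELS - set(labels))

deploy = {"kind": "Deployment",
          "spec": {"template": {"metadata": {"labels": {
              "app": "task-api", "team": "product-team"}}}}}
print(missing_cost_labels(deploy))  # ['cost-center', 'environment']
```

Wiring a check like this into CI keeps attribution gaps from appearing in next month's report instead of this week's code review.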

From Visibility to Action: The FinOps Progression

OpenCost provides visibility. What you do with it follows the FinOps progression:

Stage 1: Showback

Report costs to teams without charging them, to build trust in the data.

  • Create weekly reports
  • Share dashboards
  • Validate data

Stage 2: Allocation

Map costs to business entities and connect to budgets.

  • Assign cost-center labels
  • Create finance reports
  • Track against budget

Stage 3: Chargeback

Formally bill internal teams for their usage.

  • Monthly invoices to cost centers
  • Usage-based internal billing
  • Accountability for over-budget spend

Try With AI

Test your ability to formulate queries and design labeling strategies.

Prompt 1 (Cost Query Design):

text
I have a Kubernetes cluster with 5 namespaces and 3 teams. Pods have labels: team, app, environment. Design OpenCost queries that answer:

1. Monthly cost by team
2. Cost breakdown for the 'ml-team' by application
3. Idle cost percentage across the cluster

Show the curl commands I would use.

Prompt 2 (Label Strategy):

text
I'm setting up cost attribution for a new Kubernetes cluster. We have 4 teams, 3 environments, and 15 applications. Design a labeling strategy:

- What labels should every resource have?
- How do I enforce these labels?
- What queries will these labels enable?

Prompt 3 (Cost Report Generation):

text
I want to create a weekly cost report for my team leads showing:

- Total cost for the week
- Cost breakdown by application
- Comparison to previous week
- Top 3 most expensive pods

Using the OpenCost API, design a script that generates this report.

Safety Note

Cost data can reveal business information (which products get investment, team sizes, scale). Restrict access to cost dashboards and APIs. In multi-tenant clusters, ensure teams only see their own cost data.