
March 10, 2026

Improvement

Introducing kubeadapt-agent: Lightweight Kubernetes Metrics Collector, Rebuilt from Scratch

The original agent is now deprecated. kubeadapt-agent drops every heavyweight dependency, needs only metrics-server, auto-discovers GPU workloads, and runs as a single Deployment.

Ugurcan Caykara

Why we rebuilt the agent

The original Kubeadapt agent required Prometheus, OpenCost, node-exporter, and kube-state-metrics before it could collect a single metric. Every cluster needed a full monitoring stack installed up front. Setup was slow, RBAC was complex, and upgrades broke things.

kubeadapt-agent is a ground-up rewrite with a single dependency: metrics-server.

What changed

No more Prometheus, OpenCost, node-exporter, kube-state-metrics

All four dependencies are gone. kubeadapt-agent reads resource usage directly from the Kubernetes Metrics API. The only prerequisite is metrics-server.
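As a rough illustration of what reading from the Metrics API involves, the sketch below parses a PodMetricsList payload in the shape served by metrics-server (`metrics.k8s.io/v1beta1`) and sums container usage per pod. The sample payload, the pod names, and the helper names `parse_quantity` and `summarize` are made up for this example; this is not the agent's actual code.

```python
import json
from decimal import Decimal

# Suffix multipliers for Kubernetes resource quantities (a common subset).
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
    "k": Decimal("1e3"), "M": Decimal("1e6"), "G": Decimal("1e9"),
}


def parse_quantity(q):
    """Convert a quantity such as '250m' or '64Mi' to base units."""
    for suffix, factor in _BINARY.items():
        if q.endswith(suffix):
            return Decimal(q[:-2]) * factor
    if q and q[-1] in _DECIMAL:
        return Decimal(q[:-1]) * _DECIMAL[q[-1]]
    return Decimal(q)


def summarize(pod_metrics_list):
    """Map 'namespace/name' to (cpu_cores, memory_bytes) summed over containers."""
    totals = {}
    for item in pod_metrics_list["items"]:
        meta = item["metadata"]
        cpu = sum((parse_quantity(c["usage"]["cpu"]) for c in item["containers"]), Decimal(0))
        mem = sum((parse_quantity(c["usage"]["memory"]) for c in item["containers"]), Decimal(0))
        totals[f'{meta["namespace"]}/{meta["name"]}'] = (cpu, int(mem))
    return totals


# Hypothetical response body; real values come from
# GET /apis/metrics.k8s.io/v1beta1/pods
SAMPLE = json.loads("""
{
  "kind": "PodMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "items": [
    {
      "metadata": {"name": "web-7d4b9", "namespace": "shop"},
      "containers": [
        {"name": "app", "usage": {"cpu": "250m", "memory": "64Mi"}},
        {"name": "sidecar", "usage": {"cpu": "12500000n", "memory": "8192Ki"}}
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    for pod, (cpu, mem) in summarize(SAMPLE).items():
        print(pod, f"{cpu} cores", f"{mem} bytes")
```

CPU is reported in cores and memory in bytes so that values with different suffixes can be added safely.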

GPU metrics, automatically

kubeadapt-agent detects NVIDIA GPUs at startup. If dcgm-exporter is running anywhere in your cluster, the agent finds it and starts collecting GPU utilization, memory, temperature, and power draw per device. See the GPU monitoring guide for details.
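dcgm-exporter publishes its readings in the Prometheus text format, so per-device collection amounts to scraping that endpoint and grouping samples by GPU. The sketch below assumes the exporter's commonly enabled field names (`DCGM_FI_DEV_GPU_UTIL`, `DCGM_FI_DEV_FB_USED`, `DCGM_FI_DEV_GPU_TEMP`, `DCGM_FI_DEV_POWER_USAGE`) and its `UUID` label; the sample text and the function name `parse_gpu_metrics` are invented for illustration, and the agent's real parser may differ.

```python
import re
from collections import defaultdict

# DCGM field names mapped to friendlier keys.
FIELDS = {
    "DCGM_FI_DEV_GPU_UTIL": "utilization_pct",
    "DCGM_FI_DEV_FB_USED": "memory_used_mib",
    "DCGM_FI_DEV_GPU_TEMP": "temperature_c",
    "DCGM_FI_DEV_POWER_USAGE": "power_w",
}

# name{label="value",...} value
SAMPLE_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)\{(.*)\}\s+(\S+)')
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_gpu_metrics(text):
    """Group the four tracked DCGM fields by GPU UUID."""
    gpus = defaultdict(dict)
    for line in text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if not match or match.group(1) not in FIELDS:
            continue  # comments, blank lines, and metrics we do not track
        labels = dict(LABEL.findall(match.group(2)))
        uuid = labels.get("UUID")
        if uuid is None:
            continue
        gpus[uuid][FIELDS[match.group(1)]] = float(match.group(3))
    return dict(gpus)


# Made-up scrape output for two devices on one node.
SAMPLE = """\
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",UUID="GPU-aaaa",Hostname="node-1"} 87
DCGM_FI_DEV_GPU_UTIL{gpu="1",UUID="GPU-bbbb",Hostname="node-1"} 3
DCGM_FI_DEV_FB_USED{gpu="0",UUID="GPU-aaaa",Hostname="node-1"} 10240
DCGM_FI_DEV_GPU_TEMP{gpu="0",UUID="GPU-aaaa",Hostname="node-1"} 64
DCGM_FI_DEV_POWER_USAGE{gpu="0",UUID="GPU-aaaa",Hostname="node-1"} 231.5
"""

if __name__ == "__main__":
    for uuid, readings in parse_gpu_metrics(SAMPLE).items():
        print(uuid, readings)
```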

GPU Sharing Limitations

GPU time-slicing, MPS, and MIG configurations have limited per-workload attribution due to DCGM Exporter constraints. In shared-GPU setups, GPU right-sizing works only at the node level. See the GPU monitoring guide for specifics.

Real-time collection

The old agent re-fetched every resource on a fixed interval, pulling the same data over and over. kubeadapt-agent uses Kubernetes watch connections: it syncs once at startup, then receives only change events in real time. When a pod is created or removed, or a node joins, the agent receives the event as soon as the API server publishes it.
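The list-then-watch pattern can be sketched in a few lines. The example below keeps a local cache that is loaded once from a LIST response and then updated from watch events (`ADDED`, `MODIFIED`, `DELETED`, `BOOKMARK`), which is the event shape the Kubernetes API streams. The class name `ResourceCache`, the helper `pod`, and the sample objects are hypothetical; the agent itself presumably relies on a client library's informers rather than code like this.

```python
class ResourceCache:
    """Local view of one resource type, kept current from watch events."""

    def __init__(self):
        self.items = {}
        self.resource_version = ""

    @staticmethod
    def key(obj):
        meta = obj["metadata"]
        return f'{meta.get("namespace", "")}/{meta["name"]}'

    def replace(self, listing):
        """Initial sync: load a full LIST response once."""
        self.items = {self.key(o): o for o in listing["items"]}
        self.resource_version = listing["metadata"]["resourceVersion"]

    def apply(self, event):
        """Apply one watch event and remember where to resume from."""
        obj = event["object"]
        if event["type"] in ("ADDED", "MODIFIED"):
            self.items[self.key(obj)] = obj
        elif event["type"] == "DELETED":
            self.items.pop(self.key(obj), None)
        # BOOKMARK events only advance the resourceVersion; nothing is stored.
        self.resource_version = obj.get("metadata", {}).get(
            "resourceVersion", self.resource_version
        )


def pod(name, version, phase="Running"):
    """Build a minimal, made-up Pod object for the demo."""
    return {
        "kind": "Pod",
        "metadata": {"namespace": "shop", "name": name, "resourceVersion": version},
        "status": {"phase": phase},
    }


if __name__ == "__main__":
    cache = ResourceCache()
    cache.replace({"metadata": {"resourceVersion": "100"}, "items": [pod("web-a", "90")]})
    for event in [
        {"type": "ADDED", "object": pod("web-b", "101", phase="Pending")},
        {"type": "MODIFIED", "object": pod("web-b", "102")},
        {"type": "DELETED", "object": pod("web-a", "103")},
    ]:
        cache.apply(event)
    print(sorted(cache.items), cache.resource_version)
```

After the initial sync, the work done is proportional to the number of changes, not to the size of the cluster, which is the main advantage over periodic full lists.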

It tracks 20+ resource types by default: nodes, pods, deployments, statefulsets, daemonsets, jobs, HPAs, VPAs, persistent volumes, and more.

Auto-discovery

On startup the agent probes the cluster and adapts to what's available:

  • Cloud provider (AWS, GCP, Azure)
  • VPA and Karpenter if installed
  • GPU nodes and NVIDIA metrics exporters
  • metrics-server availability
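One plausible way to implement probes like these is to inspect data the API server already exposes. The sketch below assumes the cloud provider is inferred from each node's `spec.providerID` scheme, that VPA, Karpenter, and metrics-server are detected from registered API groups (`autoscaling.k8s.io`, `karpenter.sh`, `metrics.k8s.io`), and that GPU nodes advertise the `nvidia.com/gpu` resource. The function names and sample inputs are illustrative only; the agent's real detection logic is not described in this post.

```python
CLOUD_BY_SCHEME = {"aws": "AWS", "gce": "GCP", "azure": "Azure"}


def detect_cloud(provider_id):
    """Infer the cloud from a node's spec.providerID, e.g. 'aws:///zone/i-123'."""
    scheme = provider_id.split("://", 1)[0]
    return CLOUD_BY_SCHEME.get(scheme, "unknown")


def has_gpu(node):
    """True when the node advertises at least one allocatable NVIDIA GPU."""
    allocatable = node.get("status", {}).get("allocatable", {})
    return int(allocatable.get("nvidia.com/gpu", "0")) > 0


def probe(api_groups, nodes):
    """Summarize optional features from API group names and node objects."""
    provider_ids = [n.get("spec", {}).get("providerID", "") for n in nodes]
    return {
        "cloud": detect_cloud(provider_ids[0]) if provider_ids else "unknown",
        "vpa": "autoscaling.k8s.io" in api_groups,
        "karpenter": "karpenter.sh" in api_groups,
        "metrics_server": "metrics.k8s.io" in api_groups,
        "gpu_nodes": sum(1 for n in nodes if has_gpu(n)),
    }


# Made-up cluster state for the demo.
SAMPLE_GROUPS = {"apps", "batch", "metrics.k8s.io", "karpenter.sh"}
SAMPLE_NODES = [
    {
        "spec": {"providerID": "aws:///us-east-1a/i-0abc"},
        "status": {"allocatable": {"cpu": "8", "nvidia.com/gpu": "1"}},
    },
    {
        "spec": {"providerID": "aws:///us-east-1b/i-0def"},
        "status": {"allocatable": {"cpu": "4"}},
    },
]

if __name__ == "__main__":
    print(probe(SAMPLE_GROUPS, SAMPLE_NODES))
```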

Pre-computed enrichment

Before each snapshot is sent, the agent enriches raw metrics automatically:

  • Workload mapping: Every pod is traced back to its parent deployment, statefulset, or daemonset. No orphaned metrics.
  • Resource totals: Cluster-wide CPU, memory, GPU, and storage usage are pre-calculated and ready for the dashboard.
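Workload mapping of this kind is typically done by following `metadata.ownerReferences` upward, for example Pod to ReplicaSet to Deployment, and then rolling usage up by the resolved owner. The sketch below shows that idea on made-up objects; the helper names, the UIDs, and the millicore figures are assumptions for illustration, not the agent's actual enrichment code.

```python
def controller_ref(obj):
    """Return the owner reference marked as controller, if any."""
    for ref in obj.get("metadata", {}).get("ownerReferences", []):
        if ref.get("controller"):
            return ref
    return None


def top_level_owner(obj, by_uid):
    """Follow controller references upward; return (kind, name) of the root."""
    kind, name = obj["kind"], obj["metadata"]["name"]
    seen = set()
    ref = controller_ref(obj)
    while ref and ref["uid"] not in seen:
        seen.add(ref["uid"])  # guards against reference cycles
        kind, name = ref["kind"], ref["name"]
        parent = by_uid.get(ref["uid"])
        ref = controller_ref(parent) if parent else None
    return kind, name


def totals_by_workload(pods, cpu_millicores, by_uid):
    """Sum per-pod CPU usage under each pod's top-level owner."""
    totals = {}
    for p in pods:
        owner = top_level_owner(p, by_uid)
        totals[owner] = totals.get(owner, 0) + cpu_millicores[p["metadata"]["name"]]
    return totals


def owned_pod(name, rs_uid):
    ref = {"kind": "ReplicaSet", "name": "web-7d4b9", "uid": rs_uid, "controller": True}
    return {"kind": "Pod", "metadata": {"name": name, "ownerReferences": [ref]}}


# Made-up objects: two pods behind a Deployment, plus one standalone pod.
REPLICA_SET = {
    "kind": "ReplicaSet",
    "metadata": {
        "name": "web-7d4b9",
        "ownerReferences": [
            {"kind": "Deployment", "name": "web", "uid": "dep-1", "controller": True}
        ],
    },
}
BY_UID = {"rs-1": REPLICA_SET}
PODS = [
    owned_pod("web-7d4b9-x", "rs-1"),
    owned_pod("web-7d4b9-y", "rs-1"),
    {"kind": "Pod", "metadata": {"name": "debug"}},
]
USAGE = {"web-7d4b9-x": 250, "web-7d4b9-y": 150, "debug": 20}

if __name__ == "__main__":
    per_workload = totals_by_workload(PODS, USAGE, BY_UID)
    print(per_workload, "cluster total:", sum(per_workload.values()))
```

A pod with no controller resolves to itself, so its usage still lands under a named owner instead of being dropped.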

Before and after

| | Old agent | kubeadapt-agent |
|---|---|---|
| External dependencies | 4 (Prometheus, OpenCost, node-exporter, kube-state-metrics) | 1 (metrics-server) |
| GPU support | Manual config | Auto-discovered |
| Data collection | Periodic full-list API calls | Informer-based (real-time) |
| Cloud/VPA/Karpenter | Manual or N/A | Auto-detected |
| Health reporting | Basic | Full diagnostics per snapshot |

Migration

Breaking Change

This is a 2.0 release on the same Helm chart. The old agent binary is fully replaced. Run helm upgrade to migrate. The old agent's Prometheus/OpenCost dependencies can be removed after upgrade if nothing else uses them.

Follow the quick-start guide for installation steps.

Tags: agent, kubernetes, gpu, performance, breaking-change