Right-sizing Guide
How to review and apply Kubeadapt right-sizing recommendations to your workloads.
Kubeadapt takes a request-centric approach to right-sizing: recommendations target resource requests (the primary cost driver), computed from actual usage percentiles with dynamic buffers for growth trends and usage variability. The values you apply already include built-in headroom.
Two default analysis profiles control how aggressive the recommendations are:
| Profile | CPU Percentile | Memory Percentile | Approach |
|---|---|---|---|
| Production | P99 | P99 | Conservative: keeps generous headroom above peak usage |
| Non-production | P50 | P50 | Aggressive: optimizes for cost, safe for non-production |
For the full breakdown of how recommendations are computed (buffer formulas, limit strategies, environment-specific policies), see Right-sizing Concepts.
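As a rough sketch of the percentile approach (the buffer multiplier and usage samples below are invented for illustration; Kubeadapt's actual formulas are in Right-sizing Concepts):

```python
import random

def percentile(samples, p):
    """Nearest-rank percentile of a list of usage samples."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, round(p / 100 * (len(s) - 1))))
    return s[k]

# Synthetic CPU usage samples in millicores (illustrative only).
random.seed(1)
usage = [random.gauss(300, 60) for _ in range(10_000)]

BUFFER = 1.15  # hypothetical headroom multiplier, not Kubeadapt's real formula

prod_request = percentile(usage, 99) * BUFFER     # conservative profile (P99)
nonprod_request = percentile(usage, 50) * BUFFER  # aggressive profile (P50)
print(f"production request:     {prod_request:.0f}m")
print(f"non-production request: {nonprod_request:.0f}m")
```

The same usage history yields a noticeably higher request under the production profile, which is the intended trade-off: more headroom at more cost.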
Prerequisites
- Resource requests and limits defined on your workloads (see Best Practices to find workloads missing them)
- kubectl access to your cluster
- A non-production environment for initial testing (recommended)
Step-by-Step Process
Step 1: Identify Candidates
- Go to Dashboard → Available Savings
- Filter by "Right-Sizing" category
- Sort by "Savings" to prioritize high-impact workloads
Step 2: Review the Recommendation
Click a workload to open the details view. Check the resource usage charts for:
- P99 Usage to understand spike patterns
- Volatility to gauge usage stability
- Incidents like OOM kills or CPU throttling
Then review the recommended values.
Growth trends and usage variability are already factored into the recommendation. Before applying, check for recent OOM kills, restarts, or unusual scaling events - these can skew the data.
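One way to script that pre-check over the output of kubectl get pods -o json; the pod status here is hand-written sample data, not real cluster output:

```python
import json

# Hand-written sample mimicking one item from `kubectl get pods -o json`.
pod_json = """
{
  "metadata": {"name": "api-7f9c"},
  "status": {
    "containerStatuses": [
      {
        "name": "api",
        "restartCount": 3,
        "lastState": {"terminated": {"reason": "OOMKilled"}}
      }
    ]
  }
}
"""

def suspicious(pod):
    """Flag container history that could skew the usage data behind a recommendation."""
    flags = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        reason = cs.get("lastState", {}).get("terminated", {}).get("reason")
        if reason == "OOMKilled":
            flags.append(f"{cs['name']}: recent OOM kill")
        if cs.get("restartCount", 0) > 0:
            flags.append(f"{cs['name']}: {cs['restartCount']} restarts")
    return flags

print(suspicious(json.loads(pod_json)))
```

If a workload shows flags like these, fix the underlying issue first; the usage history feeding the recommendation is tainted until the workload runs cleanly again.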
Step 3: Test in a Non-Production Environment
Apply the recommended values to a non-production deployment first. Monitor the workload in the Kubeadapt dashboard before touching production.
Resource utilization profiles vary by business. If your traffic has peak utilization at certain hours, spiky bursts spread unevenly throughout the day, or quiet periods followed by sudden load, waiting at least 24 hours lets Kubeadapt capture the full daily cycle including peak CPU and memory values. If traffic varies by day of the week, waiting 7 days gives our analyzer a complete weekly cycle to work with, resulting in a more accurate recommendation.
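A toy illustration of why the observation window matters, assuming a workload with a once-a-week batch spike (all numbers are synthetic):

```python
import random

random.seed(7)
# Synthetic memory usage in MiB at 1-minute resolution: six quiet days
# plus one day with a batch-job spike.
quiet_day = [random.gauss(500, 40) for _ in range(1440)]
spike_day = [random.gauss(900, 60) for _ in range(1440)]
week = spike_day + quiet_day * 6

def p99(samples):
    """Nearest-rank P99 of a sample list."""
    s = sorted(samples)
    return s[round(0.99 * (len(s) - 1))]

print(f"P99 from one quiet day: {p99(quiet_day):.0f} MiB")
print(f"P99 from the full week: {p99(week):.0f} MiB")
```

A recommendation computed from the quiet day alone would undersize the workload and risk OOM kills when the weekly spike returns; the full week captures it.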
Step 4: Apply to Production
Copy the recommended values from the workload details page:
| Resource | Current | Recommended |
|---|---|---|
| CPU Request | 1000m | 400m |
| CPU Limit | 2000m | 800m |
| Memory Request | 2Gi | 768Mi |
| Memory Limit | 4Gi | 1536Mi |
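To sanity-check the projected savings from a request change like the table above, the arithmetic is straightforward; the per-unit prices below are assumptions for illustration, not Kubeadapt figures (requests, not limits, drive the cost):

```python
# Illustrative on-demand prices; real numbers depend on your cloud and node types.
PRICE_PER_CPU_HOUR = 0.035   # USD per vCPU-hour (assumption)
PRICE_PER_GIB_HOUR = 0.005   # USD per GiB-hour (assumption)
HOURS_PER_MONTH = 730

def monthly_cost(cpu_millicores, mem_mib):
    """Monthly cost of the requested resources for one replica."""
    cpu = cpu_millicores / 1000 * PRICE_PER_CPU_HOUR
    mem = mem_mib / 1024 * PRICE_PER_GIB_HOUR
    return (cpu + mem) * HOURS_PER_MONTH

current = monthly_cost(1000, 2048)    # 1000m CPU, 2Gi memory requests
recommended = monthly_cost(400, 768)  # 400m CPU, 768Mi memory requests
print(f"current:     ${current:.2f}/mo per replica")
print(f"recommended: ${recommended:.2f}/mo per replica")
print(f"savings:     ${current - recommended:.2f}/mo per replica")
```

Multiply by the replica count to estimate the workload-level saving, then compare against the number Kubeadapt projects.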
Apply with kubectl patch:

```shell
kubectl patch deployment <name> -n <namespace> --type='json' -p='[
  {"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "400m"},
  {"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/memory", "value": "768Mi"},
  {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "800m"},
  {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/memory", "value": "1536Mi"}
]'
```

Or use kubectl set resources for a shorter command:
```shell
kubectl set resources deployment <name> -n <namespace> \
  --requests=cpu=400m,memory=768Mi \
  --limits=cpu=800m,memory=1536Mi
```

After applying, watch the rollout and check for pod restarts:
```shell
kubectl rollout status deployment/<name> -n <namespace>
```

Kubernetes 1.35 promoted in-place pod vertical scaling to stable (no feature gate required). If your cluster is on 1.35 or later, CPU and memory changes can be applied to running pods without triggering a rolling restart, which simplifies the right-sizing workflow. See Vertical Right-sizing and In-Place Updates for details.
If your organization has a DevOps or platform team that manages deployments through Helm or Kustomize with a GitOps controller (ArgoCD, Flux CD), coordinate with them before applying changes directly. Running the update through the GitOps workflow ensures an audit trail in Git and prevents the controller from reverting your manual patch on the next sync.
Step 5: Confirm Savings
Check the Kubeadapt dashboard over the following days to confirm the cost reduction matches what was projected.
The recommendation engine detects configuration changes automatically. Once the new resource values are applied and monthly savings drop below the policy threshold, recommendations for that workload stop until usage patterns change significantly.
Gradual Rollout
Don't try to optimize every workload at once. Spread the work over a few weeks:
- Start with non-production. Pick a handful of non-critical workloads in a non-production cluster. Apply recommendations, watch for a week, get comfortable with the process.
- Move to non-critical production workloads. Internal tools, batch jobs, low-traffic services, anything where a brief disruption has minimal business impact.
- Tackle critical production workloads. Apply recommendations one service at a time. Monitor each rollout before moving to the next.
- Finish up. Cover remaining workloads, tally total savings, and note anything you learned along the way.