Best practices

How Kubeadapt grades a cluster against Kubernetes best-practice checks, the categories the score covers, and how to triage findings.

A new engineer takes over a cluster from a team that has rotated off. They want one number that says "is this cluster well-configured?" before they commit to any right-sizing or Spot migration. The Best Practices page gives them that number, plus the list of every finding behind it, grouped by what to fix first.

What the score is

A compliance score from 0 to 100. Higher is better. The score is computed from the open findings in the cluster, weighted by severity. A cluster with zero open findings scores 100; a cluster with several critical findings scores in the 40s or below.

The UI colors the score by tone:

Score range	Tone	Interpretation
80–100	Green	Healthy. Address remaining warnings as time permits.
60–79	Amber	Mixed. Several warnings, possibly a critical or two.
0–59	Red	Multiple critical issues. Triage immediately.
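The banding in the table can be written as a small lookup. This is a minimal sketch: `tone_for_score` is a hypothetical helper, not part of Kubeadapt, and it assumes the score is an integer.

```python
def tone_for_score(score: int) -> str:
    """Map a 0-100 compliance score to the tone bands in the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 80:
        return "green"
    if score >= 60:
        return "amber"
    return "red"


print(tone_for_score(100))  # green
print(tone_for_score(79))   # amber
print(tone_for_score(59))   # red
```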

The score moves whenever the agent opens or closes a finding, so a workload that changes from running as root to setting runAsNonRoot: true lifts the score on the next snapshot.
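The Kubernetes field behind that example is `securityContext.runAsNonRoot`, which can be set on the pod or on an individual container; the container-level value takes precedence. The sketch below shows one way to evaluate it over a pod spec held as a plain dictionary. Both function names are hypothetical, and this is a simplification rather than Kubeadapt's actual check (it ignores `runAsUser`, init containers, and the user baked into the image).

```python
from typing import List


def effective_run_as_non_root(pod_spec: dict, container: dict) -> bool:
    """Return the runAsNonRoot value that applies to one container.

    A value in the container's securityContext overrides the pod-level one.
    An unset field is treated as False (root is not ruled out).
    """
    pod_ctx = pod_spec.get("securityContext") or {}
    ctr_ctx = container.get("securityContext") or {}
    if "runAsNonRoot" in ctr_ctx:
        return bool(ctr_ctx["runAsNonRoot"])
    return bool(pod_ctx.get("runAsNonRoot", False))


def containers_allowed_to_run_as_root(pod_spec: dict) -> List[str]:
    """List the names of containers that are not forced to run as non-root."""
    return [
        container["name"]
        for container in pod_spec.get("containers", [])
        if not effective_run_as_non_root(pod_spec, container)
    ]


before = {"containers": [{"name": "api"}]}
after = {"securityContext": {"runAsNonRoot": True}, "containers": [{"name": "api"}]}

print(containers_allowed_to_run_as_root(before))  # ['api']
print(containers_allowed_to_run_as_root(after))   # []
```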

The check categories

Findings are organized into six categories. Each category contains multiple checks; each check produces one finding per affected resource.

Security — pod security context, capabilities, host namespaces, network policies, service account hygiene. 11 checks ranging from Container Running as Root (critical) to Service Account Token Auto-mounted (info).
Reliability — replicas, PodDisruptionBudgets, anti-affinity, topology spread, readiness/liveness/startup probes, termination grace periods. 9 checks. Anything missing here trades availability for simplicity.
Resource configuration — requests, limits, LimitRange, ResourceQuota, init-container resources, and storage-related checks (EmptyDir without size limit, HostPath volume use). 8 checks. The category Kubeadapt cares about most because right-sizing depends on it.
Workload configuration — Recreate update strategy, excessively long termination grace periods. 2 checks at the deployment-strategy level.
Scheduling and placement — placement constraints, wildcard tolerations. 2 checks at the pod-scheduling level.
Namespace organization — workloads in the default namespace, missing owner annotations. 2 checks at the cluster-hygiene level.

Each check has a stable ID (SEC-001, REL-001, RES-001, and so on) and a fixed severity. The IDs are stable across releases so you can grep for them in your own CI checks or runbooks.
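Because the IDs are stable, a CI job or runbook linter can pick them out of free text. The sketch below assumes every ID has the shape of the three shown here (three uppercase letters, a hyphen, three digits); the prefixes of the other categories are not listed on this page, and the pattern will also match unrelated tokens of the same shape, such as issue-tracker keys. `referenced_check_ids` is a hypothetical helper.

```python
import re
from typing import List

# Assumed ID shape, inferred from SEC-001, REL-001, and RES-001.
CHECK_ID = re.compile(r"\b[A-Z]{3}-\d{3}\b")


def referenced_check_ids(text: str) -> List[str]:
    """Return the distinct check IDs mentioned in a piece of text, sorted."""
    return sorted(set(CHECK_ID.findall(text)))


runbook = "Before a Spot move, clear REL-001 and SEC-001. REL-001 blocks draining."
print(referenced_check_ids(runbook))  # ['REL-001', 'SEC-001']
```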

Severities

Three levels, each with consistent semantics:

Severity	Triage expectation
`critical`	Real security or availability impact. Fix the same week. Single-replica deployments, no readiness probe, privileged containers, no resource requests.
`warning`	Best practice not followed. Fix during normal sprint planning. No PDB, no liveness probe, no network policy, writable root filesystem.
`info`	Recommendation, often situational. Address opportunistically. No startup probe, CPU limit equals request, missing owner annotation.

A critical finding (a privileged container) costs the score more than an info finding (a missing owner annotation). The exact weighting is internal; the takeaway is that critical findings move the score and info findings barely move it.
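To make the idea of severity weighting concrete, here is a toy model. The penalty values and the subtract-from-100 formula are invented for illustration and are not Kubeadapt's internal weighting; the only properties taken from this page are that zero counted findings gives 100 and that a critical finding costs far more than an info finding.

```python
from typing import Iterable

# Hypothetical penalties per counted finding. NOT the product's real weights.
ASSUMED_PENALTY = {"critical": 15, "warning": 3, "info": 1}


def illustrative_score(counted_severities: Iterable[str]) -> int:
    """Toy score: start at 100, subtract a penalty per finding, floor at 0."""
    penalty = sum(ASSUMED_PENALTY[severity] for severity in counted_severities)
    return max(0, 100 - penalty)


print(illustrative_score([]))            # 100
print(illustrative_score(["critical"]))  # 85
print(illustrative_score(["info"]))      # 99
```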

How findings are presented

A finding is one check against one resource. The check is REL-001 — Single Replica Deployment. The resource is payment-service in namespace production. That combination becomes one row.

Each finding carries:

Title and description — what the check looks for and why it matters.
Severity — critical / warning / info.
Resource — name, kind (Deployment, StatefulSet, DaemonSet, CronJob, Job), namespace.
Status — open (active, counted in the score), acknowledged (seen by a human, not yet fixed, still counted), closed (resolved, no longer counted).
Created at — when Kubeadapt first observed the violation.
Remediation — a short, numbered list of the changes to make. Most are 2–4 lines.
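If you mirror findings into your own tooling, the fields above map naturally onto a small record type. This is a sketch of one possible shape, not Kubeadapt's schema: the class name, field names, and the sample description, timestamp, and remediation text are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Finding:
    check_id: str          # e.g. "REL-001"
    title: str
    description: str
    severity: str          # "critical", "warning", or "info"
    kind: str              # Deployment, StatefulSet, DaemonSet, CronJob, Job
    name: str
    namespace: str
    status: str            # "open", "acknowledged", or "closed"
    created_at: datetime   # when the violation was first observed
    remediation: List[str] = field(default_factory=list)

    def counts_toward_score(self) -> bool:
        """Open and acknowledged findings are counted; closed ones are not."""
        return self.status in ("open", "acknowledged")


finding = Finding(
    check_id="REL-001",
    title="Single Replica Deployment",
    description="Illustrative text: the workload runs one replica.",
    severity="critical",
    kind="Deployment",
    name="payment-service",
    namespace="production",
    status="acknowledged",
    created_at=datetime(2024, 1, 1, tzinfo=timezone.utc),  # placeholder timestamp
    remediation=["Illustrative step: raise spec.replicas above 1."],
)
print(finding.counts_toward_score())  # True
```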

The detail panel shows the YAML snippet that triggered the check, the remediation steps, and links to the Kubernetes documentation page for the underlying concept.

How to act on findings

A useful triage order:

Start with critical, open. Filter status to open and severity to critical. Sort by check ID so findings from the same check sit together; fixing them in a batch is faster than one at a time.
Group by check, not by resource. Twelve resources missing readiness probes is one PR if they all share a Helm chart. Pick a category (Reliability), pick a check (No Readiness Probe), see all affected workloads.
Acknowledge the long tail. Findings you'll address but not this week: open the row, click Acknowledge. The finding stays counted in the score (it's still a real problem), but it's marked as seen so the queue stops looking unattended.
Re-score after deploying. The agent reports the change on the next snapshot. The finding flips to closed and the score adjusts within minutes.
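The first two steps of that triage order can be reproduced outside the UI if you have the findings as data. The sketch below assumes each finding is a dictionary with `check_id`, `severity`, `status`, `namespace`, and `name` keys; those key names and every sample row except payment-service are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def triage_batches(findings: List[dict]) -> Dict[str, List[str]]:
    """Group open critical findings by check ID, ordered by check ID."""
    batches = defaultdict(list)
    for finding in findings:
        if finding["status"] == "open" and finding["severity"] == "critical":
            batches[finding["check_id"]].append(
                f"{finding['namespace']}/{finding['name']}"
            )
    # One entry per check, so each entry can become one batch fix.
    return {check_id: sorted(batches[check_id]) for check_id in sorted(batches)}


findings = [
    {"check_id": "SEC-001", "severity": "critical", "status": "open",
     "namespace": "production", "name": "payment-service"},
    {"check_id": "REL-001", "severity": "critical", "status": "open",
     "namespace": "production", "name": "payment-service"},
    {"check_id": "REL-001", "severity": "critical", "status": "open",
     "namespace": "production", "name": "checkout"},
    {"check_id": "REL-001", "severity": "critical", "status": "acknowledged",
     "namespace": "staging", "name": "checkout"},
]

for check_id, resources in triage_batches(findings).items():
    print(check_id, resources)
# REL-001 ['production/checkout', 'production/payment-service']
# SEC-001 ['production/payment-service']
```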

What the page shows

Per-cluster page at Clusters → [cluster] → Best Practices. Top of the page:

Compliance Score — <n> / 100, colored by tone.
Critical Issues — count of severity: critical open findings.
Warnings — count of severity: warning open findings.
Info — count of severity: info open findings.

The body splits into a category sidebar on the left and a check / findings list on the right. Select a category to see the checks inside it; select a check to see every affected resource. The status filter (open / acknowledged / closed) applies globally so you can switch from "what's broken today" to "what's been resolved this quarter" without losing your place.

What's not here

No automatic remediation. Each finding lists the steps; the change happens in your manifests.
No custom checks. The check set ships with the product. If you want cluster-specific rules, run them in CI alongside the Best Practices feed.
No multi-cluster comparison. Scores are per-cluster. To see your fleet, look at each cluster's score from the Clusters list.
No drift detection. Kubeadapt reports the current state. A workload that was compliant yesterday and isn't today shows up as a new open finding, not as a "drifted" finding.
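For the "No custom checks" point, a cluster-specific rule can live in CI as a small script over your manifests. The rule below (flag container images with no tag or a `:latest` tag) is only an example of such a rule and is not one of the checks listed on this page; `images_without_pinned_tag` is a hypothetical helper that expects a Deployment-style manifest already parsed into a dictionary.

```python
from typing import List


def images_without_pinned_tag(manifest: dict) -> List[str]:
    """Return container images that use no tag or the ':latest' tag."""
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    flagged = []
    for container in pod_spec.get("containers", []):
        image = container.get("image", "")
        # Inspect only the part after the last '/', so a registry port
        # such as 'registry.local:5000/app' is not mistaken for a tag.
        last_segment = image.rsplit("/", 1)[-1]
        if "@" in last_segment:
            continue  # pinned by digest
        if ":" not in last_segment or last_segment.endswith(":latest"):
            flagged.append(image)
    return flagged


deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "api", "image": "registry.local:5000/api"},
        {"name": "proxy", "image": "nginx:1.27"},
        {"name": "sidecar", "image": "example/sidecar:latest"},
    ]}}},
}
print(images_without_pinned_tag(deployment))
# ['registry.local:5000/api', 'example/sidecar:latest']
```

A CI step would exit non-zero when the returned list is non-empty.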

Next steps

Right-sizing — RES-001 and RES-002 are direct inputs to right-sizing recommendations.
Spot migration — REL-001 and REL-002 are the same blockers Spot Migration calls out.

