Rules
The shape of an alert rule in Kubeadapt — scope, enabled types, per-type conditions, and the rule states that decide whether it fires.
You're setting up alerting for a new cluster and you need to know exactly what a Rule contains before you fill in any forms. This page lays out every field, what it controls, and how one Rule can fire on several conditions simultaneously without becoming unmanageable.
What a Rule is
A Rule is a named bundle of four things: a scope (where to look), an enabled-types slot collection (what to look for), a notification policy reference (where to send the alert), and a state (whether the rule runs and whether its notifications go out). One Rule produces zero or more incidents over its lifetime.
A Rule does not specify a channel directly. Channel routing lives on the Policy — see Policies.
The fields
A Rule has the following user-editable fields:
- Name — short label, shown everywhere. "Production cost spike", "Finance budget — eu-prod", "Idle resources — staging".
- Scope — `kind` (organization / cluster / namespace / team / department) plus the ID and a human-readable breadcrumb. The scope is fixed at create time; to retarget, create a new rule.
- Enabled types — a slot per alert type (`cost_spike`, `budget_threshold`, `unused_resources`, `new_expensive_workload`). Each slot is either `null` (type disabled) or carries that type's conditions. At least one slot must be non-null.
- Policy — the Policy that routes this rule's incidents to channels.
- Notify on resolve — whether to send a resolution notification when the rule's condition clears.
Of those, the enabled types field deserves a closer look.
One rule, several types
The enabled-types collection lets one Rule fire on more than one condition without duplicating the scope or the channel routing. Example: a single "Production cluster — finance" Rule can enable both cost_spike (for daily spike alerts) and budget_threshold (for monthly budget warnings), each with its own thresholds, both routed to the same Slack channel via one Policy.
Each non-null slot stores the conditions for that type. They're independent — turning off cost_spike does not clear the budget_threshold thresholds. The validation rule is "at least one non-null slot"; everything else is optional.
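The slot model can be sketched as a small data structure. This is a minimal illustration, not Kubeadapt's actual schema: the class names, the attribute names (`policy_id`, `scope_id`, and so on), and the keys inside each conditions dictionary are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnabledTypes:
    """One slot per alert type; None means the type is disabled."""
    cost_spike: Optional[dict] = None
    budget_threshold: Optional[dict] = None
    unused_resources: Optional[dict] = None
    new_expensive_workload: Optional[dict] = None

    def active_types(self) -> list[str]:
        # Names of the slots that currently carry conditions.
        return [name for name, slot in vars(self).items() if slot is not None]


@dataclass
class Rule:
    # Field names are illustrative, not Kubeadapt's real schema.
    name: str
    scope_kind: str
    scope_id: str
    policy_id: str
    enabled_types: EnabledTypes
    notify_on_resolve: bool = False

    def validate(self) -> None:
        if not self.enabled_types.active_types():
            raise ValueError("at least one enabled-types slot must be non-null")


rule = Rule(
    name="Production cluster finance",
    scope_kind="cluster",
    scope_id="prod-1",          # hypothetical ID
    policy_id="finance-slack",  # hypothetical Policy reference
    enabled_types=EnabledTypes(
        cost_spike={"sensitivity": "sustained"},
        budget_threshold={"amount": 10_000, "period": "monthly", "thresholds": [80, 100]},
    ),
)
rule.validate()

# Disabling one type leaves the other slot's conditions untouched.
rule.enabled_types.cost_spike = None
print(rule.enabled_types.active_types())
```

Keeping each type in its own nullable slot is what lets the scope and the Policy reference be stated once per Rule.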
Per-type conditions
Cost spike
The only configurable field is Sensitivity mode:
| Mode | Behavior |
|---|---|
| `sustained` | Only fires if the spike holds across two consecutive completed buckets. Quiet, fewer false positives. Default. |
| `immediate` | Fires on every qualifying jump, including brief ones. Noisy, useful for small clusters where one bad day matters. |
The threshold and the floor that decide what counts as "qualifying" are derived from the scope's spending pattern automatically — nothing to tune on the rule. The rule editor shows a banner explaining this when you enable the type. See Smart Alerting overview for the baseline math.
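The difference between the two modes can be illustrated with a short sketch. It assumes something upstream has already decided, for each completed bucket, whether the jump qualifies; the derived threshold and floor are not modeled here, and the function name `should_fire` is hypothetical.

```python
def should_fire(mode: str, qualifying: list[bool]) -> bool:
    """Decide whether the latest completed bucket triggers a cost-spike alert.

    `qualifying` holds one flag per completed bucket, oldest first. Each flag
    says whether that bucket's jump cleared the derived threshold and floor.
    """
    if not qualifying:
        return False
    if mode == "immediate":
        # A single qualifying bucket is enough.
        return qualifying[-1]
    if mode == "sustained":
        # The spike must hold across the two most recent completed buckets.
        return len(qualifying) >= 2 and qualifying[-1] and qualifying[-2]
    raise ValueError(f"unknown sensitivity mode: {mode}")
```

With a history of `[False, True]`, this sketch fires in `immediate` mode but stays quiet in `sustained` mode until the next bucket also qualifies.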
Budget threshold
Five fields:
- Budget amount ($) — the period total you've allocated to this scope.
- Period — `monthly`, `quarterly`, or `fiscal year`. Monthly resets on the 1st, quarterly on calendar quarter, fiscal on your org's fiscal year start.
- Cost mode — `loaded` (CPU + RAM + GPU + overhead) or `workload_only` (direct workload cost, no shared overhead). Match this to how you decided the budget.
- Notify at thresholds — pick any subset of `50%`, `80%`, `90%`, `100%`. At least one required. The rule fires once when each selected threshold is first crossed in the period.
- Rollover — currently `fresh` (a clean reset at the start of each period). The field is stored for future extension.
Period-to-date cost is recomputed on every snapshot, so a threshold crossing shows up within minutes of happening.
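The "fires once per threshold per period" behavior can be sketched as follows. This is an illustration under assumptions, not the product's implementation: the function name `newly_crossed` is made up, thresholds are whole percentages, and reaching a threshold exactly is treated as crossing it.

```python
def newly_crossed(budget: float, period_to_date: float,
                  selected: set[int], already_fired: set[int]) -> list[int]:
    """Return the selected thresholds (percent) crossed for the first time this period."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    pct_used = period_to_date / budget * 100
    return sorted(t for t in selected if pct_used >= t and t not in already_fired)


# Thresholds already fired this period; emptied when a new period starts
# (the "fresh" rollover).
fired: set[int] = set()

# Successive period-to-date costs against a 10,000 budget.
for cost in (4_000, 8_500, 9_200, 10_400):
    hits = newly_crossed(10_000, cost, {50, 80, 100}, fired)
    fired.update(hits)
    print(cost, hits)
```

In this run the second snapshot reports both 50 and 80 at once, the third reports nothing because 90 was not selected, and the last reports 100.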
Unused resources
Three fields:
- Minimum idle percentage (`min_idle_pct`) — only flag a resource if its idle fraction exceeds this. Default 80%.
- Schedule — pinned to `weekly`. Stored as a field for future extension; today the rule always runs weekly.
- Top-N cap (`top_n_cap`) — the digest lists at most this many idle items, sorted by cost descending. Default 20.
The digest delivery means you won't get pinged the moment a workload goes idle — you'll see it in the next weekly summary. Use this type to keep weekly cleanup on the team's calendar, not as a real-time alert.
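How the two tunable fields shape the digest can be sketched like this. The `Resource` class and `build_digest` function are hypothetical; only `min_idle_pct`, `top_n_cap`, and their defaults come from this page.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    idle_pct: float       # 0 to 100
    monthly_cost: float   # dollars


def build_digest(resources: list[Resource],
                 min_idle_pct: float = 80.0,
                 top_n_cap: int = 20) -> list[Resource]:
    """Keep resources whose idle percentage exceeds the minimum, costliest first."""
    idle = [r for r in resources if r.idle_pct > min_idle_pct]
    idle.sort(key=lambda r: r.monthly_cost, reverse=True)
    return idle[:top_n_cap]


sample = [
    Resource("a", idle_pct=95, monthly_cost=300),
    Resource("b", idle_pct=80, monthly_cost=900),  # exactly 80% does not exceed the minimum
    Resource("c", idle_pct=85, monthly_cost=500),
]
print([r.name for r in build_digest(sample, top_n_cap=2)])
```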
New expensive workload
Four fields:
- Minimum cost ($/month) — only flag workloads projected to cost at least this much per month. Default $750.
- Minimum age (hours) — wait this long after the workload first appears before evaluating. Default 2 hours. Lets short-lived workloads finish without firing, and lets cost projections stabilize.
- Auto-exclude CRD-owned workloads — skip anything whose top-level owner is a custom resource (operator-managed). Recommended on. This covers Argo Workflows, Spark Operator, Ray, and Kubeflow training jobs.
- Exclude owner kinds — explicit list of owner Kinds that should never trigger this rule. Defaults cover `Job`, `CronJob`, and common ML operators. Add your own platform's Kinds as needed.
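Taken together, the four fields act as a chain of filters. The sketch below is one plausible reading, not the product's evaluation code: the `Workload` class, its attribute names, and `should_flag` are hypothetical, and the default exclusion set lists only `Job` and `CronJob` because the ML operator Kinds are not enumerated on this page.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    projected_monthly_cost: float
    age_hours: float
    top_owner_kind: str
    top_owner_is_custom_resource: bool


# Partial defaults: the real list also includes common ML operator Kinds.
DEFAULT_EXCLUDED_KINDS = frozenset({"Job", "CronJob"})


def should_flag(w: Workload,
                min_cost: float = 750.0,
                min_age_hours: float = 2.0,
                auto_exclude_crd_owned: bool = True,
                exclude_owner_kinds: frozenset = DEFAULT_EXCLUDED_KINDS) -> bool:
    """Return True if the workload passes every filter and should be flagged."""
    if w.age_hours < min_age_hours:
        return False  # too young; projections may not have stabilized
    if w.projected_monthly_cost < min_cost:
        return False
    if auto_exclude_crd_owned and w.top_owner_is_custom_resource:
        return False  # operator-managed
    if w.top_owner_kind in exclude_owner_kinds:
        return False
    return True
```

For example, a three-hour-old `Deployment` projected at $1,200 per month would be flagged, while the same workload owned by a `CronJob` would not.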
Rule state
A Rule is always in one of four states. The state decides whether evaluation happens and whether incidents fire.
| State | Meaning |
|---|---|
| `active` | The rule runs on schedule and incidents fire normally. |
| `muted` | The rule still runs and records detections, but no notifications go out. Time-bound. |
| `disabled` | The rule does not run. Preserved but inert. Use during long pauses. |
| `degraded` | The rule hit a data-quality issue (missing cost data, stale cluster updates). Auto-recovers. |
Muting takes a duration (4 hours, 24 hours, 7 days, or forever-until-reactivated). Disabling is the right choice when you know you won't need the rule for weeks but want to keep its history.
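The state table can be restated as a lookup from state to "does it evaluate" and "does it notify". This is a sketch with hypothetical names (`RuleState`, `BEHAVIOR`, `sends_notifications`). The page does not say whether a degraded rule keeps evaluating or notifying, so that entry is deliberately left unspecified.

```python
from enum import Enum
from typing import Optional


class RuleState(Enum):
    ACTIVE = "active"
    MUTED = "muted"
    DISABLED = "disabled"
    DEGRADED = "degraded"


# (evaluates, notifies) for each state. None marks behavior this page
# does not spell out, so the sketch does not guess.
BEHAVIOR: dict[RuleState, tuple[Optional[bool], Optional[bool]]] = {
    RuleState.ACTIVE: (True, True),
    RuleState.MUTED: (True, False),      # still records detections
    RuleState.DISABLED: (False, False),
    RuleState.DEGRADED: (None, None),
}


def sends_notifications(state: RuleState) -> bool:
    """True only when the state is known to let notifications out."""
    return BEHAVIOR[state][1] is True
```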
Previewing a rule before saving
The rule editor previews a draft rule against the last 30 days of data and shows:
- Notification volume — `low`, `medium`, or `high` bucket based on how many times the rule would have fired in the preview window.
- Warning — flags noisy rules (`noisy`), rules that would have fired so often they should be blocked (`block_noisy`), budgets that look implausibly small (`tiny_budget`), or near-duplicates of existing rules (`duplicate_rule`).
The preview is a guide, not a forecast. A change in your cluster's behavior tomorrow won't be in the preview window today.
Editing a rule
Most fields can be changed in place. Two cannot:
- Scope kind, scope ID — fixed at creation. Retargeting requires a new rule.
- Enabled-types schema — you can flip a slot on or off, but the schema of each slot's conditions is dictated by the alert type.
Changing the policy, the conditions of an enabled type, or the rule's state takes effect on the next scheduled evaluation.
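The fixed-scope constraint amounts to a guard on edits. The sketch below represents a rule as a plain dictionary and uses hypothetical key names (`scope_kind`, `scope_id`, `policy_id`); it only illustrates the rule that scope cannot change in place.

```python
IMMUTABLE_FIELDS = frozenset({"scope_kind", "scope_id"})


def apply_edit(rule: dict, changes: dict) -> dict:
    """Return an updated copy of the rule, rejecting any change to its scope."""
    blocked = sorted(
        key for key in changes
        if key in IMMUTABLE_FIELDS and changes[key] != rule.get(key)
    )
    if blocked:
        raise ValueError(f"cannot change {', '.join(blocked)}; create a new rule instead")
    return {**rule, **changes}


rule = {"name": "Idle resources", "scope_kind": "namespace",
        "scope_id": "staging", "policy_id": "p-1"}
updated = apply_edit(rule, {"policy_id": "p-2"})  # allowed: the policy is editable
```

Calling `apply_edit(rule, {"scope_id": "prod"})` raises `ValueError` in this sketch, mirroring the requirement to create a new rule when retargeting.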
Deleting a rule
Deleting a rule removes it from the rule list and stops further evaluation. Existing incidents are preserved in the incidents archive so post-mortems remain accurate. To suppress the incident history too, delete the linked policy first.
What's not here
- No multi-scope rules. One rule, one scope. Use a Policy to route multiple rules to the same channels instead.
- No conditional logic between types. Enabled types fire independently — there's no "fire `cost_spike` only if `budget_threshold` also fires" wiring.
- No custom alert types. The four types in the rule editor are the four types.