Rules

The shape of an alert rule in Kubeadapt — scope, enabled types, per-type conditions, and the rule states that decide whether it fires.

You're setting up alerting for a new cluster and you need to know exactly what a Rule contains before you fill in any forms. This page lays out every field, what it controls, and how one Rule can fire on several conditions simultaneously without becoming unmanageable.

What a Rule is

A Rule is a named bundle of: a scope (where to look), an enabled-types slot collection (what to look for), a notification policy reference (where to send the alert), and a state (whether the rule is currently firing). One Rule produces zero or more incidents over its lifetime.

A Rule does not specify a channel directly. Channel routing lives on the Policy — see Policies.

The fields

A Rule has the following user-editable fields:

Name — short label, shown everywhere. "Production cost spike", "Finance budget — eu-prod", "Idle resources — staging".
Scope — kind (organization / cluster / namespace / team / department) plus the ID and a human-readable breadcrumb. The scope is fixed at create time; to retarget, create a new rule.
Enabled types — a slot per alert type (cost_spike, budget_threshold, unused_resources, new_expensive_workload). Each slot is either null (type disabled) or carries that type's conditions. At least one slot must be non-null.
Policy — the Policy that routes this rule's incidents to channels.
Notify on resolve — whether to send a resolution notification when the rule's condition clears.

Of those, the enabled types field deserves a closer look.

One rule, several types

The enabled-types collection lets one Rule fire on more than one condition without duplicating the scope or the channel routing. Example: a single "Production cluster — finance" Rule can enable both cost_spike (for daily spike alerts) and budget_threshold (for monthly budget warnings), each with its own thresholds, both routed to the same Slack channel via one Policy.

Each non-null slot stores the conditions for that type. They're independent — turning off cost_spike does not clear the budget_threshold thresholds. The validation rule is "at least one non-null slot"; everything else is optional.
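
As a rough illustration of the slot model described above, here is a minimal Python sketch. The `Rule` class, its attribute names, and the `validate` helper are hypothetical and are not Kubeadapt's actual schema or API; only the four alert type names and the "at least one non-null slot" rule come from this page.

```python
from dataclasses import dataclass, field

# The four alert types named on this page.
ALERT_TYPES = (
    "cost_spike",
    "budget_threshold",
    "unused_resources",
    "new_expensive_workload",
)


@dataclass
class Rule:
    """Hypothetical in-memory shape of a rule (not the real schema)."""
    name: str
    scope_kind: str  # organization / cluster / namespace / team / department
    scope_id: str
    policy_id: str
    notify_on_resolve: bool = False
    # One slot per alert type: None means disabled, a dict holds its conditions.
    enabled_types: dict = field(
        default_factory=lambda: {t: None for t in ALERT_TYPES}
    )


def validate(rule: Rule) -> list[str]:
    """Return a list of validation errors (empty when the rule is valid)."""
    errors = []
    unknown = set(rule.enabled_types) - set(ALERT_TYPES)
    if unknown:
        errors.append(f"unknown alert types: {sorted(unknown)}")
    # An empty dict still counts as enabled (cost_spike has no fields),
    # so the check is "is None", not truthiness.
    if all(rule.enabled_types.get(t) is None for t in ALERT_TYPES):
        errors.append("at least one enabled type is required")
    return errors
```

In this sketch, setting `rule.enabled_types["cost_spike"] = None` leaves the `budget_threshold` slot untouched, which mirrors the independence of slots described above.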

Per-type conditions

Cost spike

Cost spike has no configurable fields. The threshold, the baseline floor, and the confirmation window that decide what counts as "qualifying" are all derived from the scope's spending pattern automatically — nothing to tune on the rule. The rule editor shows a banner explaining this when you enable the type. See Smart Alerting overview for the baseline math.

Budget threshold

Five fields:

Budget amount ($) — the period total you've allocated to this scope.
Period — monthly, quarterly, or fiscal year. Monthly resets on the 1st of the month, quarterly on the first day of each calendar quarter, and fiscal year on your org's fiscal year start.
Cost mode — loaded (CPU + RAM + GPU + overhead) or workload_only (direct workload cost, no shared overhead). Choose the mode that matches how the budget was calculated.
Notify at thresholds — pick any subset of 50%, 80%, 90%, 100%. At least one required. The rule fires once when each selected threshold is first crossed in the period.
Rollover — currently fresh (a clean reset at the start of each period). The field is stored for future extension.
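
The reset points for each period can be sketched as a small date calculation. This is an illustration only: the period identifiers (`"monthly"`, `"quarterly"`, `"fiscal_year"`) and the `fiscal_start_month` parameter are assumptions, and the sketch assumes the fiscal year begins on the first day of a month.

```python
from datetime import date


def period_start(today: date, period: str, fiscal_start_month: int = 1) -> date:
    """Return the first day of the budget period that contains `today`."""
    if period == "monthly":
        return today.replace(day=1)
    if period == "quarterly":
        # Calendar quarters begin in January, April, July, and October.
        first_month = 3 * ((today.month - 1) // 3) + 1
        return date(today.year, first_month, 1)
    if period == "fiscal_year":
        # Before the fiscal start month, we are still in last year's fiscal year.
        year = today.year if today.month >= fiscal_start_month else today.year - 1
        return date(year, fiscal_start_month, 1)
    raise ValueError(f"unknown period: {period!r}")
```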

Period-to-date cost is recomputed on every snapshot, so a threshold crossing shows up within minutes of the cost passing it.
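
The "fires once per threshold per period" behavior can be sketched as follows. The function name, its parameters, and the choice to treat a threshold as crossed when usage is greater than or equal to it are assumptions made for illustration, not a description of the actual evaluator.

```python
def newly_crossed(
    budget: float,
    cost_to_date: float,
    selected: set[int],
    already_fired: set[int],
) -> list[int]:
    """Return selected thresholds (in %) crossed now and not yet fired this period."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used_pct = 100.0 * cost_to_date / budget
    # Assumption: "crossed" means usage has reached or passed the threshold.
    return sorted(t for t in selected if used_pct >= t and t not in already_fired)
```

For example, with a $10,000 budget, thresholds of 50%, 80%, and 100% selected, and 50% already fired, a period-to-date cost of $8,500 yields `[80]`. In this sketch the caller would clear `already_fired` at the start of each period, matching the fresh rollover.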

Unused resources

Three fields:

Minimum idle percentage (min_idle_pct) — only flag a resource if its idle fraction exceeds this. Default 80%.
Schedule — pinned to weekly. Stored as a field for future extension; today the rule always runs weekly.
Top-N cap (top_n_cap) — the digest lists at most this many idle items, sorted by cost descending. Default 20.

Because delivery is a weekly digest, you won't be pinged the moment a workload goes idle; it appears in the next weekly summary. Use this type to keep weekly cleanup on the team's calendar, not as a real-time alert.
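
How the two tunable fields shape the digest can be sketched like this. `IdleItem` and `build_digest` are hypothetical names; only `min_idle_pct`, `top_n_cap`, their defaults, and the cost-descending order come from this page.

```python
from typing import NamedTuple


class IdleItem(NamedTuple):
    """Hypothetical record for one resource considered by the digest."""
    name: str
    idle_pct: float
    monthly_cost: float


def build_digest(
    items: list[IdleItem],
    min_idle_pct: float = 80.0,
    top_n_cap: int = 20,
) -> list[IdleItem]:
    """Keep items whose idle percentage exceeds the minimum, costliest first."""
    flagged = [item for item in items if item.idle_pct > min_idle_pct]
    flagged.sort(key=lambda item: item.monthly_cost, reverse=True)
    return flagged[:top_n_cap]
```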

New expensive workload

Four fields:

Minimum cost ($/month) — only flag workloads projected to cost at least this much per month. Default $750.
Minimum age (hours) — wait this long after the workload first appears before evaluating. Default 2 hours. Lets short-lived workloads finish without firing, and lets cost projections stabilize.
Auto-exclude CRD-owned workloads — skip anything whose top-level owner is a custom resource (operator-managed). Recommended on. Catches Argo Workflows, Spark Operator, Ray, Kubeflow training jobs.
Exclude owner kinds — explicit list of owner Kinds that should never trigger this rule. Defaults cover Job, CronJob, and common ML operators. Add your own platform's Kinds as needed.
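
Taken together, the four fields act as a filter. The sketch below is one possible reading: the `Workload` class and its attribute names are hypothetical, and the default exclusion set lists only `Job` and `CronJob` because this page does not enumerate the ML operator Kinds covered by the real defaults.

```python
from dataclasses import dataclass

# Assumption: only the two Kinds named on this page; the real defaults cover more.
DEFAULT_EXCLUDED_KINDS = frozenset({"Job", "CronJob"})


@dataclass
class Workload:
    """Hypothetical view of a workload as seen by the evaluator."""
    name: str
    projected_monthly_cost: float
    age_hours: float
    top_owner_kind: str
    top_owner_is_custom_resource: bool


def should_flag(
    w: Workload,
    min_cost: float = 750.0,
    min_age_hours: float = 2.0,
    auto_exclude_crd: bool = True,
    exclude_owner_kinds: frozenset = DEFAULT_EXCLUDED_KINDS,
) -> bool:
    """Return True when the workload passes every condition of the rule."""
    if w.age_hours < min_age_hours:
        return False  # too young: may be short-lived, projection not yet stable
    if w.projected_monthly_cost < min_cost:
        return False
    if auto_exclude_crd and w.top_owner_is_custom_resource:
        return False  # operator-managed
    if w.top_owner_kind in exclude_owner_kinds:
        return False
    return True
```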

Rule state

A Rule is always in one of four states. The state decides whether evaluation happens and whether incidents fire.

State	Meaning
`active`	The rule runs on schedule and incidents fire normally.
`muted`	The rule still runs and records detections, but no notifications go out. Time-bound unless muted until reactivated.
`disabled`	The rule does not run. Preserved but inert. Use during long pauses.
`degraded`	The rule hit a data-quality issue (missing cost data, stale cluster updates). Auto-recovers.

Muting takes a duration: 4 hours, 24 hours, 7 days, or indefinitely until you reactivate the rule. Disabling is the right choice when you know you won't need the rule for weeks but want to keep its history.
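
The state table and the mute durations can be summarized in a short sketch. The dictionary keys for the mute choices and the `mute_until` helper are hypothetical. The `degraded` state is deliberately left out because this page does not say whether a degraded rule keeps evaluating.

```python
from datetime import datetime, timedelta
from typing import Optional

# state -> (rule is evaluated, notifications are sent)
STATE_BEHAVIOR = {
    "active": (True, True),
    "muted": (True, False),      # still runs and records detections
    "disabled": (False, False),  # preserved but inert
}

# Hypothetical keys for the four mute options; None means no expiry.
MUTE_DURATIONS = {
    "4h": timedelta(hours=4),
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
    "until_reactivated": None,
}


def mute_until(now: datetime, choice: str) -> Optional[datetime]:
    """Return when a mute expires, or None if it lasts until reactivation."""
    duration = MUTE_DURATIONS[choice]
    return None if duration is None else now + duration
```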

Previewing a rule before saving

The rule editor previews a draft rule against the last 30 days of data and shows:

Notification volume — low, medium, or high bucket based on how many times the rule would have fired in the preview window.
Warning — flags noisy rules (noisy), rules that would have fired so often they should be blocked (block_noisy), budgets that look implausibly small (tiny_budget), or near-duplicates of existing rules (duplicate_rule).

The preview is a guide, not a forecast. A change in your cluster's behavior tomorrow won't be in the preview window today.

Editing a rule

Most fields can be changed in place. Two cannot:

Scope kind, scope ID — fixed at creation. Retargeting requires a new rule.
Enabled-types schema — you can flip a slot on or off, but the schema of each slot's conditions is dictated by the alert type.

Changing the policy, the conditions of an enabled type, or the rule's state takes effect on the next scheduled evaluation.
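
One way to picture the fixed-scope constraint is a guard that rejects edits to the scope fields. This is a sketch under assumed names (`scope_kind`, `scope_id`, `apply_edit`), with the rule modeled as a plain dictionary; it does not describe how the product enforces the restriction.

```python
# Fields that are fixed at creation, per the list above (names are assumed).
IMMUTABLE_FIELDS = frozenset({"scope_kind", "scope_id"})


def apply_edit(rule: dict, changes: dict) -> dict:
    """Return an updated copy of the rule, rejecting changes to the scope."""
    blocked = sorted(
        key for key in changes
        if key in IMMUTABLE_FIELDS and changes[key] != rule.get(key)
    )
    if blocked:
        raise ValueError(
            f"cannot change {', '.join(blocked)}; create a new rule to retarget"
        )
    return {**rule, **changes}
```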

Deleting a rule

Deleting a rule removes it from the rule list and stops further evaluation. Existing incidents are preserved in the incidents archive so post-mortems remain accurate. To suppress the incident history too, delete the linked policy first.

What's not here

No multi-scope rules. One rule, one scope. Use a Policy to route multiple rules to the same channels instead.
No conditional logic between types. Enabled types fire independently — there's no "fire cost_spike only if budget_threshold also fires" wiring.
No custom alert types. The four types in the rule editor are the four types.

Next steps

Policies — route a rule's incidents to one or more channels with a delivery mode and a daily cap.
Channels — configure the actual destinations.
