Domain 3: Alerting Strategy | SRE Maturity Rubric

Total Score	Maturity Level
0-6	Ad-hoc
7-12	Foundational
13-18	Standardized
19-24	Advanced
25-30	Optimized

Scoring Criteria by Level

Level	Criteria
1	Few alerts; mostly noisy; no runbooks; alert fatigue common
2	Basic alerts exist; high noise ratio; some documentation
3	SLO-based alerts; runbooks linked; regular tuning
4	Multi-window burn-rate alerts; <5% noise; automated tuning
5	Self-healing alerts; ML-based anomaly detection; proactive alerting

Assessment Questions

#	Question	Max Score
1	What % of alerts are actionable?	6
2	How are alerts linked to runbooks?	6
3	How do you tune alert thresholds?	6
4	Do alerts correlate with SLO burn rates?	6
5	How do you manage alert escalation?	6
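
The five questions, each scored out of 6, sum to a total between 0 and 30, which the score bands at the top of this rubric map to a maturity level. A minimal sketch of that tally follows. The function name `maturity_level` and the requirement that scores be whole numbers are assumptions made for illustration; the rubric itself does not prescribe a tool.

```python
# Score bands as listed in the rubric: (lowest total, highest total, level name).
BANDS = [
    (0, 6, "Ad-hoc"),
    (7, 12, "Foundational"),
    (13, 18, "Standardized"),
    (19, 24, "Advanced"),
    (25, 30, "Optimized"),
]
MAX_PER_QUESTION = 6
NUM_QUESTIONS = 5


def maturity_level(question_scores):
    """Return (total, level name) for five integer question scores (hypothetical helper)."""
    if len(question_scores) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} scores, got {len(question_scores)}")
    for score in question_scores:
        # Assumption: scores are whole numbers, so every total lands inside a band.
        if not isinstance(score, int) or not 0 <= score <= MAX_PER_QUESTION:
            raise ValueError(f"each score must be an integer from 0 to {MAX_PER_QUESTION}")
    total = sum(question_scores)
    for low, high, name in BANDS:
        if low <= total <= high:
            return total, name
    raise ValueError(f"total {total} is outside the rubric range")


if __name__ == "__main__":
    total, level = maturity_level([3, 4, 2, 5, 3])
    print(f"{total}/30: {level}")
```
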

Focus Areas

Actionability: Every alert should have a clear action
SLO-Based: Alert on error budget burn, not static thresholds
Runbooks: Documented response procedures
Tuning: Regular noise reduction reviews
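
To make the SLO-Based focus area concrete: burn rate is the observed error ratio divided by the error budget (1 minus the SLO target), and a multi-window alert pages only when both a long and a short window are burning fast. The sketch below assumes a threshold of 14.4, a commonly cited value that corresponds to spending 2% of a 30-day budget in one hour; that number, the window pairing, and the function names are illustrative assumptions, not part of this rubric.

```python
def burn_rate(error_ratio, slo_target):
    """Error ratio relative to the error budget; 1.0 means the budget lasts exactly the SLO period."""
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0 to leave an error budget")
    return error_ratio / budget


def should_page(long_window_error_ratio, short_window_error_ratio,
                slo_target, threshold=14.4):
    """Page only if both windows exceed the threshold (hypothetical helper).

    The long window (for example 1 hour) shows the burn is significant;
    the short window (for example 5 minutes) shows it is still happening,
    so the alert clears soon after recovery.
    """
    long_burn = burn_rate(long_window_error_ratio, slo_target)
    short_burn = burn_rate(short_window_error_ratio, slo_target)
    return long_burn >= threshold and short_burn >= threshold


if __name__ == "__main__":
    # 99.9% SLO, 2% errors over the long window, 3% over the short window.
    print(should_page(0.02, 0.03, slo_target=0.999))
    # Same long window, but the short window has already recovered.
    print(should_page(0.02, 0.001, slo_target=0.999))
```

Requiring both windows is what keeps a resolved spike from paging anyone: the long window alone would keep firing well after the errors have stopped.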

Anti-Patterns (Red Flags)

Alerting on causes instead of symptoms
>20% non-actionable alerts
No runbooks or outdated runbooks
Alert storms during incidents
Alerts ignored due to fatigue

Evidence Checklist

Alert actionability metrics tracked
Runbooks exist for all critical alerts
Alert noise ratio <20%
Multi-window burn rate alerts configured
Regular alert review cadence
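
Two of the checklist items, tracked actionability metrics and a noise ratio below 20%, can be checked from a simple log of fired alerts. The sketch below assumes each alert is recorded with a flag saying whether the responder had to act; the `AlertRecord` type, its fields, and the example alert names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

NOISE_TARGET = 0.20  # checklist target: noise ratio below 20%


@dataclass
class AlertRecord:
    name: str         # alert rule name (hypothetical field)
    actionable: bool  # True if the responder had to take action


def noise_ratio(records):
    """Fraction of fired alerts that were not actionable; 0.0 when nothing fired."""
    if not records:
        return 0.0
    noisy = sum(1 for r in records if not r.actionable)
    return noisy / len(records)


def meets_noise_target(records):
    """True if the noise ratio is under the checklist target."""
    return noise_ratio(records) < NOISE_TARGET


def noisiest_alerts(records, top_n=3):
    """Alert names ranked by non-actionable firings, as input for a review session."""
    counts = Counter(r.name for r in records if not r.actionable)
    return counts.most_common(top_n)


if __name__ == "__main__":
    history = (
        [AlertRecord("HighCPU", False)] * 3
        + [AlertRecord("CheckoutErrorBudgetBurn", True)] * 7
    )
    print(f"noise ratio: {noise_ratio(history):.0%}")
    print("meets target:", meets_noise_target(history))
    print("review first:", noisiest_alerts(history, top_n=1))
```
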

Related Domains

Domain	Relationship
SLOs	Burn rate alerts derive from SLOs
Observability	Alerts query observability data
On-Call	Alert quality affects on-call health

Alert on Symptoms

Every page should require human action.