Domain 1: SLOs & Error Budgets

Service Level Objectives and Error Budget Management

SRE Bot | Foundations | Max 30 Points

0-6 | Ad-hoc
7-12 | Foundational
13-18 | Standardized
19-24 | Advanced
25-30 | Optimized

Scoring Criteria by Level

Level | Criteria
1 | No formal SLOs; availability discussed informally; no error budgets
2 | Basic SLOs for some services; not consistently tracked; no budget enforcement
3 | SLOs for critical services; error budgets calculated; basic burn rate monitoring
4 | Comprehensive SLOs; budgets enforced; dev slowdowns when budget exhausted
5 | SLOs drive all decisions; multi-window burn rates; automated freezes
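
The Level 3 to Level 5 criteria refer to error budgets and burn rates without defining them. The sketch below shows one common way to compute them, assuming a request-based SLI; the names `Window`, `burn_rate`, and `should_alert`, and the threshold passed in, are illustrative and not part of this assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Request counts observed over one lookback window (hypothetical type)."""
    total: int
    failed: int


def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail, e.g. about 0.001 for a 99.9% SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def burn_rate(window: Window, slo_target: float) -> float:
    """Observed error rate divided by the budgeted error rate.

    A value of 1.0 means the budget is being spent exactly on schedule;
    higher values mean it will run out before the SLO period ends.
    """
    if window.total == 0:
        return 0.0  # no traffic, so nothing was burned
    observed_error_rate = window.failed / window.total
    return observed_error_rate / error_budget(slo_target)


def should_alert(long_w: Window, short_w: Window,
                 slo_target: float, threshold: float) -> bool:
    """Multi-window check: fire only if both windows exceed the threshold.

    The long window shows the burn is significant; the short window shows
    it is still happening, which lets the alert clear soon after recovery.
    """
    return (burn_rate(long_w, slo_target) > threshold
            and burn_rate(short_w, slo_target) > threshold)
```

For example, `should_alert(Window(100_000, 2_000), Window(10_000, 300), 0.999, 14.4)` fires because both windows burn well above 14.4 times the budgeted rate. The threshold and window lengths are policy choices each team has to set for itself.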

Assessment Questions

# | Question | Max
1 | How well-defined are your SLIs? | 6
2 | How do you track and enforce error budgets? | 6
3 | How aligned are stakeholders on SLO targets? | 6
4 | What happens when the error budget is exhausted? | 6
5 | How do you review and iterate on SLOs? | 6
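
The five question scores sum to the domain total (maximum 30), which maps onto the score bands listed at the top of this section. A minimal sketch of that arithmetic follows, assuming each question is scored as a whole number from 0 to 6; the function name `maturity_level` is hypothetical.

```python
MAX_PER_QUESTION = 6
QUESTION_COUNT = 5

# (inclusive upper bound, label) taken from the score bands above
BANDS = [
    (6, "Ad-hoc"),
    (12, "Foundational"),
    (18, "Standardized"),
    (24, "Advanced"),
    (30, "Optimized"),
]


def maturity_level(question_scores):
    """Return (total, band label) for the five assessment question scores."""
    if len(question_scores) != QUESTION_COUNT:
        raise ValueError(f"expected {QUESTION_COUNT} scores")
    for score in question_scores:
        if not 0 <= score <= MAX_PER_QUESTION:
            raise ValueError(f"each score must be between 0 and {MAX_PER_QUESTION}")
    total = sum(question_scores)
    # Bands are ordered, so the first upper bound that fits is the right one.
    for upper, label in BANDS:
        if total <= upper:
            return total, label
    raise AssertionError("unreachable: total cannot exceed 30")
```

Calling `maturity_level([3, 3, 3, 3, 3])` gives a total of 15, which falls in the Standardized band.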

Focus Areas

  • SLI Definition: User-journey-based indicators with a clear measurement method
  • SLO Targets: Realistic, stakeholder-aligned availability goals
  • Error Budget Policy: Clear consequences for budget violations
  • Stakeholder Alignment: Business and engineering co-ownership
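
As an illustration of a user-journey-based SLI with a clear measurement, the sketch below computes the ratio of good events to all valid events for a single journey. The `JourneyEvent` type, the "no server error and fast enough" definition of good, and the 500 ms default threshold are all assumptions chosen for the example, not values prescribed by this assessment.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class JourneyEvent:
    """One completed attempt at a user journey, e.g. a checkout (hypothetical)."""
    status_code: int
    latency_ms: float


def is_good(event: JourneyEvent, latency_threshold_ms: float = 500.0) -> bool:
    """Assumed definition of 'good': no server error and within the latency threshold."""
    return event.status_code < 500 and event.latency_ms <= latency_threshold_ms


def sli(events: Iterable[JourneyEvent],
        latency_threshold_ms: float = 500.0) -> Optional[float]:
    """Good events divided by valid events; None when there is no traffic."""
    events = list(events)
    if not events:
        return None
    good = sum(1 for e in events if is_good(e, latency_threshold_ms))
    return good / len(events)
```

Returning `None` for an empty window keeps "no data" distinct from "0% good", which matters when the result is later compared against an SLO target.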

Anti-Patterns (Red Flags)

  • Setting 100% availability targets (unattainable in practice, and each added nine costs more)
  • SLOs without error budget policies
  • Engineering-only SLOs, no business alignment
  • No consequence for budget violations
  • Static SLOs that never evolve
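
To see why a 100% target is a red flag, it helps to convert availability targets into allowed downtime. The sketch below does that for a time-based SLO, assuming a 30-day window by default; the function name is illustrative.

```python
def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of downtime the SLO permits over the window.

    slo_target is a fraction, e.g. 0.999 for 99.9% availability.
    """
    if not 0.0 <= slo_target <= 1.0:
        raise ValueError("slo_target must be between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes
```

Under these assumptions a 99.9% target permits roughly 43.2 minutes of downtime per 30 days, while a 100% target permits none. With no budget at all, every deployment or maintenance action risks an SLO violation.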

Evidence Checklist

  • SLO documentation exists and is up-to-date
  • Error budget dashboards visible to stakeholders
  • Historical SLO compliance data available
  • Error budget policy with escalation process
  • Evidence of SLO-driven prioritization decisions

Related Domains

Domain | Relationship
Observability | SLIs require metrics/logs infrastructure
Alerting | Burn rate alerts drive incident response
Release Eng | Error budgets gate feature releases
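
The link to release engineering can be made concrete with a small gate that checks the remaining error budget before a release. This is a sketch under stated assumptions: the 25% slowdown threshold and the three outcomes ("proceed", "slowdown", "freeze") are hypothetical policy choices, and an actual error budget policy would define its own.

```python
def budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Fraction of the error budget left for the SLO period so far.

    1.0 means untouched, 0.0 means fully spent, negative means overspent.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total == 0:
        return 1.0  # no traffic yet, so no budget has been consumed
    allowed_failures = (1.0 - slo_target) * total
    return 1.0 - failed / allowed_failures


def release_decision(remaining: float, slowdown_below: float = 0.25) -> str:
    """Map remaining budget to a release action (assumed three-tier policy)."""
    if remaining <= 0.0:
        return "freeze"    # budget exhausted: only reliability work ships
    if remaining < slowdown_below:
        return "slowdown"  # budget nearly gone: reduce release risk
    return "proceed"
```

For example, a service with a 99.9% SLO that has served 1,000,000 requests with 500 failures has about half its budget left, so `release_decision` returns "proceed".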

Error Budgets Enable Velocity

Managed risk, not zero risk.