Domain 9: Toil & Automation

Toil Reduction, Self-Service, Infrastructure as Code

SRE Bot | Release | Max 30 Points

0-6      Ad-hoc
7-12     Foundational
13-18    Standardized
19-24    Advanced
25-30    Optimized

Scoring Criteria by Level

Level  Criteria
1      >50% toil; manual everything; ticket-driven ops
2      Some automation; toil not measured; ad-hoc scripts
3      Toil <50%; IaC for infra; some self-service
4      Toil <30%; full IaC; developer self-service
5      Toil minimal; platform engineering; autonomous ops

Assessment Questions

#  Question                                     Max
1  What % of time is spent on toil?             6
2  How mature is your IaC?                      6
3  What self-service exists for developers?     6
4  How do you track/prioritize toil reduction?  6
5  How automated are routine operations?        6
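
The five questions (max 6 points each) sum to the 30-point domain total, which maps onto the score bands listed at the top. A minimal sketch of that mapping follows. It assumes each question is scored as a whole number from 0 to 6, which the scorecard does not state explicitly; the names `BANDS` and `maturity_level` are illustrative only.

```python
from typing import Sequence, Tuple

# Score bands copied from the top of this domain: (low, high, label).
BANDS = [
    (0, 6, "Ad-hoc"),
    (7, 12, "Foundational"),
    (13, 18, "Standardized"),
    (19, 24, "Advanced"),
    (25, 30, "Optimized"),
]
MAX_PER_QUESTION = 6  # from the "Max" column
NUM_QUESTIONS = 5     # five assessment questions


def maturity_level(scores: Sequence[int]) -> Tuple[int, str]:
    """Return (total, band label) for one set of question scores.

    Assumption: scores are integers in the range 0..6.
    """
    if len(scores) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} scores, got {len(scores)}")
    for s in scores:
        if not 0 <= s <= MAX_PER_QUESTION:
            raise ValueError(f"score {s} is outside 0..{MAX_PER_QUESTION}")
    total = sum(scores)
    for low, high, label in BANDS:
        if low <= total <= high:
            return total, label
    # The bands cover 0..30 without gaps, so this should not be reached.
    raise AssertionError("total fell outside all bands")
```

For example, `maturity_level([3, 4, 2, 3, 4])` totals 16 points, which falls in the 13-18 (Standardized) band.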

Focus Areas

  • Toil: Manual, repetitive, automatable work
  • IaC: Infrastructure defined in code (Terraform, Pulumi)
  • Self-Service: Developer portals, golden paths
  • Automation: Script → tool → platform progression

Anti-Patterns (Red Flags)

  • Tickets for everything (ops as bottleneck)
  • ClickOps in production
  • Undocumented tribal knowledge
  • SRE team treated as a ticket queue
  • No time allocated for automation

Evidence Checklist

  • Toil % tracked (<50% target)
  • Infrastructure managed via IaC
  • Self-service portal for common tasks
  • Automation backlog exists
  • Time explicitly allocated for automation
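
One way to produce the "Toil % tracked" evidence is to tag logged work as toil or not and compute the share of hours. The sketch below is a minimal illustration under stated assumptions: the `TimeEntry` record and its `is_toil` flag are hypothetical, and how a team classifies work as toil is left to them. Only the 50% target comes from this scorecard.

```python
from dataclasses import dataclass
from typing import Iterable

TOIL_TARGET = 0.50  # the "<50% target" from the checklist


@dataclass
class TimeEntry:
    """One logged block of work (hypothetical record shape)."""
    description: str
    hours: float
    is_toil: bool  # manual, repetitive, automatable work


def toil_fraction(entries: Iterable[TimeEntry]) -> float:
    """Fraction of logged hours that were toil (0.0 if nothing was logged)."""
    entries = list(entries)
    total = sum(e.hours for e in entries)
    if total == 0:
        return 0.0
    toil = sum(e.hours for e in entries if e.is_toil)
    return toil / total


def meets_target(entries: Iterable[TimeEntry], target: float = TOIL_TARGET) -> bool:
    """True when toil is strictly below the target fraction."""
    return toil_fraction(entries) < target


week = [
    TimeEntry("manual cert rotation", 6.0, True),
    TimeEntry("ticket-driven access requests", 6.0, True),
    TimeEntry("Terraform module work", 20.0, False),
    TimeEntry("design review", 8.0, False),
]
print(f"toil: {toil_fraction(week):.0%}, meets target: {meets_target(week)}")
```

Tracking the same number week over week shows whether the automation backlog is actually reducing toil.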

Related Domains

Domain         Relationship
On-Call        Reduce pages via automation
Release Eng    CI/CD reduces deploy toil
Documentation  Automate runbook execution

Automate the Boring

<50% toil, or push back.