Circuit Breakers, Retries, Timeouts, Bulkheads
SRE Bot | Resilience | Max 30 Points

| Level | Criteria |
|---|---|
| 1 | No defensive patterns; cascading failures common |
| 2 | Basic timeouts in some services; retry logic ad-hoc |
| 3 | Circuit breakers for critical paths; standardized timeouts |
| 4 | Bulkheads, load shedding; graceful degradation |
| 5 | Adaptive patterns; self-healing; antifragile design |

| # | Question | Max |
|---|---|---|
| 1 | How do you implement circuit breakers? | 6 |
| 2 | How standardized are timeouts/retries? | 6 |
| 3 | Do you use bulkhead isolation? | 6 |
| 4 | How do you handle graceful degradation? | 6 |
| 5 | How do you prevent cascading failures? | 6 |
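
As a reference point for question 1, the following is a minimal sketch of a circuit breaker, not a production implementation. The names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `reset_timeout` are illustrative assumptions, and the thresholds are placeholder values. The breaker fails fast while open, lets one trial call through after the reset timeout (half-open), and closes again on success.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    """Hypothetical single-threaded breaker; values are illustrative."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        state = self.state
        if state == "open":
            raise CircuitOpenError("circuit open; failing fast")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call in half-open, or too many failures while
            # closed, opens (or re-opens) the circuit.
            if state == "half-open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A real deployment would typically add thread safety, per-dependency instances, and metrics on state transitions.
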

| Domain | Relationship |
|---|---|
| Chaos Engineering | Test patterns via chaos experiments |
| Dependencies | Patterns protect from dependency failures |
| Capacity | Load shedding prevents overload |
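
The bulkhead and load-shedding ideas from Level 4 and the Capacity row can be sketched together as a concurrency cap that rejects excess work instead of queueing it. This is a minimal illustration under stated assumptions: `Bulkhead`, `BulkheadFullError`, and `max_concurrent` are hypothetical names, and a real system would usually keep one bulkhead per downstream dependency.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when the compartment is at capacity and the call is shed."""


class Bulkhead:
    """Hypothetical bulkhead: caps concurrent calls into one dependency."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Non-blocking acquire: when every slot is busy, shed the request
        # immediately rather than letting callers pile up.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("bulkhead full; shedding request")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()
```

Rejecting immediately keeps a slow dependency from consuming every worker thread, which is the isolation property the bulkhead is meant to provide.
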
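
Design for Failure
Assume every dependency, network call, and host will eventually fail, and design each service to contain that failure rather than propagate it.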
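
One way to act on this principle, touching questions 2 and 4, is to bound retries with jittered exponential backoff and fall back to a degraded response when they are exhausted. The sketch below is illustrative only: `retry_with_backoff`, `with_fallback`, and all default values are assumptions, and the wrapped function is assumed to enforce its own timeout and raise `TimeoutError` when it expires.

```python
import random
import time


def retry_with_backoff(func, attempts=3, base_delay=0.1, max_delay=2.0,
                       retry_on=(TimeoutError, ConnectionError),
                       sleep=time.sleep):
    """Call func, retrying transient errors a bounded number of times."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # retry budget exhausted; surface the error
            # Full jitter: a random delay up to the capped exponential
            # bound, so many clients do not retry in lockstep.
            bound = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, bound))


def with_fallback(func, fallback):
    """Return fallback (for example, a cached value) if func fails."""
    try:
        return func()
    except Exception:
        return fallback
```

For example, `with_fallback(lambda: retry_with_backoff(fetch_profile), cached_profile)` would serve a cached value once retries are exhausted, where `fetch_profile` and `cached_profile` are hypothetical names.
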