Production Engineering

Site Reliability Engineering (SRE)

SRE blends software engineering with operations to achieve reliability at scale.
Individually selected
Flexible Schedule
Invest 20 minutes a day
This track covers SLOs and error budgets, incident response, observability/APM, chaos engineering, and capacity planning. It teaches how to define SLOs and error budgets, instrument and observe systems, run effective on-call rotations, and use chaos experiments and capacity planning to prevent outages, so that you can build resilient services while maintaining delivery speed.

Target Audience

SREs, platform/ops engineers, backend engineers, tech leads, engineering managers, and incident commanders.

Domains in this track

Chaos Engineering
