
[SLI, SLO, SLA]


Tags: #SLI #SLO #SLA #Process #Service


SLI, SLO, SLA?

When building a service, we are responsible for defining baseline agreements about the service’s expected uptime and performance. This document covers the terminology we use to define those values: the Service Level Indicator (SLI), the Service Level Objective (SLO), and the Service Level Agreement (SLA).

Example

Below is an alternative example where we state (via a Datadog graph) that if the average request latency over a 1hr period exceeds one second, then we have a ‘service’ issue and the graph displays with a red background. This signifies that we’ve failed our SLA, which is defined as an average latency of less than one second.

If the average request latency over a 1hr period is greater than half a second (but does not exceed one second), then we have a ‘team’ issue and the graph displays with an orange background. This signifies that we’ve failed our SLO, which is defined as an average latency of less than half a second.

If the average request latency over a 1hr period is less than half a second, then we have no issues and the graph displays with a green background. This signifies that we’ve met our SLO, which is defined as an average latency of less than half a second.
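
The three cases above can be sketched as a small Python function. This is only an illustration of the threshold logic, not the Datadog graph configuration itself. The names `SLO_SECONDS`, `SLA_SECONDS` and `classify_latency` are hypothetical, and the handling of an average that lands exactly on a threshold is an assumption, since the text only specifies "greater than" and "less than".

```python
from statistics import mean

# Thresholds taken from the example above (values in seconds).
SLO_SECONDS = 0.5  # internal objective: a breach is a 'team' issue
SLA_SECONDS = 1.0  # external agreement: a breach is a 'service' issue


def classify_latency(latencies_seconds):
    """Return the background colour for a 1hr window of request latencies.

    Assumption: an average exactly equal to a threshold falls into the
    lower (less severe) band, because the text only says "exceeds" and
    "greater than" for the failing cases.
    """
    average = mean(latencies_seconds)
    if average > SLA_SECONDS:
        return "red"     # SLA failed
    if average > SLO_SECONDS:
        return "orange"  # SLO failed, SLA still met
    return "green"       # SLO met


if __name__ == "__main__":
    # Hypothetical latency samples (seconds) collected over one hour.
    print(classify_latency([0.2, 0.4]))
    print(classify_latency([0.6, 0.8]))
    print(classify_latency([1.2, 1.5]))
```

Checking the SLA threshold first matters: any window that fails the SLA also fails the stricter SLO, so the more severe state has to take precedence.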