What’s in it for you?
How does the world view SLOs?
How deep do they go?
How do we measure them?
- What do we leave out?
- How does what we don’t notice botch the numbers?
What should you do about it?
Slide 4
SLO
Service Level Objectives
Slide 5
SLI
Service Level Indicators
Slide 6
Service Levels
Slide 7
Availability
Slide 8
Speed
Slide 9
Correctness
Slide 10
Freshness
Slide 11
How to measure a level
Slide 12
Availability
hours
Slide 13
Speed
Mbps? ms?
Slide 14
Correctness
?
Slide 15
Freshness
?
Slide 16
SLI
Service Level Indicators
Slide 17
SLO
Service Level Objectives
Slide 18
In a given day/week/month:
SLI: X should be ___
SLO: should be > < = ___
SLA: or else ...
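One way to read the template above is as a small record: an indicator, a window, a comparison, and a target. The sketch below is illustrative only; the class name `Slo`, its field names, and the 99.9 weekly target are assumptions made for the example, not something the slides define.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Slo:
    """Hypothetical record for the 'X should be > < = ___' template."""
    sli_name: str                            # what is measured (the SLI)
    window: str                              # "day", "week" or "month"
    compare: Callable[[float, float], bool]  # the "> < =" slot
    target: float                            # the blank to fill in

    def is_met(self, measured: float) -> bool:
        # True when the measured SLI satisfies the objective.
        return self.compare(measured, self.target)


# Assumed example: weekly availability should be >= 99.9 percent.
weekly_availability = Slo("availability_pct", "week", operator.ge, 99.9)

print(weekly_availability.is_met(99.95))
print(weekly_availability.is_met(99.5))
```

The SLA part ("or else ...") is left out on purpose: it is a contractual consequence, not something the measurement itself decides.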
Slide 19
Uptime
Slide 20
Is this 3 9s?
Slide 21
Availability   Per day      Per week     Per month    Per year
99%            14.4 mins    1.7 hours    7.3 hours    3.7 days
99.9%          1.4 mins     10.1 mins    43.8 mins    8.8 hours
99.99%         8.6 secs     1.0 min      4.4 mins     52.6 mins
99.999%        0.864 secs   6.0 secs     26.3 secs    5.3 mins
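The downtime figures in the table follow from one multiplication: window length times (1 - availability). The sketch below recomputes them; the function name `downtime_budget_seconds` is made up for illustration, and the month is assumed to be one twelfth of a 365-day year, which is the choice that gives 43.8 mins at 99.9%.

```python
# Window lengths in seconds. The month is ASSUMED to be 1/12 of a
# 365-day year; other month definitions shift that column slightly.
WINDOWS = {
    "day": 24 * 3600,
    "week": 7 * 24 * 3600,
    "month": 365 * 24 * 3600 / 12,
    "year": 365 * 24 * 3600,
}


def downtime_budget_seconds(availability_pct: float, window: str) -> float:
    """Seconds of downtime allowed in `window` at the given availability."""
    return WINDOWS[window] * (1 - availability_pct / 100)


for pct in (99, 99.9, 99.99, 99.999):
    row = ", ".join(
        f"{name}: {downtime_budget_seconds(pct, name):.1f} s" for name in WINDOWS
    )
    print(f"{pct}% -> {row}")
```

For five 9s the weekly budget comes out at about 6 seconds, not minutes.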
Slide 22
Window
Slide 23
To measure uptime:
Uptime = Total time - Downtime
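As a minimal sketch of that subtraction, the helper below turns total time and downtime into an availability percentage. The name `availability_pct` and the 10-minute example are assumptions for illustration; the hard part, knowing the downtime in the first place, is exactly what the following slides question.

```python
def availability_pct(total_seconds: float, down_seconds: float) -> float:
    """Availability in percent, from uptime = total - downtime."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if not 0 <= down_seconds <= total_seconds:
        raise ValueError("down_seconds must be between 0 and total_seconds")
    up_seconds = total_seconds - down_seconds
    return 100 * up_seconds / total_seconds


WEEK_SECONDS = 7 * 24 * 3600

# Assumed example: 10 minutes of known downtime in one week.
print(round(availability_pct(WEEK_SECONDS, 10 * 60), 3))
```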
Slide 24
Downtime
Slide 25
How long was I asleep?
Try answering without external observation
Slide 26
Device SDK
Slide 27
Metrics
emission
Slide 28
99.9% per week
≈ 10 mins of downtime
= 1.4 mins/day
Slide 29
Slide 30
There are 10K devices.
Average 500 ms to reach each one
1.4 minutes to report AND fix
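A rough back-of-the-envelope check of these numbers, under one loud assumption the slide does not state: that the devices are reached one at a time. The function name `sweep_seconds` is hypothetical.

```python
DEVICES = 10_000
AVG_REACH_SECONDS = 0.5        # 500 ms per device, from the slide
DAILY_BUDGET_SECONDS = 86.4    # 99.9% of one day, about 1.4 minutes


def sweep_seconds(devices: int, reach_seconds: float) -> float:
    """Time to reach every device once, ASSUMING strictly sequential contact."""
    return devices * reach_seconds


sequential = sweep_seconds(DEVICES, AVG_REACH_SECONDS)
over_budget = sequential / DAILY_BUDGET_SECONDS

print(f"one sweep: {sequential / 60:.1f} minutes")
print(f"that is {over_budget:.1f}x the daily downtime budget")
```

Under that assumption a single pass over the fleet takes over an hour, so the daily budget is spent many times over before anything is fixed. Contacting devices in parallel changes the arithmetic but not the point.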
Slide 31
1% of device SDKs experience an ISP fault
99.9% is 99% true
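One possible reading of "99.9% is 99% true": the reported figure only describes the 99% of devices whose SDKs could report, so the fleet-wide value is known only within bounds. The sketch below computes those bounds; the function name and the worst-case/best-case framing are assumptions for illustration, not the speaker's stated method.

```python
def true_availability_bounds(reported_pct: float, reporting_fraction: float):
    """Return (worst, best) fleet-wide availability in percent.

    worst: every silent device was down for the whole window.
    best:  every silent device was up for the whole window.
    """
    if not 0 <= reporting_fraction <= 1:
        raise ValueError("reporting_fraction must be between 0 and 1")
    seen = reported_pct * reporting_fraction
    worst = seen
    best = seen + 100 * (1 - reporting_fraction)
    return worst, best


# 99.9% reported, but 1% of device SDKs were cut off by the ISP fault.
worst, best = true_availability_bounds(99.9, 0.99)
print(f"fleet-wide availability is between {worst:.3f}% and {best:.3f}%")
```

Under these assumptions the worst case is below 99%, so the data cannot confirm that three 9s were actually met.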
Slide 32
60% of the time, it
works every time
[Anchorman]