Slide 1

Slide 1 text

WHEN WORST IS BEST in distributed systems design
Peter Bailis, Stanford CS, @pbailis
StrangeLoop 2015, 25 September, St. Louis

Slide 2

Slide 2 text

What if we designed computer systems for worst case scenarios?

Slide 3

Slide 3 text

What if we designed computer systems for worst case scenarios?
• Cluster provisioning: 7.3B simultaneous users. Many idle resources!

Slide 4

Slide 4 text

What if we designed computer systems for worst case scenarios?
• Cluster provisioning: 7.3B simultaneous users. Many idle resources!
• Hardware: chips for the next Mars rover. Hugely expensive packaging!

Slide 5

Slide 5 text

What if we designed computer systems for worst case scenarios?
• Cluster provisioning: 7.3B simultaneous users. Many idle resources!
• Hardware: chips for the next Mars rover. Hugely expensive packaging!
• Security: all our developers are malicious. Expensive code deployment!

Slide 6

Slide 6 text

Designing for the worst case often penalizes the average case

Slide 7

Slide 7 text

Designing for the worst case often penalizes the average case
[Diagram axes: average case performance, worst case performance]

Slide 8

Slide 8 text

Designing for the worst case often penalizes the average case
[Diagram axes: average case performance, worst case performance; annotation: ??? (this talk)]

Slide 9

Slide 9 text

This talk: When can designing for the worst case improve the average case?
Structure:
• Distributed systems and the network
• Beyond the network
• Lessons

Slide 11

Slide 11 text

Distributed Systems Matter
• Almost every non-trivial application today is (or is becoming) distributed
• Distribution happens over a network

Slide 12

Slide 12 text

Distributed Systems Matter
• Almost every non-trivial application today is (or is becoming) distributed
• Distribution happens over a network
• Corollary: Almost every non-trivial application today needs to worry about the network

Slide 13

Slide 13 text

Networks make design hard Many things can go wrong:

Slide 14

Slide 14 text

Networks make design hard
Many things can go wrong:
• Packets may be delayed
• Packets may be dropped
Sometimes called an asynchronous network

Slide 15

Slide 15 text

Handling Worst-Case Net Behavior
Availability addresses delays, drops: any replica can respond to any request

Slide 21

Slide 21 text

Handling Worst-Case Net Behavior
Availability addresses delays, drops: any replica can respond to any request
If our system is available, then even when the network is fine, we still don’t have to talk!

Slide 22

Slide 22 text

Handling Worst-Case Net Behavior
Availability addresses delays, drops: any replica can respond to any request
If our system is available, then even when the network is fine, we still don’t have to talk!
NO COORDINATION

Slide 23

Slide 23 text

Coordination-free systems What if we don’t have to talk?

Slide 24

Slide 24 text

Coordination-free systems: 1.) Enable infinite scale-out What if we don’t have to talk?

Slide 29

Slide 29 text

[Figure: DISTRIBUTED TRANSACTIONS (EC2). Throughput (txns/s) versus number of servers (items) accessed per transaction, 1 to 7. Diagram labels A through H.]

Slide 30

Slide 30 text

[Figure: DISTRIBUTED TRANSACTIONS (EC2). Throughput (txns/s) versus number of servers (items) accessed per transaction, 1 to 7. Series: IN-MEMORY LOCKING, COORDINATED. Diagram labels A through H.]

Slide 31

Slide 31 text

[Figure: DISTRIBUTED TRANSACTIONS (EC2). Throughput (txns/s, LOG SCALE!) versus number of servers (items) accessed per transaction, 1 to 7. Series: IN-MEMORY LOCKING, COORDINATED. Annotation: -398x. Diagram labels A through H.]

Slide 32

Slide 32 text

[Figure: DISTRIBUTED TRANSACTIONS (EC2). Throughput (txns/s) versus number of servers (items) accessed per transaction, 1 to 7. Series: IN-MEMORY LOCKING, COORDINATED, COORDINATION-FREE. Annotation: -398x. Diagram labels A through H.]

Slide 33

Slide 33 text

What if we don’t have to talk?
Coordination-free systems:
1.) Enable infinite scale-out
2.) Improve throughput

Slide 34

Slide 34 text

133.7+ ms RTT

Slide 36

Slide 36 text

133.7+ ms RTT 85.1+ ms RTT

Slide 37

Slide 37 text

What if we don’t have to talk?
Coordination-free systems:
1.) Enable infinite scale-out
2.) Improve throughput
3.) Ensure low latency
4.) Guarantee “always on” response

Slide 40

Slide 40 text

But wait! What about CAP?!?!
• CAP Thm.: Famous result from Eric Brewer, Inktomi
• Takeaway (+ related results): properties like serializability require unavailability (or require coordination)
• Common (incorrect) conclusion: availability is too expensive, only matters during failures, so forget about it

Slide 41

Slide 41 text

But wait! What about CAP?!?!
• CAP Thm.: Famous result from Eric Brewer, Inktomi
• Takeaway (+ related results): properties like serializability require unavailability (or require coordination)
• Common (incorrect) conclusion: availability is too expensive, only matters during failures, so forget about it
Surprise: many useful guarantees don’t require coordination (or unavailability)!

Slide 42

Slide 42 text

“Worst” is a Design Tool
Example: Coordination-Avoiding Databases
• Legacy implementations: designed for single-node context, use coordination
• Research question: what if we built systems that didn’t have to coordinate?
• Result: new designs that avoid coordination unless strictly necessary

Slide 43

Slide 43 text

Simple Example: Read Committed
• Goal: never read from uncommitted transactions
• Legacy implementation: lock records during access
• Research question: is coordination necessary?

Slide 44

Slide 44 text

Simple Example: Read Committed
• Goal: never read from uncommitted transactions
• Legacy implementation: lock records during access
• Research question: is coordination necessary?
• Result: no! For example, buffer writes until commit
• Result: order-of-magnitude (OOM) speedups over classic implementations
(VLDB 2014, SIGMOD 2015)
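The "buffer writes until commit" idea can be sketched in a few lines. This is a minimal single-process illustration, not the implementation from the cited papers; the `Store` and `Transaction` names and their methods are hypothetical. Each transaction keeps its writes in a private buffer, so other transactions can only ever observe committed values, and no record locks are taken.

```python
class Store:
    """Hypothetical key-value store that only ever holds committed data."""

    def __init__(self):
        self.committed = {}

    def begin(self):
        return Transaction(self)


class Transaction:
    """Hypothetical transaction that buffers writes until commit."""

    def __init__(self, store):
        self.store = store
        self.buffer = {}  # uncommitted writes, invisible to other transactions

    def write(self, key, value):
        # No lock is taken: the write stays private to this transaction.
        self.buffer[key] = value

    def read(self, key):
        # A transaction sees its own writes, otherwise committed data only.
        if key in self.buffer:
            return self.buffer[key]
        return self.store.committed.get(key)

    def commit(self):
        # Writes become visible to others only at this point.
        self.store.committed.update(self.buffer)
        self.buffer = {}

    def abort(self):
        # Nothing to undo in the store: buffered writes were never exposed.
        self.buffer = {}


store = Store()
t1 = store.begin()
t2 = store.begin()
t1.write("x", 1)
before = t2.read("x")  # t1 has not committed, so t2 sees no value
t1.commit()
after = t2.read("x")   # now t2 sees the committed write
```

Read Committed permits `t2` to see different values before and after `t1` commits; the only requirement shown here is that it never sees an uncommitted one.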

Slide 45

Slide 45 text

What if we don’t have to talk?
Coordination-free systems:
1.) Enable infinite scale-out
2.) Improve throughput
3.) Ensure low latency
4.) Guarantee “always on” response

Slide 47

Slide 47 text

What if we don’t have to talk?
Coordination-free systems:
1.) Enable infinite scale-out
2.) Improve throughput
3.) Ensure low latency
4.) Guarantee “always on” response
Accounting for worst case improves average case

Slide 48

Slide 48 text

Punchline: Distributed Systems & Networks
• Systems that behave well during network faults can behave better in non-faulty environments too
• With good designs, popular guarantees from today’s RDBMSs can benefit! (see also Martin’s talk, 11AM Sat)
• Research on coordination-avoiding systems highlights potential for huge speedups (see bailis.org)
• Keywords: CRDTs, I-confluence, RAMP, HAT, Bloom^L
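To make the CRDT keyword concrete, here is a minimal sketch of a grow-only counter, one of the simplest CRDTs. It is an illustration rather than code from the talk; the `GCounter` class and its method names are assumptions. Each replica increments only its own slot, and merging takes the per-replica maximum, so replicas can accept updates without coordinating and still converge.

```python
class GCounter:
    """Grow-only counter CRDT: one non-decreasing slot per replica."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments observed from that replica

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        # A replica only ever writes its own slot, so no coordination is needed.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Per-slot max is commutative, associative, and idempotent, so merges
        # can arrive late, out of order, or more than once.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


a = GCounter("a")
b = GCounter("b")
a.increment(2)  # handled locally by replica a
b.increment(3)  # handled locally by replica b
a.merge(b)
b.merge(a)
a.merge(b)      # a repeated merge changes nothing
```

After the exchange both replicas report the same total, even though neither waited on the other to accept an increment.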

Slide 49

Slide 49 text

This talk: When can designing for the worst case improve the average case?
Structure:
• Distributed systems and the network
• Beyond the network
• Lessons

Slide 50

Slide 50 text

Replication helps Capacity
Replication for fault tolerance can increase request capacity

Slide 51

Slide 51 text

Fail-over helps (Dev)Ops

Slide 53

Slide 53 text

Fail-over helps (Dev)Ops
If services can auto-fail-over… can kill processes:
• to perform upgrades
• to manage stragglers
• to revoke resources

Slide 54

Slide 54 text

99.9th %ile latency: 100ms avg latency: 1.2ms YOUR SERVICE HERE Tail Latency in (Micro)services

Slide 55

Slide 55 text

99.9th %ile latency: 100ms avg latency: 1.2ms YOUR SERVICE HERE 10ms Tail Latency in (Micro)services

Slide 56

Slide 56 text

Tail Latency in (Micro)services
YOUR SERVICE HERE: avg latency: 1.2ms, 99.9th %ile latency: 100ms
With 99.9th %ile latency cut to 10ms: avg latency 1.09ms

Slide 57

Slide 57 text

99.9th %ile latency: 100ms Tail Latency in (Micro)services

Slide 58

Slide 58 text

front-end avg. latency: 64ms at 100x fan-out, 99.9th %ile latency: 100ms Tail Latency in (Micro)services

Slide 59

Slide 59 text

front-end avg. latency: 64ms at 100x fan-out, 99.9th %ile latency: 100ms 10ms Tail Latency in (Micro)services

Slide 60

Slide 60 text

Tail Latency in (Micro)services
At 100x fan-out, with 99.9th %ile latency 100ms: front-end avg. latency 64ms
With 99.9th %ile latency cut to 10ms: front-end avg. latency 6.7ms
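A short calculation shows why fan-out makes the tail matter. The sketch below assumes the front-end waits for every backend call and that backend latencies are independent; both are simplifying assumptions, and the function name `prob_any_slow` is hypothetical. It does not reproduce the front-end averages on the slide, which depend on the full latency distribution.

```python
def prob_any_slow(fan_out, percentile):
    """Probability that at least one of `fan_out` independent calls
    is slower than the given latency percentile (e.g. 0.999)."""
    return 1.0 - percentile ** fan_out


single_call = prob_any_slow(1, 0.999)   # one call: the tail is a 1-in-1000 event
fanned_out = prob_any_slow(100, 0.999)  # 100 calls: roughly a 1-in-10 event
print(f"1 call: {single_call:.4f}, 100 calls: {fanned_out:.4f}")
```

Under these assumptions, an event that a single backend sees once per thousand requests shows up in close to one in ten front-end requests.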

Slide 61

Slide 61 text

YOUR SERVICE’S CORNER CASE MAY BE ITS CONSUMER’S AVERAGE CASE

Slide 62

Slide 62 text

Universal Design

Slide 65

Slide 65 text

There is also a strong business case for accessibility. Accessibility overlaps with other best practices such as mobile web design, device independence, multi-modal interaction, usability, design for older users, and search engine optimization (SEO). Case studies show that accessible websites have better search results, reduced maintenance costs, and increased audience reach, among other benefits.

Slide 66

Slide 66 text

When “Best” Is Brittle
[Plot: idealized function f(x) with its optimum marked]

Slide 67

Slide 67 text

When “Best” Is Brittle
[Plot: idealized function f(x) with its optimum marked]
[Plot: less well-behaved function f(x) with its optimum marked]

Slide 68

Slide 68 text

When “Best” Is Brittle
[Plot: idealized function f(x) with its optimum marked]
[Plot: less well-behaved function f(x) with its optimum marked; annotation: missed the target]

Slide 69

Slide 69 text

When “Best” Is Brittle
[Plot: idealized function f(x) with its optimum marked]
[Plot: less well-behaved function f(x) with its optimum marked; annotations: missed the target, “stable” solution]

Slide 70

Slide 70 text

When “Best” Is Brittle
[Plot: idealized function f(x) with its optimum marked]
[Plot: less well-behaved function f(x) with its optimum marked; annotations: missed the target, “stable” solution]
Robust Optimization studies finding the stable solution
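The contrast between the brittle optimum and the stable solution can be sketched numerically. The function below is invented for illustration (a tall, narrow peak near x = 2 and a lower, broad one near x = 6), and `nominal_optimum`, `robust_optimum`, and the perturbation size `delta` are hypothetical names and values. The robust choice maximizes the worst value of f within ±delta of the chosen x, which is one simple formulation of robust optimization.

```python
import math


def f(x):
    # Illustrative objective: a tall, narrow peak plus a lower, broad one.
    narrow = 1.0 * math.exp(-((x - 2.0) ** 2) / (2 * 0.05 ** 2))
    broad = 0.8 * math.exp(-((x - 6.0) ** 2) / (2 * 1.0 ** 2))
    return narrow + broad


def nominal_optimum(xs):
    # Best point if we hit x exactly.
    return max(xs, key=f)


def robust_optimum(xs, delta, samples=21):
    # Best point if the achieved x may be off by up to delta in either direction.
    def worst_case(x):
        return min(
            f(x - delta + 2 * delta * i / (samples - 1)) for i in range(samples)
        )

    return max(xs, key=worst_case)


xs = [i / 100 for i in range(0, 1001)]  # candidate grid on [0, 10]
best = nominal_optimum(xs)              # lands on the narrow peak near x = 2
stable = robust_optimum(xs, delta=0.5)  # lands on the broad peak near x = 6
print(f"nominal: x={best:.2f}, robust: x={stable:.2f}")
```

With this function, missing the nominal optimum by half a unit drops f to nearly zero, while the same miss around the stable solution costs only a small fraction of its value.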

Slide 71

Slide 71 text

This talk: When can designing for the worst case improve the average case?
Structure:
• Distributed systems and the network
• Beyond the network
• Lessons

Slide 72

Slide 72 text

This talk: When can designing for the worst case improve the average case?

Slide 73

Slide 73 text

This talk: When can designing for the worst case improve the average case?
When does this apply?
• When corner cases are common
• When environmental conditions are variable
• When “normal” isn’t what we think

Slide 74

Slide 74 text

DEFINING “NORMAL” DEFINES OUR DESIGNS

Slide 75

Slide 75 text

“Worst” raises tough questions

Slide 76

Slide 76 text

“Worst” raises tough questions
• Cluster provisioning: what’s our scale-out strategy?

Slide 77

Slide 77 text

“Worst” raises tough questions
• Cluster provisioning: what’s our scale-out strategy?
• Hardware: what happens during bit flips? Do we need ECC?

Slide 78

Slide 78 text

“Worst” raises tough questions
• Cluster provisioning: what’s our scale-out strategy?
• Hardware: what happens during bit flips? Do we need ECC?
• Security: how do we manage internal data accesses?

Slide 79

Slide 79 text

EXAMINE YOUR BIASES

Slide 80

Slide 80 text

Reasoning about worst-case scenarios can be a powerful design tool
• Key to coordination-avoiding distributed systems designs
• Can often improve performance and robustness, also combat bias
@PBAILIS // bailis.org

Slide 81

Slide 81 text

Special thanks to David Andersen, Ali Ghodsi, Joe Hellerstein, Eddie Kohler, Phil Levis, Alex Miller, Oscar Moll, Barzan Mozafari, Ion Stoica, Eugene Wu, Jean Yang, Matei Zaharia

Slide 82

Slide 82 text

Reasoning about worst-case scenarios can be a powerful design tool
• Key to coordination-avoiding distributed systems designs
• Can often improve performance and robustness, also combat bias
@PBAILIS // bailis.org