
When Worst is Best in Distributed Systems Design

pbailis
September 25, 2015

StrangeLoop 2015
25 September 2015
St. Louis, MO

Talk video: https://www.youtube.com/watch?v=ZGIAypUUwoQ
More information: http://bailis.org/

In many areas of systems design, provisioning for worst-case behavior (e.g., load spikes and anomalous user activity) incurs sizable penalties (e.g., performance and operational overheads) in the typical and best cases. However, in distributed systems, building software that is resilient to worst-case network behavior can, perhaps paradoxically, lead to improved behavior in typical and best-case scenarios. That is, systems that don't rely on synchronous communication (or coordination) in the worst case frequently aren't forced to wait in any case, which improves latency, scalability, and performance via increased concurrency.

In this talk, we'll explore how to use this worst-case analysis as a more general design principle for scalable systems design. As developers who increasingly interact with and build our own distributed systems, we tend to fixate only on failure scenarios (e.g., "partition tolerance" in the CAP Theorem); this is an important first step, but it's not the whole story. To illustrate why, I'll present practical lessons learned from applying this principle to both web and transaction processing applications as well as database internals such as integrity constraints and indexes. We've found considerable evidence that many of these common tasks and workloads can benefit substantially (e.g., regular order-of-magnitude speedups) from this analysis. In all likelihood, you can too.
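
The abstract's claim that replicas which never need to talk also never need to wait can be made concrete with a small sketch. The example below is not taken from the talk; it is a grow-only counter, one of the simplest CRDTs (a keyword the slides list), and the class and method names are illustrative assumptions. Each replica accepts writes and answers reads from local state, and replicas exchange state whenever the network allows.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter replica (a simple CRDT). Names are illustrative."""
    replica_id: str
    counts: dict = field(default_factory=dict)  # replica_id -> increments seen

    def increment(self, n: int = 1) -> None:
        # Purely local: no messages, no locks, no waiting on other replicas.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self) -> int:
        # Any replica can answer a read from its own state.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent, so
        # merges tolerate delayed, reordered, or duplicated messages.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment()
a.increment()
b.increment()   # both replicas accept writes without contacting each other
a.merge(b)
b.merge(a)
b.merge(a)      # a duplicated message is harmless
```

Because `merge` can run at any later time, the write path is identical whether the network is healthy or partitioned, which is the sense in which handling the worst case also removes waiting from the common case.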

Transcript

  1. WHEN WORST IS BEST in distributed systems design. Peter Bailis, Stanford CS, @pbailis. StrangeLoop 2015, 25 September, St. Louis
  2. What if we designed computer systems for worst case scenarios? Cluster provisioning: 7.3B simultaneous users; many idle resources!
  3. What if we designed computer systems for worst case scenarios? Cluster provisioning: 7.3B simultaneous users; many idle resources! Hardware: chips for the next Mars rover; hugely expensive packaging!
  4. What if we designed computer systems for worst case scenarios? Cluster provisioning: 7.3B simultaneous users; many idle resources! Hardware: chips for the next Mars rover; hugely expensive packaging! Security: all our developers are malicious; expensive code deployment!
  5. Designing for the worst case often penalizes the average case. [Plot axes: Average case performance, Worst case performance]
  6. Designing for the worst case often penalizes the average case. [Plot axes: Average case performance, Worst case performance; region marked "??? this talk"]
  7. This talk: When can designing for the worst case improve the average case? Structure: Distributed systems and the network; Beyond the network; Lessons
  9. Distributed Systems Matter. Almost every non-trivial application today is (or is becoming) distributed. Distribution happens over a network
  10. Distributed Systems Matter. Almost every non-trivial application today is (or is becoming) distributed. Distribution happens over a network. Corollary: Almost every non-trivial application today needs to worry about the network
  11. Networks make design hard. Many things can go wrong: packets may be delayed; packets may be dropped. Sometimes called an asynchronous network
  12. Handling Worst-Case Net Behavior. Availability addresses delays, drops: any replica can respond to any request
  18. Handling Worst-Case Net Behavior. Availability addresses delays, drops: any replica can respond to any request. If our system is available, then even when network is fine, we still don’t have to talk!
  19. Handling Worst-Case Net Behavior. Availability addresses delays, drops: any replica can respond to any request. If our system is available, then even when network is fine, we still don’t have to talk! NO COORDINATION
  20. [Plot: DISTRIBUTED TRANSACTIONS (EC2). x-axis: Number of Servers (Items) Accessed per Transaction, 1 to 7; y-axis: Throughput (txns/s); diagram labels A to H]
  21. [Same plot; added labels: IN-MEMORY LOCKING, COORDINATED]
  22. [Same plot; labels: IN-MEMORY LOCKING, COORDINATED; annotations: LOG SCALE! -398x]
  23. [Same plot; labels: IN-MEMORY LOCKING, COORDINATED, COORDINATION-FREE; annotation: -398x]
  24. What if we don’t have to talk? Coordination-free systems: 1.) Enable infinite scale-out 2.) Improve throughput 3.) Ensure low latency 4.) Guarantee “always on” response
  26. Coordination-free systems: 1.) Enable infinite scale-out 2.) Improve throughput 3.) Ensure low latency 4.) Guarantee “always on” response. What if we don’t have to talk?
  27. But wait! What about CAP?!?! • CAP Thm.: Famous result from Eric Brewer, Inktomi • Takeaway (+ related results): properties like serializability require unavailability (or require coordination) • Common (incorrect) conclusion: availability is too expensive, only matters during failures, so forget about it
  28. But wait! What about CAP?!?! • CAP Thm.: Famous result from Eric Brewer, Inktomi • Takeaway (+ related results): properties like serializability require unavailability (or require coordination) • Common (incorrect) conclusion: availability is too expensive, only matters during failures, so forget about it • Surprise: many useful guarantees don’t require coordination (or unavailability)!
  29. “Worst” is a Design Tool. Example: Coordination-Avoiding Databases. Legacy implementations: designed for single-node context, use coordination. Research question: what if we built systems that didn’t have to coordinate? Result: new designs that avoid coordination unless strictly necessary
  30. Simple Example: Read Committed. Goal: never read from uncommitted transactions. Legacy implementation: lock records during access. Research question: is coordination necessary?
  31. Simple Example: Read Committed. Goal: never read from uncommitted transactions. Legacy implementation: lock records during access. Research question: is coordination necessary? Result: no! For example, buffer writes until commit. Result: order-of-magnitude (OOM) speedups over classic implementations (VLDB 2014, SIGMOD 2015)
  32. What if we don’t have to talk? Coordination-free systems: 1.) Enable infinite scale-out 2.) Improve throughput 3.) Ensure low latency 4.) Guarantee “always on” response
  33. Coordination-free systems: 1.) Enable infinite scale-out 2.) Improve throughput 3.) Ensure low latency 4.) Guarantee “always on” response. What if we don’t have to talk?
  34. Coordination-free systems: 1.) Enable infinite scale-out 2.) Improve throughput 3.) Ensure low latency 4.) Guarantee “always on” response. What if we don’t have to talk? Accounting for worst case improves average case
  35. Punchline: Distributed Systems & Networks • Systems that behave well during network faults can behave better in non-faulty environments too • With good designs, popular guarantees from today’s RDBMSs can benefit! (see also Martin’s talk, 11AM Sat) • Research on coordination-avoiding systems highlights potential for huge speedups (see bailis.org) • Keywords: CRDTs, I-confluence, RAMP, HAT, Bloom^L
  36. This talk: When can designing for the worst case improve the average case? Structure: Distributed systems and the network; Beyond the network; Lessons
  37. Fail-over helps (Dev)Ops. If services can auto-fail-over… can kill processes: to perform upgrades; to manage stragglers; to revoke resources
  38. Tail Latency in (Micro)services. YOUR SERVICE HERE. 99.9th %ile latency: 100ms; avg latency: 1.2ms. [Other values shown: 10ms, 1.09ms]
  39. Tail Latency in (Micro)services. At 100x fan-out, front-end avg. latency: 64ms. 99.9th %ile latency: 100ms. [Other value shown: 10ms]
  40. Tail Latency in (Micro)services. At 100x fan-out, front-end avg. latency: 64ms. 99.9th %ile latency: 100ms. [Other values shown: 6.7ms, 10ms]
  41. There is also a strong business case for accessibility. Accessibility overlaps with other best practices such as mobile web design, device independence, multi-modal interaction, usability, design for older users, and search engine optimization (SEO). Case studies show that accessible websites have better search results, reduced maintenance costs, and increased audience reach, among other benefits.
  42. When “Best” Is Brittle. [Two plots of f(x) against x. Idealized function: Optimum. Less well-behaved function: Optimum; Missed the target]
  43. When “Best” Is Brittle. [Two plots of f(x) against x. Idealized function: Optimum. Less well-behaved function: Optimum; Missed the target; “Stable” solution]
  44. When “Best” Is Brittle. [Two plots of f(x) against x. Idealized function: Optimum. Less well-behaved function: Optimum; Missed the target; “Stable” solution] Robust Optimization studies finding the stable solution
  45. This talk: When can designing for the worst case improve the average case? Structure: Distributed systems and the network; Beyond the network; Lessons
  46. This talk: When can designing for the worst case improve the average case? When does this apply? When corner cases are common; when environmental conditions are variable; when “normal” isn’t what we think
  47. “Worst” raises tough questions. Cluster provisioning: what’s our scale-out strategy? Hardware: what happens during bit flips? Do we need ECC?
  48. “Worst” raises tough questions. Cluster provisioning: what’s our scale-out strategy? Hardware: what happens during bit flips? Do we need ECC? Security: how do we manage internal data accesses?
  49. Reasoning about worst-case scenarios can be a powerful design tool. Key to coordination-avoiding distributed systems designs. Can often improve performance and robustness, also combat bias. @PBAILIS // bailis.org
  50. Special thanks to David Andersen, Ali Ghodsi, Joe Hellerstein, Eddie Kohler, Phil Levis, Alex Miller, Oscar Moll, Barzan Mozafari, Ion Stoica, Eugene Wu, Jean Yang, Matei Zaharia
  51. Reasoning about worst-case scenarios can be a powerful design tool. Key to coordination-avoiding distributed systems designs. Can often improve performance and robustness, also combat bias. @PBAILIS // bailis.org
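
The Read Committed example in the transcript (slides 30 and 31) says coordination is unnecessary if writes are buffered until commit. A minimal sketch of that idea follows, assuming a single in-memory store; the `Store` and `Transaction` classes are hypothetical names for illustration, not the implementation evaluated in the cited papers, and durability and replication are ignored.

```python
class Store:
    """Holds committed values only. Hypothetical stand-in for a replica."""

    def __init__(self):
        self.committed = {}

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.buffer = {}  # writes stay private to this transaction

    def write(self, key, value):
        # No lock is taken and nothing becomes visible to other transactions.
        self.buffer[key] = value

    def read(self, key):
        # Return this transaction's own write if present, else the last
        # committed value. Uncommitted writes of others are never reachable.
        if key in self.buffer:
            return self.buffer[key]
        return self.store.committed.get(key)

    def commit(self):
        # Only at commit do the buffered writes become visible.
        self.store.committed.update(self.buffer)
        self.buffer = {}

    def abort(self):
        # Discarding the buffer is enough; nothing was ever exposed.
        self.buffer = {}


store = Store()
t1, t2 = store.begin(), store.begin()
t1.write("x", 1)
before = t2.read("x")  # t1 has not committed yet
t1.commit()
after = t2.read("x")   # t1's write is now committed
```

Readers never block on writers in this sketch, because the only shared state is data that has already committed. That is the contrast with the lock-based legacy implementation the slide describes.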