Smoke & Mirrors: The Primitives of High Availability

As given at Mountain West Ruby Conference (MWRC) on March 9, 2015.

---

Many of the greatest achievements in the history of computers are based on lies, or rather, the strategic sets of lies we generally call “abstraction”. Operating systems lie to programs about hardware, multitasking systems lie to users about parallelism, Ruby lies to us about how easy it is to tell a CPU what to do… the list goes on and on.
One of the primary “strategic lies” of the internet is the presentation of each service as though it were a discrete, cohesive entity. When we use GitHub, we think of it as just “GitHub”, not a swarm of networked computers. This lie gives us the opportunity to build high availability applications: apps designed to never go down.
Let’s take a tour through the amazing stack of tools that helps us construct high availability applications. We’ll review some of the incredible technology underlying the internet: things like TCP, BGP, and DNS. Then we’ll talk about how these primitives combine into useful patterns at the application level. I hope you’ll leave with not only a renewed appreciation for the core innovations of the internet, but also some practical working knowledge of how to go about building and running a zero-downtime application.

Paul Hinze

March 13, 2015

Transcript

  1. Smoke and Mirrors The Primitives of High Availability

  2. paul hinze @phinze

  3. None
  4. abstraction “ ”

  5. abstrahere (v.) to draw away

  6. The internet Abstraction in Action

  7. None
  8. None
  9. 1 <!DOCTYPE html> 2 <html lang="en" class=""> 3 <head prefix="og:

    http://ogp.me/ns# fb: http://ogp.me/ns/fb# object: http://ogp.me/ns/object# article: http://ogp.me/ns/article# profile: http://ogp.me/ns/profile#"> 4 <meta charset='utf-8'> 5 <meta http-equiv="X-UA-Compatible" content="IE=edge"> 6 <meta http-equiv="Content-Language" content="en"> 7 8 9 <title>GitHub · Build software better, together.</title> 10 <link rel="search" type="application/opensearchdescription+xml" href="/opensearch.xml" title="GitHub"> 11 <link rel="fluid-icon" href="https://github.com/fluidicon.png" title="GitHub"> 12 <link rel="apple-touch-icon" sizes="57x57" href="/apple-touch-icon-114.png"> 13 <link rel="apple-touch-icon" sizes="114x114" href="/apple-touch-icon-114.png"> 14 <link rel="apple-touch-icon" sizes="72x72" href="/apple-touch-icon-144.png"> 15 <link rel="apple-touch-icon" sizes="144x144" href="/apple-touch-icon-144.png"> 16 <meta property="fb:app_id" content="1401488693436528"> 17 18 <meta property="og:url" content="https://github.com"> 19 <meta property="og:site_name" content="GitHub"> 20 <meta property="og:title" content="Build software better, together"> 21 <meta property="og:description" content="GitHub is the best place to build software together. 
Over 4 million people use GitHub to share code."> 22 <meta property="og:image" content="https://assets-cdn.github.com/images/modules/open_graph/github-logo.png"> 23 <meta property="og:image:type" content="image/png"> 24 <meta property="og:image:width" content="1200"> 25 <meta property="og:image:height" content="1200"> 26 <meta property="og:image" content="https://assets-cdn.github.com/images/modules/open_graph/github-mark.png"> 27 <meta property="og:image:type" content="image/png"> 28 <meta property="og:image:width" content="1200"> 29 <meta property="og:image:height" content="620"> 30 <meta property="og:image" content="https://assets-cdn.github.com/images/modules/open_graph/github-octocat.png"> 31 <meta property="og:image:type" content="image/png"> 32 <meta property="og:image:width" content="1200"> 33 <meta property="og:image:height" content="620"> 34 <meta property="twitter:site" content="github"> 35 <meta property="twitter:site:id" content="13334762"> 36 <meta property="twitter:creator" content="github"> 37 <meta property="twitter:creator:id" content="13334762"> 38 <meta property="twitter:card" content="summary_large_image"> 39 <meta property="twitter:title" content="GitHub"> 40 <meta property="twitter:description" content="GitHub is the best place to build software together. Over 4 million people use GitHub to share code."> 41 <meta property="twitter:image:src" content="https://assets-cdn.github.com/images/modules/open_graph/github-logo.png"> 42 <meta property="twitter:image:width" content="1200"> 43 <meta property="twitter:image:height" content="1200"> 44 <meta name="browser-stats-url" content="/_stats"> 45 <link rel="assets" href="https://assets-cdn.github.com/"> 46 <link rel="conduit-xhr" href="https://ghconduit.com:25035"> 47 48 <meta name="pjax-timeout" content="1000"> 49 50 51 <meta name="msapplication-TileImage" content="/windows-tile.png">
  10. 278 <li><a href="https://github.com/blog" data-ga-click="Footer, go to blog, text:blog">Blog</a></li> 279 <li><a

    href="https://github.com/about" data-ga-click="Footer, go to about, text:about">About</a></li> 280 281 </ul> 282 283 <a href="https://github.com" arial-label="Homepage"> 284 <span class="mega-octicon octicon-mark-github" title="GitHub"></span> 285 </a> 286 <ul class="site-footer-links"> 287 <li>&copy; 2015 <span title="0.00799s from github-fe135-cp1-prd.iad.github.net">GitHub</span>, Inc.</li> 288 <li><a href="https://github.com/site/terms" data-ga-click="Footer, go to terms, text:terms">Terms</a></li> 289 <li><a href="https://github.com/site/privacy" data-ga-click="Footer, go to privacy, text:privacy">Privacy</a></li> 290 <li><a href="https://github.com/security" data-ga-click="Footer, go to security, text:security">Security</a></li> 291 <li><a href="https://github.com/contact" data-ga-click="Footer, go to contact, text:contact">Contact</a></li> 292 </ul> 293 </div> 294 </div> 295 296 297 <div class="fullscreen-overlay js-fullscreen-overlay" id="fullscreen_overlay"> 298 <div class="fullscreen-container js-suggester-container"> 299 <div class="textarea-wrap"> 300 <textarea name="fullscreen-contents" id="fullscreen-contents" class="fullscreen-contents js-fullscreen-contents" placeholder=""></textarea> 301 <div class="suggester-container"> 302 <div class="suggester fullscreen-suggester js-suggester js-navigation-container"></div> 303 </div> 304 </div> 305 </div> 306 <div class="fullscreen-sidebar"> 307 <a href="#" class="exit-fullscreen js-exit-fullscreen tooltipped tooltipped-w" aria-label="Exit Zen Mode"> 308 <span class="mega-octicon octicon-screen-normal"></span> 309 </a> 310 <a href="#" class="theme-switcher js-theme-switcher tooltipped tooltipped-w" 311 aria-label="Switch themes"> 312 <span class="octicon octicon-color-mode"></span> 313 </a> 314 </div> 315 </div> 316 317 318 319 320 321 <div id="ajax-error-message" class="flash flash-error"> 322 <span class="octicon octicon-alert"></span> 323 <a href="#" class="octicon octicon-x flash-close 
js-ajax-error-dismiss" aria-label="Dismiss error"></a> 324 Something went wrong with that request. Please try again. 325 </div> 326 327 328 <script crossorigin="anonymous" src="https://assets-cdn.github.com/assets/frameworks-fd3bd2d0c854fa5baa64e8b390de48b1eff4b59e1f38d1b1d695c4b5d835ab04.js"></script> 329 <script async="async" crossorigin="anonymous" src="https://assets-cdn.github.com/assets/github-46628ff6533b28dfda2aeef282f8a3502316e88499a52a67ae0dd60479e3b950.js"></script> 330 331 332 333 </body> 334 </html> 335
  11. None
  12. github.com … ? 192.30.252.130 DNS Domain Name System

  13. GET / HTTP/1.1 HTTP/1.1 200 OK ( … ) HTTP

    Hypertext Transfer Protocol 192.30.252.130
  14. GET / HTTP/1.1 HTTP/1.1 200 OK TCP Transmission Control Protocol

  15. TCP Transmission Control Protocol

  16. IP Internet Protocol

  17. mtr github.com

  18. 192.168.101.1 73.8.160.1 69.139.232.241 69.139.185.81 68.86.197.125 68.86.187.149 68.87.232.89 68.87.210.73 IP Internet

    Protocol 192.30.252.207 192.30.252.130 68.86.197.113 68.86.92.33 68.86.84.210 68.86.86.225 68.86.85.25 68.86.85.1 68.86.82.98 50.242.151.74 ?
  19. 192.30.252.130 1.1.0.0/16 C 1.2.0.0/16 B 1.3.0.0/16 A 1.4.0.0/16 C 1.5.0.0/16

    D 1.6.0.0/16 A 1.7.0.0/16 D 1.8.0.0/16 B 1.9.0.0/16 C 1.10.0.0/16 C 1.11.0.0/16 B 1.12.0.0/16 B 1.13.0.0/16 D IP Internet Protocol A B C D PLEASE SEND TO:
  20. BGP Border Gateway Protocol A B C D ANNOUNCE WITHDRAW

    UPDATE
  21. BGP Border Gateway Protocol

  22. IP Internet Protocol

  23. IP Internet Protocol

  24. TCP Transmission Control Protocol

  25. GET / HTTP/1.1 HTTP/1.1 200 OK HTTP Hypertext Transfer Protocol

  26. None
  27. None
  28. The internet Abstraction in Action

  29. high AVAILABILITY

  30. Not Highly Available

  31. High Availability ≈ Fault Tolerance

  32. None
  33. Failure Happens

  34. Anticipate Prepare React

  35. What could fail? What should we do when it fails?

    What did we learn?
  36. Primitives

  37. Redundancy

  38. Treat many as one

  39. Hardware Components Fail

  40. LACP Link Aggregation Control Protocol RAID Redundant Array of Independent

    Disks
  41. Servers Fail

  42. Server Redundancy

  43. Transparent Proxy 192.30.252.128 10.0.0.11 10.0.0.12 10.0.0.13

  44. Load Balancing 100% 33% 33% 33%

  45. Load Balancing 100% 50% 50% 0%

  46. Heartbeat You OK? I’m OK! You OK? I’m OK! You

    OK? … timeout!
  47. Single Point of Failure SPOF

  48. github.com … ? 192.30.252.130 DNS Domain Name System 192.30.252.129 192.30.252.128

  49. github.com … ? 192.30.252.130 DNS Domain Name System 192.30.252.1 192.30.252.128

  50. 192.30.252.130 DNS Domain Name System 192.30.252.129 192.30.252.1 192.30.252.128 15m 192.30.252.128

    5m
  51. 192.30.252.130 DNS Domain Name System 192.30.252.129 192.30.252.1 192.30.252.128 15m 192.30.252.128

    5m
  52. Clustering

  53. Clustering 192.30.252.130

  54. Clustering 192.30.252.130 B A C

  55. Failover 192.30.252.130 B A C

  56. Timeouts and Retries

  57. None
  58. Managing State

  59. CAP Theorem

  60. Replication Synchronous Asynchronous

  61. Replication Synchronous Asynchronous +C -P +P -C

  62. Replication + Clustering + Automatic Failover + Load Balancing

  63. Monitoring

  64. Know Your Limits

  65. Disk Space Memory CPU I/O Network (Entropy)

  66. Anticipate Prepare React

  67. What happens when it fails?

  68. What’s the HA story for $TECH?

  69. SomeDay

  70. Abstraction Redundancy Load Balancing Heartbeats Clustering Automatic Failover Timeouts and

    Retries Replication Monitoring
  71. Perseverance Abstraction Redundancy Load Balancing Heartbeats Clustering Automatic Failover Timeouts

    and Retries Replication Monitoring
  72. thank you Maps from Free Vector Maps Other Graphics from

    The Noun Project by hunotika by Joe Harrison by Ben Rizzo by Yazmin Alanis by Ham Stanford by gira Park by NAMIRUS by Jamie Carrion by MikaDo Nguyen BGP Data from the BGP Instability Report
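
The load balancing and heartbeat slides (43 to 46) show traffic going from an even 33% / 33% / 33% split to 50% / 50% / 0% once one backend stops answering "You OK?". The sketch below is one possible way to model that behavior in Python; it is not taken from the talk. The `Pool` class, its method names, the `distribution` helper, and the 3-second timeout are all hypothetical, and time is passed in explicitly as `now` so the example stays deterministic. Only the backend addresses come from the slides.

```python
import itertools
from collections import Counter


class Pool:
    """Round-robin pool that skips backends whose heartbeat is stale.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, backends, heartbeat_timeout=3.0):
        self.backends = list(backends)
        self.heartbeat_timeout = heartbeat_timeout
        # A backend counts as unhealthy until its first heartbeat arrives.
        self.last_seen = {b: float("-inf") for b in self.backends}
        self._cycle = itertools.cycle(self.backends)

    def heartbeat(self, backend, now):
        """Record an "I'm OK!" reply from a backend at time `now`."""
        self.last_seen[backend] = now

    def healthy(self, now):
        """Backends whose last heartbeat is within the timeout window."""
        return [b for b in self.backends
                if now - self.last_seen[b] <= self.heartbeat_timeout]

    def pick(self, now):
        """Return the next healthy backend in round-robin order."""
        alive = set(self.healthy(now))
        if not alive:
            raise RuntimeError("no healthy backends")
        for backend in self._cycle:
            if backend in alive:
                return backend


def distribution(pool, now, requests=300):
    """Count how many of `requests` picks land on each backend."""
    counts = Counter(pool.pick(now) for _ in range(requests))
    return {b: counts.get(b, 0) for b in pool.backends}


if __name__ == "__main__":
    pool = Pool(["10.0.0.11", "10.0.0.12", "10.0.0.13"])

    # t=0: every backend answers, so traffic splits evenly.
    for b in pool.backends:
        pool.heartbeat(b, now=0.0)
    print(distribution(pool, now=1.0))

    # t=4: 10.0.0.13 stops answering; by t=5 its heartbeat has timed out.
    pool.heartbeat("10.0.0.11", now=4.0)
    pool.heartbeat("10.0.0.12", now=4.0)
    print(distribution(pool, now=5.0))
```

Note that the pool itself is still a single point of failure, which is the problem the SPOF, DNS, and clustering slides go on to address.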