Building Resilient Frontend Systems (All Day Hey)

Ianfeather

April 30, 2018

Transcript

  1. BUILDING RESILIENT FRONTEND SYSTEMS Ian Feather - BuzzFeed - @ianfeather

  2. None
  3. RESILIENCE IS FUNCTION IN A HOSTILE ENVIRONMENT

  4. UNDERSTAND YOUR TIERS OF USER EXPERIENCE

  5. GUARANTEE THE MOST BASIC LEVEL OF UX

  6. 1. HOW OUR SYSTEMS FAIL 2. DESIGNING FOR FAILURE 3. MITIGATING RISK 4. LEARNING FROM FAILURE
  7. HOW OUR SYSTEMS FAIL SECTION 1

  8. HOW OUR SYSTEMS FAIL 1. MALICIOUS INTERFERENCE

  9. HTTPS IS TABLE STAKES

  10. HTTPS IS TABLE STAKES

  11. HOW OUR SYSTEMS FAIL 1. MALICIOUS INTERFERENCE

  12. HOW OUR SYSTEMS FAIL 1. MALICIOUS INTERFERENCE 2. 3RD PARTY AVAILABILITY
  13. CONTROL YOUR POINTS OF FAILURE

  14. 2016

  15. 2016 DYN DNS 5 HRS

  16. 2016 DYN DNS 5 HRS AWS S3 9 HRS 2017

  17. 2016 DYN DNS 5 HRS AWS S3 9 HRS 2017 FASTLY CDN 1 HR

  18. 2016 DYN DNS 5 HRS AWS S3 9 HRS 2017 FASTLY CDN 1 HR AWS S3 2 HRS

  19. HOW OUR SYSTEMS FAIL 1. MALICIOUS INTERFERENCE 2. 3RD PARTY AVAILABILITY

  20. HOW OUR SYSTEMS FAIL 1. MALICIOUS INTERFERENCE 2. 3RD PARTY AVAILABILITY 3. DEVELOPER ERROR
  21. ADD SLIDE ABOUT SENTRY

  22. SLACK ALERTS

  23. KNOWING IT’S BROKEN BEFORE TWITTER DOES

  24. THEORY VS PRACTICE

  25. HOW OUR SYSTEMS FAIL 1. MALICIOUS INTERFERENCE 2. 3RD PARTY AVAILABILITY 3. DEVELOPER ERROR

  26. HOW OUR SYSTEMS FAIL 1. MALICIOUS INTERFERENCE 2. 3RD PARTY AVAILABILITY 3. DEVELOPER ERROR 4. THE NETWORK
  27. THEORY VS PRACTICE

  28. THEORY VS PRACTICE

  29. ~1% OF REQUESTS FOR JAVASCRIPT WILL TIME OUT

  30. 13 MILLION REQUESTS FOR JAVASCRIPT WILL TIME OUT

  31. HOW OUR SYSTEMS FAIL 1. MALICIOUS INTERFERENCE 2. 3RD PARTY AVAILABILITY 3. DEVELOPER ERROR 4. THE NETWORK

  32. HOW OUR SYSTEMS FAIL 1. MALICIOUS INTERFERENCE 2. 3RD PARTY AVAILABILITY 3. DEVELOPER ERROR 4. THE NETWORK 5. USER’S PRIVILEGE
  33. ~9% OF OUR USERS USE SOME FORM OF CONTENT BLOCKER

  34. ~4% WON’T SUCCESSFULLY DOWNLOAD OUR FONTS

  35. 40 MILLION PAGEVIEWS PER MONTH

  36. HOW OUR SYSTEMS FAIL 1. MALICIOUS INTERFERENCE 2. 3RD PARTY AVAILABILITY 3. DEVELOPER ERROR 4. THE NETWORK 5. USER’S PRIVILEGE
  37. HOPE FOR THE BEST?
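
Slides 29 and 30 pair a rate with a count: roughly 1% of JavaScript requests timing out is put at 13 million requests. As a rough sketch of that arithmetic, and assuming both slides describe the same traffic over the same (unstated) period, the implied total is about 1.3 billion requests. The function names below are illustrative and do not come from the talk.

```typescript
// Illustrative helpers; the names are hypothetical, not from the talk.

// Number of requests (or users) hit by a failure mode at a given rate.
function affectedCount(total: number, failureRate: number): number {
  return Math.round(total * failureRate);
}

// Total volume implied by an affected count and a failure rate.
function impliedTotal(affected: number, failureRate: number): number {
  return Math.round(affected / failureRate);
}

// Slides 29-30: ~1% of JavaScript requests is stated as 13 million.
const jsRequests = impliedTotal(13000000, 0.01);
console.log(jsRequests); // prints 1300000000
console.log(affectedCount(jsRequests, 0.01)); // prints 13000000
```

The same helper applies to the other rates in this section (content blockers, fonts), provided the total it is multiplied against is known.
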

  38. DESIGN FOR FAILURE SECTION 2

  39. DESIGN FOR FAILURE 1. PRIORITIZE CRITICAL PARTS OF THE PAGE

  40. User FONTS html IMAGES DATA (xhr) IMAGES CSS JS IMAGES

  41. User FONTS html IMAGES DATA (xhr) IMAGES CSS JS IMAGES Images

  42. User FONTS html IMAGES DATA (xhr) IMAGES CSS JS IMAGES HTML
  43. None
  44. None
  45. None
  46. DESIGN FOR FAILURE 1. PRIORITIZE CRITICAL PARTS OF THE PAGE

  47. DESIGN FOR FAILURE 1. PRIORITIZE CRITICAL PARTS OF THE PAGE 2. MAKE ERRORS A FIRST CLASS CITIZEN
  48. SOMETHING BROKE! SHOULD I TELL THEM?

  49. None
  50. IT BROKE. SHOULD I TELL THEM?

  51. None
  52. DESIGN FOR FAILURE 1. PRIORITIZE CRITICAL PARTS OF THE PAGE 2. MAKE ERRORS A FIRST CLASS CITIZEN
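
Slides 48 to 51 ask whether a failure should be surfaced to the user; the transcript does not record the answer. One possible policy, sketched here as an assumption rather than the speaker's method, is to treat a failed load as an ordinary value and let the tier of the content decide: critical content reports the failure, enhancements disappear quietly. All type and function names are hypothetical.

```typescript
// Hypothetical types and names for illustration; none come from the talk.
type Tier = "critical" | "enhancement";

type LoadResult<T> =
  | { status: "ok"; value: T }
  | { status: "failed"; reason: string };

// Treat failure as an expected input to rendering rather than an exception.
function renderSection<T>(
  tier: Tier,
  result: LoadResult<T>,
  render: (value: T) => string
): string {
  if (result.status === "ok") {
    return render(result.value);
  }
  if (tier === "critical") {
    // The page cannot do its job without this content, so say so.
    return '<p role="alert">Sorry, this content could not be loaded.</p>';
  }
  // Enhancements fail quietly; the rest of the page still works.
  return "";
}

const article = renderSection(
  "critical",
  { status: "ok", value: "Hello" },
  (v: string) => "<h1>" + v + "</h1>"
);
const related = renderSection(
  "enhancement",
  { status: "failed", reason: "timeout" },
  (v: string) => v
);
console.log(article); // prints "<h1>Hello</h1>"
console.log(related === ""); // prints true
```

Because the failed case is part of the return type, every caller has to decide what a failure looks like on screen, which is one way to read "first class citizen".
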
  53. MITIGATE RISK SECTION 3

  54. MITIGATE RISK 1. LOCK YOUR RUNTIME DEPENDENCIES

  55. CONTROL YOUR POINTS OF FAILURE

  56. None
  57. MITIGATE RISK 1. LOCK YOUR RUNTIME DEPENDENCIES

  58. MITIGATE RISK 1. LOCK YOUR RUNTIME DEPENDENCIES 2. BUILD IN REDUNDANCY
  59. HAVE TWO OF EVERYTHING

  60. Asset SERVER 1

  61. Asset SERVER 1 www.asset-server-one.com/styles.css

  62. Asset SERVER 1 www.asset-server-one.com/styles.css

  63. ✖ Asset SERVER 1 www.asset-server-one.com/styles.css

  64. ✖ Asset SERVER 1 Asset SERVER 2 www.asset-server-one.com/styles.css

  65. ✖ Asset SERVER 1 Asset SERVER 2 www.asset-server-two.com/styles.css www.asset-server-one.com/styles.css

  66. ✖ Asset SERVER 1 Asset SERVER 2 www.asset-server-two.com/styles.css www.asset-server-one.com/styles.css

  67. Asset SERVER 1 Asset SERVER 2 Proxy service

  68. Asset SERVER 1 Asset SERVER 2 www.asset-server.com/styles.css Proxy service

  69. Asset SERVER 1 Asset SERVER 2 www.asset-server.com/styles.css Proxy service

  70. Asset SERVER 1 Asset SERVER 2 www.asset-server.com/styles.css Proxy service

  71. PLAN Z

  72. MITIGATE RISK 1. LOCK YOUR RUNTIME DEPENDENCIES 2. BUILD IN REDUNDANCY
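
Slides 67 to 70 put a proxy service in front of two asset servers, so pages reference a single hostname while the proxy chooses where to send the request. The routing decision might look roughly like the sketch below. The `AssetServer` shape, the `healthy` flag, and the `https://` scheme are assumptions; how health is actually determined is not covered in the transcript.

```typescript
// Hypothetical shape for an upstream asset server; not from the talk.
interface AssetServer {
  origin: string;
  healthy: boolean;
}

// Pick the first healthy upstream for a path requested from the proxy.
// Returns null when every upstream is down, leaving the caller to decide
// on a last resort.
function resolveUpstream(path: string, servers: AssetServer[]): string | null {
  for (const server of servers) {
    if (server.healthy) {
      return server.origin + path;
    }
  }
  return null;
}

const servers: AssetServer[] = [
  { origin: "https://www.asset-server-one.com", healthy: false },
  { origin: "https://www.asset-server-two.com", healthy: true },
];

console.log(resolveUpstream("/styles.css", servers));
// prints "https://www.asset-server-two.com/styles.css"
```

Keeping the choice inside the proxy means the page never needs to know which server answered, in contrast to slides 63 to 66, where the URL itself names the failed server.
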
  73. MITIGATE RISK 1. LOCK YOUR RUNTIME DEPENDENCIES 2. BUILD IN REDUNDANCY 3. SERVE STALE CONTENT
  74. SERVER

  75. SERVER CDN

  76. SERVER CDN

  77. SERVER CDN

  78. SERVER CDN

  79. SERVER CDN

  80. SERVER CDN

  81. CDN SERVER

  82. CDN ✖ SERVER

  83. CDN ✖ SERVICE WORKER SERVER

  84. CDN ✖ SERVICE WORKER SERVER

  85. MITIGATE RISK 1. LOCK YOUR RUNTIME DEPENDENCIES 2. BUILD IN REDUNDANCY 3. SERVE STALE CONTENT
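
Slides 74 to 84 show a CDN in front of the server, and then a service worker in front of the CDN, each layer able to answer when the one behind it fails. The sketch below models only the decision such a layer makes: keep the last good response and serve it, marked stale, when the origin fails. It is not a real CDN or service worker API, and the class and type names are hypothetical. In practice this behaviour is usually configured, for example with the `stale-if-error` Cache-Control extension, rather than hand-written.

```typescript
// Hypothetical names for illustration; this models the decision only,
// not a real CDN or service worker API.
type OriginOutcome =
  | { status: "ok"; body: string }
  | { status: "failed" };

interface Served {
  body: string;
  stale: boolean;
}

class StaleOnErrorCache {
  private store: { [url: string]: string } = {};

  // Serve fresh content when the origin answers, and remember it.
  // Serve the last good copy, marked stale, when the origin fails.
  // Return null only when there is nothing to fall back on.
  serve(url: string, outcome: OriginOutcome): Served | null {
    if (outcome.status === "ok") {
      this.store[url] = outcome.body;
      return { body: outcome.body, stale: false };
    }
    if (Object.prototype.hasOwnProperty.call(this.store, url)) {
      return { body: this.store[url], stale: true };
    }
    return null;
  }
}

const layer = new StaleOnErrorCache();
layer.serve("/news", { status: "ok", body: "<h1>Headlines</h1>" });

// The origin is now down, but the last good copy is still served.
const duringOutage = layer.serve("/news", { status: "failed" });
console.log(duringOutage);
```

A URL that was never fetched successfully has no stale copy, so `serve` returns `null` for it during an outage; that is the case the layer in front has to cover.
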
  86. LEARN FROM MISTAKES SECTION 4

  87. LEARN FROM MISTAKES 1. POSTMORTEMS

  88. BLAMELESS

  89. HOW DID WE HANDLE IT AS A TEAM?

  90. HOW COULD IT HAVE BEEN PREVENTED?

  91. LEARN FROM MISTAKES 1. POSTMORTEMS

  92. LEARN FROM MISTAKES 1. POSTMORTEMS 2. FIRE DRILLS & CHAOS TESTING
  93. FIRE DRILLS ARE A SAFE SPACE TO PRACTICE

  94. 1. LIMIT IMPACT 2. BE DECISIVE 3. DELEGATE EARLY

  95. CHAOS TESTING

  96. DELIBERATELY INTRODUCE FAILURE TO ENSURE YOUR SYSTEMS ARE RESILIENT

  97. LEARN FROM MISTAKES 1. POSTMORTEMS 2. FIRE DRILLS & CHAOS TESTING
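
Slide 96 describes chaos testing as deliberately introducing failure to check that a system is resilient. A minimal sketch of that idea at the function level follows: wrap a dependency so that a chosen fraction of calls fail, then confirm the calling code degrades instead of crashing. `withInjectedFailure` and `loadGreeting` are hypothetical names, not a real chaos-testing library or anything shown in the talk.

```typescript
// Hypothetical helper for illustration; not a real chaos-testing library.
// Wraps a function so that a fraction of calls fail on purpose.
// `random` is injectable so tests can make the failures deterministic.
function withInjectedFailure<T>(
  fn: () => T,
  failureRate: number,
  random: () => number = Math.random
): () => T {
  return () => {
    if (random() < failureRate) {
      throw new Error("chaos: injected failure");
    }
    return fn();
  };
}

// The code under test: it should degrade to a basic greeting
// when its dependency fails, not crash.
function loadGreeting(fetchName: () => string): string {
  try {
    return "Hello, " + fetchName();
  } catch (_err) {
    return "Hello";
  }
}

// A failure rate of 1 makes every call fail.
const alwaysFails = withInjectedFailure(() => "Ian", 1);
console.log(loadGreeting(alwaysFails)); // prints "Hello"
```

Passing a fixed `random` function makes a run repeatable, which helps when a drill needs to reproduce one specific failure.
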
  98. IN SUMMARY

  99. KNOW WHAT’S IMPORTANT TO YOUR USERS

  100. IDENTIFY HOW YOUR SYSTEM WILL DEGRADE

  101. IDENTIFY POINTS OF FAILURE AND BUILD IN FAIL-SAFES

  102. LEARN FROM EVERY FAILURE

  103. THANK YOU IAN FEATHER - BUZZFEED - @IANFEATHER