Slide 1

Slide 1 text

1/11/16
Embracing failure on the front-end
Clay Smith (@smithclay)
ForwardJS 2014, July 25, 2014

Slide 2

Slide 2 text

What this talk covers
• The inevitability that JavaScript apps will break.
• Borrowing good ideas about failure from operations teams.
• A bit about the theory of complex systems failure.
• Open-source tools and services that help make apps more resilient.
• Why talking about failure in the front-end is important.
Not covered: my recipe for Texas-style beef chili. Find me after to talk about it.

Slide 3

Slide 3 text

Google Trends all the things
One trend is twice as popular as the other trend on average.

Slide 4

Slide 4 text

Some theory, part 1
“Complex systems are intrinsically hazardous systems.”
Richard I. Cook, MD. How Complex Systems Fail.
Dr. Cook is my hero.

Slide 5

Slide 5 text

“Exception” tracking with window.onerror
May you never have to see this dialog again.
Danger: this gets pretty ugly.
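A minimal sketch of what tracking with window.onerror can look like. The helper name installErrorTracker, the send callback, and the /js-errors endpoint in the usage comment are assumptions for illustration, not something shown in the talk.

```javascript
// Hypothetical helper: installs an error handler on `target` (normally `window`)
// and passes a small plain-object report to `send` (an assumed callback).
function installErrorTracker(target, send) {
  var previous = target.onerror; // keep any handler that was already installed
  target.onerror = function (message, source, lineno, colno, error) {
    send({
      message: String(message),
      source: source || '(unknown)',
      line: lineno || 0,
      column: colno || 0,
      // Older browsers do not pass the Error object, so the stack may be missing.
      stack: error && error.stack ? String(error.stack) : null
    });
    if (typeof previous === 'function') {
      return previous.apply(this, arguments);
    }
    return false; // false lets the browser's default error handling continue
  };
}

// Usage in a browser (the endpoint is a placeholder, not a real service):
// installErrorTracker(window, function (report) {
//   new Image().src = '/js-errors?d=' + encodeURIComponent(JSON.stringify(report));
// });
```

Passing the target in, rather than touching window directly, keeps the sketch easy to exercise outside a browser.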

Slide 6

Slide 6 text

So you want to use a 3rd party service…
There are lots: HTTPS://PLUS.GOOGLE.COM/+PAULIRISH/POSTS/12BVL5EXFJN
Seriously, Paul Irish appears in all my talks.

Slide 7

Slide 7 text

Example window.onerror output
NS_TOO_MUCH_NOISE. Not really sure why I redacted the URLs.
Further reading: HTTP://BLOG.MELDIUM.COM/HOME/2013/9/30/SO-YOURE-THINKING-OF-TRACKING-YOUR-JS-ERRORS

Slide 8

Slide 8 text

Some theory, part 2
“Change introduces new forms of failure.”
Richard I. Cook, MD. How Complex Systems Fail.
Does this sound like common sense yet?

Slide 9

Slide 9 text

Monitor change with phantomas
Phantomas is a “PhantomJS-based web performance metrics collector and monitoring tool”.
HTTPS://GITHUB.COM/MACBRE/PHANTOMAS
phantomas --cookie '_session=' --reporter=statsd --statsd-host 127.0.0.1 --statsd-prefix stg --runs 5 http://staging-web.com
Picture: Jean Marais as Fantômas in the 1964 film. Creepy picture, no? I bet he writes Erlang. I also don't know how to say phantomas.

Slide 10

Slide 10 text

How to get super-detailed site metrics… if you’re lazy and cheap.
5 habits of highly lazy front-end performance engineers:
1. Cloud server/your laptop with phantomas installed
2. Cron job that runs phantomas with statsd output
3. DataDog Lite Account + install DataDog Agent on server
4. Configure alerting (I recommend PagerDuty)
5. Get woken up at 3am
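For a sense of what "statsd output" means on the wire: a gauge in the StatsD plain-text protocol is a line of the form name:value|g. The sketch below only formats such a line. The function name formatGauge and the dot used to join prefix and metric name are assumptions for illustration, and the exact names phantomas emits may differ.

```javascript
// Formats one gauge in the StatsD plain-text line protocol: "<name>:<value>|g".
// Joining prefix and metric name with a dot is an assumption made for this sketch.
function formatGauge(prefix, name, value) {
  if (typeof value !== 'number' || !isFinite(value)) {
    throw new TypeError('gauge value must be a finite number');
  }
  var fullName = prefix ? prefix + '.' + name : name;
  return fullName + ':' + value + '|g';
}
```

A StatsD agent typically receives one such line per metric over UDP, which is what lets a cron job push numbers to a dashboard without any extra service code.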

Slide 11

Slide 11 text

Make the metrics understandable and actionable
Testing dashboard for staging environment in DataDog.
This looks impressive while you read Hacker News on your other monitor.
Even fancier: integrate it into your web app: HTTPS://GITHUB.COM/BLOG/1252-HOW-WE-KEEP-GITHUB-FAST

Slide 12

Slide 12 text

Get alerted as things happen
Creating a new metric alert in DataDog:
• Choose a phantomas metric
• Define conditions
You'll be angry at me when this wakes you up at 3am.
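"Define conditions" usually boils down to a threshold over a recent window of samples. The sketch below shows that idea in plain JavaScript. It is not how DataDog evaluates alerts, and the name shouldAlert and its parameters are hypothetical.

```javascript
// Hypothetical alert check: fires when the average of the most recent
// `windowSize` samples is above `threshold`.
function shouldAlert(samples, windowSize, threshold) {
  if (windowSize < 1 || samples.length < windowSize) {
    return false; // not enough data yet, so stay quiet
  }
  var recent = samples.slice(-windowSize);
  var sum = recent.reduce(function (a, b) { return a + b; }, 0);
  return sum / windowSize > threshold;
}
```

Averaging over a window instead of alerting on a single sample is one way to avoid a 3am page for a one-off slow run.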

Slide 13

Slide 13 text

Some theory, part 3
“Failure free operations require experience with failure.”
Richard I. Cook, MD. How Complex Systems Fail.
Say this the next time you blow something up.
See also: https://blog.pagerduty.com/2013/11/failure-friday-at-pagerduty/

Slide 14

Slide 14 text

Inject chaos into your front-end
HTTPS://GITHUB.COM/TRAVIS-HILTERBRAND/CHAOS-MONKEY-BROWSER
HTTPS://GITHUB.COM/MIKL/NODE-CHAOS-MONKEYWARE
Original graphic slightly redacted.

Slide 15

Slide 15 text

Configuring chaos-monkey-browser (*jQuery required)
var props = {
  probability: 0.5,
  allowedMethods: ['GET'],
  mischiefTypes: [
    ChaosMonkey.MischiefTypes.delay,
    ChaosMonkey.MischiefTypes.http403
  ]
};
ChaosMonkey(props);
With a 50% probability, this configuration will cause jQuery ajax GET requests to slowly fail with a 403 response.
Prepares for:
• CDN failure
• API failure
• Connection failure
• Bad SSL certificates
• And more!
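To make the idea concrete without jQuery, here is a rough sketch of probability-based failure injection. It is not the chaos-monkey-browser implementation: withChaos, the request signature, and the injectable random option are all assumptions for illustration, and the delay mischief is left out to keep it short.

```javascript
// Hypothetical sketch, not the chaos-monkey-browser implementation.
// Wraps a request function so that a fraction of allowed calls fail with a 403.
function withChaos(request, options) {
  var probability = options.probability;
  var allowedMethods = options.allowedMethods || ['GET'];
  var random = options.random || Math.random; // injectable so tests are deterministic

  return function (method, url, callback) {
    var eligible = allowedMethods.indexOf(method) !== -1;
    if (eligible && random() < probability) {
      // Simulate a failed response instead of calling the real request.
      callback({ status: 403, url: url }, null);
      return;
    }
    request(method, url, callback);
  };
}
```

Wrapping the request function, rather than patching a global, keeps the mischief easy to switch off outside staging.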

Slide 16

Slide 16 text

Other possible strategies
How to annoy people during code review:
1. Disable/slow down the network connection (in Chrome Canary DevTools).
2. What happens when you disable JS? (Using a plugin is recommended.)
Amazon.com isn’t happy without JavaScript.

Slide 17

Slide 17 text

Lessons learned in failure
Seriously, remember one of these things:
• Measure errors and key performance metrics over time
• Bad performance = failure
• Annoy yourself to fix the broken things with alerting
• Find remediation steps to make sure it doesn’t happen again
• Get experience with failure before 7pm on a Friday

Slide 18

Slide 18 text

Thanks!
@smithclay
Additional resources (more reading):
• https://info.aiaa.org/tac/SMG/SOSTC/Shared%20Documents/How%20Complex%20Systems%20Fail.pdf
• http://blog.meldium.com/home/2013/9/30/so-youre-thinking-of-tracking-your-js-errors
• https://blog.pagerduty.com/2013/11/failure-friday-at-pagerduty/