
Etsy: A learning culture in practice

Nishan Subedi

October 26, 2017

Transcript

  1. A learning culture in practice Nishan Subedi

  2. Agenda
     • Introduction
     • What is a learning culture
     • Modeling socio-technical systems
     • Human factors
     • Practices @ Etsy
     • Resources
  3. Introduction

  4. Who am I? Sr. Machine Learning Engineer on the Search Ranking Team:
     • Been at the company > 3.5 years
     • Been a PostMortem Facilitator for > 2 years
     • Teach PostMortem Facilitation Course
  5. Etsy is a global marketplace where people around the world connect, both online and offline, to make, sell and buy unique goods.
  6. By The Numbers
     • 1.8M active sellers (as of 2017)
     • 30.6M active buyers (as of 2017)
     • $2.84B annual GMS (in 2016)
     • 45M items for sale (as of 2017)
     Photo by Kirsty-Lyn Jameson
  7. We are always deploying

  8. LEARNING CULTURE

  9. “Failure is success if we learn from it.” - Malcolm Forbes
  10. LEARNING CULTURE

  11. Culture is a shared set of beliefs, behaviors, and routines.
  12. Culture is to a group what personality or character is to an individual. It’s constantly evolving.
  13. Why focus on culture?
     • A strong culture can overcome almost any set of poor technical decisions.
     • A weak culture can’t be saved by using the best technology.
     • Culture is reinforced by, and reinforces, your tooling and process.
  14. Failure for etsy.com (From: etsystatus.com)

  15. Why let a good outage go to waste?

  16. Event Investigation a.k.a. PostMortems (https://github.com/etsy/morgue)

  17. Event Investigation a.k.a. PostMortems (https://github.com/etsy/morgue)

  18. Event Investigation Survey 2017 (https://github.com/etsy/morgue)

  19. Get a full and honest picture of what happened, and the steps needed to help prevent it from happening again. - Anonymous Response
  20. Including a diverse group of people with different perspectives on the issue. - Anonymous Response
  21. Creating an environment where we can learn from incidents in a blameless manner. - Anonymous Response
  22. Conditions for maximizing learning from PostMortems
     • Blameless
     • Open meetings
     • Everyone is invited: default to @tech-all
     • Accountability
     • Remediation
     • Better understanding of our socio-technical systems
  23. All models are wrong but some are useful. - George E. P. Box
  24. MODELING OUR SYSTEMS AS COMPLEX SYSTEMS
     • ROBUST
     • UNPREDICTABLE
     • NO CLEAR CAUSALITY
     • DRIFT TO DEGRADATION
     • HUMANS AS A SOURCE OF ADAPTABILITY
  25. IMPLICATIONS OF COMPLEXITY

  26. HUMAN FACTORS

  27. Safety is the potential for the system to adapt and perform acceptably under widely varying conditions. Human variability provides this adaptive capacity.
  28. ETSY’S DEPLOY DASHBOARD

  29. WE SIMPLIFY, UNAWARE OF OUR BIASES

  30. BIASES: HINDSIGHT BIAS

  31. BIASES: OUTCOME BIAS

  32. BIASES: CONFIRMATION BIAS

  33. Goals

  34. Ongoing sensemaking in PostMortems: Sensemaking is not about truth and getting it right. Instead, it is about continued redrafting of an emerging narrative so that it becomes more comprehensive, incorporates more of the observed data, and is more resilient in the face of criticism.
  35. WHAT KIND OF ACCOUNTABILITY DO YOU WANT? WHO DO I BLAME?

  36. WHAT KIND OF ACCOUNTABILITY DO YOU WANT? WHO IS ACCOUNTABLE FOR IMPLEMENTING CHANGES TO MAKE THINGS BETTER?
  37. BLAME-FREE ≠ ACCOUNTABILITY-FREE

  38. Have you tried Pre-Mortems?
     • Architecture Review
     • Operability Review

  39. Resources
     • Morgue: http://github.com/etsy/morgue
     • PostMortem Facilitation Guide: https://extfiles.etsy.com/DebriefingFacilitationGuide.pdf
     • Etsy’s Engineering Blog: http://codeascraft.com
     • Talks @ Etsy: http://etsy.com/codeascraft/talks
     • We’re hiring! http://etsy.com/careers
  40. References
     • Blameless PostMortems and a Just Culture: https://codeascraft.com/2012/05/22/blameless-postmortems/
     • Cook, Richard I. "How complex systems fail." Cognitive Technologies Laboratory, University of Chicago, Chicago, IL (1998).
     • Weick, Karl E. Sensemaking in Organizations. Vol. 3. Sage, 1995.
     • Tversky, Amos, and Kahneman, Daniel. "Judgment under Uncertainty: Heuristics and Biases." Science 185(4157), September 1974, pp. 1124–31.
     • Shorrock, Steven. "Life After Human Error." Velocity 2014. https://www.youtube.com/watch?v=STU3Or6ZU60
     • Rasmussen, Jens. "Risk management in a dynamic society: a modelling problem." Safety Science 27.2 (1997): 183-213.
     • Reason, J., Hollnagel, E., and Paries, J. "Revisiting the Swiss Cheese Model of Accidents." Eurocontrol, October 2006.
     • Dekker, Sidney. The Field Guide to Understanding Human Error. Ashgate Publishing Company, Brookfield, VT, USA, 2006.
  41. Thank you! Find me @ Challenge Your Peers 7