UX Design and Education for Effective Monitoring Tools (Monitorama PDX 2017)

Abstract:
I, like many of us, chose to work in infrastructure because I thought it meant I could avoid talking to other people as much as possible. I was wrong. It turns out that a huge amount of work in monitoring is about empowering the rest of your engineering organization to use the tools you develop correctly, quickly, and effectively. A huge amount of our time is spent explaining how to interpret timeseries data and why averaging percentiles is a bad idea. After feedback from our engineers, we embarked on a journey to redesign our internal monitoring tools and understand where people were struggling with the existing system.

In this talk, I will explain how we approached the problem of making concepts like interpolation, aggregation, and alerting more intuitive and how we identified pain points for new users. I will outline common misconceptions our users have about monitoring and how we cleared up this confusion in our UI without forcing everyone to spend hours on documentation. Rather than copying and pasting existing UX design principles onto our monitoring problems, we will see how we can reinterpret these ideas and apply them to our unique situation to create a better experience for everyone.
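
To make the point about averaging percentiles concrete, here is a minimal Python sketch. The two hosts, their latency values, and the nearest-rank percentile helper are illustrative assumptions, not data or code from the talk.

```python
def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    ordered = sorted(values)
    # ceil(p * n / 100) using integer arithmetic to avoid float rounding surprises
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds for two hosts.
quiet_host = [10] * 98 + [1000] * 2   # 100 requests, two slow outliers
busy_host = [50] * 900                # 900 requests, all steady

p99_quiet = percentile(quiet_host, 99)
p99_busy = percentile(busy_host, 99)

# What a dashboard shows if it averages the per-host p99 series.
average_of_p99s = (p99_quiet + p99_busy) / 2

# What users actually experienced: the p99 over every request.
true_p99 = percentile(quiet_host + busy_host, 99)

print(average_of_p99s, true_p99)
```

With these made-up numbers the average of the two per-host p99 values is 525 ms, while the p99 over all requests is 50 ms, because the busy host contributes nine times as many samples as the quiet one.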

Amy Nguyen

May 24, 2017

Transcript

  1. Amy Nguyen @amyngyn Monitorama PDX 2017 UX Design and Education

    for Effective Monitoring Tools
  2. Amy Nguyen @amyngyn Monitorama PDX 2017 Agenda - Background -

    Motivation - Sharing what you know - Designing what your users want - Recap
  4. Amy Nguyen @amyngyn Monitorama PDX 2017 Hi! - Joined Pinterest

    in 2015
  5. Amy Nguyen @amyngyn Monitorama PDX 2017 Hi! - Joined Pinterest

    in 2015 - Recent projects:
  6. Amy Nguyen @amyngyn Monitorama PDX 2017 Hi! - Joined Pinterest

    in 2015 - Recent projects: ◦ tracing for performance
  7. Amy Nguyen @amyngyn Monitorama PDX 2017 Hi! - Joined Pinterest

    in 2015 - Recent projects: ◦ tracing for performance ◦ D3 data visualizations
  8. Amy Nguyen @amyngyn Monitorama PDX 2017 Hi! - Joined Pinterest

    in 2015 - Recent projects: ◦ tracing for performance ◦ D3 data visualizations ◦ cache for OpenTSDB data
  9. Amy Nguyen @amyngyn Monitorama PDX 2017 Hi! - Joined Pinterest

    in 2015 - Recent projects: ◦ tracing for performance ◦ D3 data visualizations ◦ cache for OpenTSDB data ◦ so much documentation omg
  10. Amy Nguyen @amyngyn Monitorama PDX 2017 Hi! - Joined Pinterest

    in 2015 - Recent projects: ◦ tracing for performance ◦ D3 data visualizations ◦ cache for OpenTSDB data ◦ so much documentation omg - amynguyen.net @amyngyn pinterest.com/amyngyn
  11. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest the world's

    first visual discovery engine
  12. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest the world's

    first visual discovery engine - 175 million monthly active users
  13. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest the world's

    first visual discovery engine - 175 million monthly active users - 100 billion pins
  14. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest the world's

    first visual discovery engine - 175 million monthly active users - 100 billion pins - 2 billion boards
  15. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest the world's

    first visual discovery engine - 175 million monthly active users - 100 billion pins - 2 billion boards - 2 billion searches every month
  16. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest the world's

    first visual discovery engine - 175 million monthly active users - 100 billion pins - 2 billion boards - 2 billion searches every month - 150,000 requests served per second
  17. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest Monitoring the

    world's first greatest monitoring team
  18. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest Monitoring the

    world's first greatest monitoring team - Graphite, OpenTSDB, Kafka, Storm, Spark, ELK, Sumo Logic, Zipkin
  19. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest Monitoring the

    world's first greatest monitoring team - Graphite, OpenTSDB, Kafka, Storm, Spark, ELK, Sumo Logic, Zipkin - 100 terabytes logged per day
  20. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest Monitoring the

    world's first greatest monitoring team - Graphite, OpenTSDB, Kafka, Storm, Spark, ELK, Sumo Logic, Zipkin - 100 terabytes logged per day - 2.5M metrics ingested per second
  21. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest Monitoring the

    world's first greatest monitoring team - Graphite, OpenTSDB, Kafka, Storm, Spark, ELK, Sumo Logic, Zipkin - 100 terabytes logged per day - 2.5M metrics ingested per second - Over 400 engineers
  22. Amy Nguyen @amyngyn Monitorama PDX 2017 About Pinterest Monitoring the

    world's first greatest monitoring team - Graphite, OpenTSDB, Kafka, Storm, Spark, ELK, Sumo Logic, Zipkin - 100 terabytes logged per day - 2.5M metrics ingested per second - Over 400 engineers customers!
  23. Amy Nguyen @amyngyn Monitorama PDX 2017 Our Tools: Dashboards

  24. Amy Nguyen @amyngyn Monitorama PDX 2017 Our Tools: Graph Exploration

  25. Amy Nguyen @amyngyn Monitorama PDX 2017 Our Tools: Alerting

  26. Amy Nguyen @amyngyn Monitorama PDX 2017 Agenda - Background -

    Motivation - Sharing what you know - Designing what your users want - Recap
  27. Amy Nguyen @amyngyn Monitorama PDX 2017 Why should we care

    about user experience?
  28. Amy Nguyen @amyngyn Monitorama PDX 2017

  29. Amy Nguyen @amyngyn Monitorama PDX 2017

  30. Amy Nguyen @amyngyn Monitorama PDX 2017

  31. Amy Nguyen @amyngyn Monitorama PDX 2017 Why should we care

    about user experience?
  32. Amy Nguyen @amyngyn Monitorama PDX 2017 Why should we care

    about user experience? - Prevent misunderstandings: not everyone is (or should have to be) an expert at interpreting monitoring data
  33. Amy Nguyen @amyngyn Monitorama PDX 2017 Why should we care

    about user experience? - Prevent misunderstandings: not everyone is (or should have to be) an expert at interpreting monitoring data - Developer velocity: help people reach conclusions faster, help your company move faster
  34. Amy Nguyen @amyngyn Monitorama PDX 2017 Why should we care

    about user experience? - Prevent misunderstandings: not everyone is (or should have to be) an expert at interpreting monitoring data - Developer velocity: help people reach conclusions faster, help your company move faster - Data democracy: you don't know what questions people want to answer with their own data
  35. Amy Nguyen @amyngyn Monitorama PDX 2017 "The fastest way to

    become a 10x engineer is to help 10 other engineers do their jobs better. - Wayne Gretzky" - Michael Scott
  36. Amy Nguyen @amyngyn Monitorama PDX 2017 Why should we care

    about user experience? - Prevent misunderstandings: not everyone is (or should have to be) an expert at interpreting timeseries data - Developer velocity: help people reach conclusions faster, help your company move faster - Data democracy: you don't know what questions people want to answer with their own data
  38. Amy Nguyen @amyngyn Monitorama PDX 2017 Why should we care

    about user experience? Because we can help engineers work correctly, quickly, and independently.
  39. Amy Nguyen @amyngyn Monitorama PDX 2017 UX and Your Situation

  40. Amy Nguyen @amyngyn Monitorama PDX 2017 UX and Your Situation

    team
  41. Amy Nguyen @amyngyn Monitorama PDX 2017 UX and Your Situation

    team documentation
  42. Amy Nguyen @amyngyn Monitorama PDX 2017 UX and Your Situation

    team documentation tools
  43. Amy Nguyen @amyngyn Monitorama PDX 2017 UX and Your Situation

    team documentation tools probably paying for vendors TBH?
  44. Amy Nguyen @amyngyn Monitorama PDX 2017 UX and Your Situation

    team documentation tools probably paying for vendors TBH? this talk
  45. Amy Nguyen @amyngyn Monitorama PDX 2017 UX and Your Situation

    team documentation tools probably paying for vendors TBH? this talk things we can all control
  46. Amy Nguyen @amyngyn Monitorama PDX 2017 Agenda - Background -

    Motivation - Sharing what you know - Designing what your users want - Recap
  47. Amy Nguyen @amyngyn Monitorama PDX 2017 Sharing what you know

  48. Amy Nguyen @amyngyn Monitorama PDX 2017 Sharing what you know

    1. Education vs intuition: Don't overload people with too much information.
  49. Amy Nguyen @amyngyn Monitorama PDX 2017 will the wifi work?

    who knows?
  50. Amy Nguyen @amyngyn Monitorama PDX 2017 stats.example.metric.errors > 5

  51. Amy Nguyen @amyngyn Monitorama PDX 2017

  52. Amy Nguyen @amyngyn Monitorama PDX 2017

  53. Amy Nguyen @amyngyn Monitorama PDX 2017

  54. Amy Nguyen @amyngyn Monitorama PDX 2017 Sharing what you know

    1. Education vs intuition: Don't overload people with too much information. 2. Best practices: Use your expertise to determine the most helpful default behavior.
  55. Amy Nguyen @amyngyn Monitorama PDX 2017

  56. Amy Nguyen @amyngyn Monitorama PDX 2017 Sharing what you know

    1. Education vs intuition: Don't overload people with too much information. 2. Best practices: Use your expertise to determine the most helpful default behavior. 3. Potential pitfalls: Make it hard to do the wrong thing.
  57. Amy Nguyen @amyngyn Monitorama PDX 2017

  58. Amy Nguyen @amyngyn Monitorama PDX 2017 It looks like you're

    trying to alert on the most recent data. Are you sure you want to do that?
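
The "make it hard to do the wrong thing" idea behind this prompt can be sketched as a check that runs when someone saves an alert. Everything below is an assumption for illustration: the `AlertRule` fields, the `SETTLE_DELAY_S` value, and the premise that the newest datapoints may still be incomplete because not every host has reported yet. The slides do not describe how the real tool decides when to warn.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical: how long we wait before treating a time bucket as complete.
SETTLE_DELAY_S = 120


@dataclass
class AlertRule:
    expression: str            # e.g. "stats.example.metric.errors > 5"
    window_end_offset_s: int   # seconds before "now" at which the evaluated window ends


def pitfall_warning(rule: AlertRule, settle_delay_s: int = SETTLE_DELAY_S) -> Optional[str]:
    """Return a confirmation prompt if the rule evaluates data that may still be arriving."""
    if rule.window_end_offset_s < settle_delay_s:
        return (
            "It looks like you're trying to alert on the most recent data. "
            "Are you sure you want to do that?"
        )
    return None


risky = AlertRule("stats.example.metric.errors > 5", window_end_offset_s=0)
safe = AlertRule("stats.example.metric.errors > 5", window_end_offset_s=300)
print(pitfall_warning(risky))
print(pitfall_warning(safe))
```

Returning a prompt rather than rejecting the rule keeps the choice with the user while still putting a small obstacle in front of the risky option.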
  59. Amy Nguyen @amyngyn Monitorama PDX 2017 Agenda - Background -

    Motivation - Sharing what you know - Designing what your users want - Recap
  60. Amy Nguyen @amyngyn Monitorama PDX 2017 Designing what your users

    want 1. Performance: Do whatever it takes to make it fast.
  61. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

  62. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

    - Backend
  63. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

    - Backend - Roll-up data over long time ranges
  64. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

    - Backend - Roll-up data over long time ranges - Store latest data in memory (e.g., Facebook's Gorilla paper and Beringei project)
  65. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

    - Backend - Roll-up data over long time ranges - Store latest data in memory (e.g., Facebook's Gorilla paper and Beringei project) - Add a cache layer (e.g., Turn's Splicer project)
  66. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

    - Backend - Roll-up data over long time ranges - Store latest data in memory (e.g., Facebook's Gorilla paper and Beringei project) - Add a cache layer (e.g., Turn's Splicer project) yeah sure get back to us in 6 months
  67. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

    - Backend - Roll-up data over long time ranges - Store latest data in memory (e.g., Facebook's Gorilla paper and Beringei project) - Add a cache layer (e.g., Turn's Splicer project) - Frontend
  68. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

    - Backend - Roll-up data over long time ranges - Store latest data in memory (e.g., Facebook's Gorilla paper and Beringei project) - Add a cache layer (e.g., Turn's Splicer project) - Frontend - Don't reload existing data if the user changes the time window
  69. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

    - Backend - Roll-up data over long time ranges - Store latest data in memory (e.g., Facebook's Gorilla paper and Beringei project) - Add a cache layer (e.g., Turn's Splicer project) - Frontend - Don't reload existing data if the user changes the time window - Prevent the user from requesting the data incessantly
  70. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

    - Backend - Roll-up data over long time ranges - Store latest data in memory (e.g., Facebook's Gorilla paper and Beringei project) - Add a cache layer (e.g., Turn's Splicer project) - Frontend - Don't reload existing data if the user changes the time window - Prevent the user from requesting the data incessantly - Lazy-load graphs on a dashboard
  71. Amy Nguyen @amyngyn Monitorama PDX 2017 Performance: Low Hanging Fruit!

    - Backend - Roll-up data over long time ranges - Store latest data in memory (e.g., Facebook's Gorilla paper and Beringei project) - Add a cache layer (e.g., Turn's Splicer project) - Frontend - Don't reload existing data if the user changes the time window - Prevent the user from requesting the data incessantly - Lazy-load graphs on a dashboard - Disclaimer: We haven't done all of these things.
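
Two items from this list lend themselves to a short sketch: rolling up data on the backend and not reloading existing data when the time window changes on the frontend. The function names, the choice of averaging as the roll-up function, and the half-open range convention are assumptions made for illustration. This is not the implementation described in the talk.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, float]   # (unix timestamp in seconds, value)
Range = Tuple[int, int]     # half-open interval [start, end) in seconds


def rollup(points: List[Point], bucket_s: int) -> List[Point]:
    """Average raw points into fixed-width buckets keyed by bucket start time."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for ts, value in points:
        bucket = ts - ts % bucket_s
        sums[bucket] = sums.get(bucket, 0.0) + value
        counts[bucket] = counts.get(bucket, 0) + 1
    return [(b, sums[b] / counts[b]) for b in sorted(sums)]


def missing_ranges(cached: Range, requested: Range) -> List[Range]:
    """Return the parts of `requested` that are not already covered by `cached`."""
    c_start, c_end = cached
    r_start, r_end = requested
    if r_end <= c_start or r_start >= c_end:
        return [requested]          # no overlap, fetch everything
    gaps: List[Range] = []
    if r_start < c_start:
        gaps.append((r_start, c_start))   # user extended the window to the left
    if r_end > c_end:
        gaps.append((c_end, r_end))       # user extended the window to the right
    return gaps


print(rollup([(0, 1.0), (30, 3.0), (60, 5.0)], bucket_s=60))
print(missing_ranges(cached=(100, 200), requested=(50, 250)))
```

A graph that already holds the range 100 to 200 and is widened to 50 to 250 only needs the two edge ranges, and zooming in to a sub-range needs no request at all.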
  72. Amy Nguyen @amyngyn Monitorama PDX 2017 Designing what your users

    want 1. Performance: Do whatever it takes to make it fast. 2. Exploration: Make it easy to try things without fear.
  73. Amy Nguyen @amyngyn Monitorama PDX 2017

  74. Amy Nguyen @amyngyn Monitorama PDX 2017

  75. Amy Nguyen @amyngyn Monitorama PDX 2017

  76. Amy Nguyen @amyngyn Monitorama PDX 2017

  77. Amy Nguyen @amyngyn Monitorama PDX 2017 Designing what your users

    want 1. Performance: Do whatever it takes to make it fast. 2. Exploration: Make it easy to try things without fear. 3. Simplicity: Make it easy to figure out what to do.
  78. Amy Nguyen @amyngyn Monitorama PDX 2017

  79. Amy Nguyen @amyngyn Monitorama PDX 2017

  80. Amy Nguyen @amyngyn Monitorama PDX 2017

  81. Amy Nguyen @amyngyn Monitorama PDX 2017 manually type in your

    metric if you know the name somehow
  82. Amy Nguyen @amyngyn Monitorama PDX 2017 manually type in your

    metric if you know the name somehow aggregator
  83. Amy Nguyen @amyngyn Monitorama PDX 2017 manually type in your

    metric if you know the name somehow aggregator hope you remember what tags are available for this metric lol
  85. Amy Nguyen @amyngyn Monitorama PDX 2017 manually type in your

    metric if you know the name somehow aggregator secret bonus: you can downsample?! hope you remember what tags are available for this metric lol
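
The fields annotated on this screen (metric, aggregator, downsample, tags) map onto the parts of an OpenTSDB metric query string. The sketch below shows how they compose. It assumes the form sits in front of OpenTSDB, which the slides list among the team's tools but do not confirm for this screen. The helper name, the tag values, and the reuse of the metric name from the earlier slide are hypothetical.

```python
from typing import Dict, Optional


def build_metric_query(
    metric: str,
    aggregator: str = "sum",
    downsample: Optional[str] = None,
    tags: Optional[Dict[str, str]] = None,
) -> str:
    """Compose the value of the `m` parameter for an OpenTSDB /api/query request."""
    parts = [aggregator]
    if downsample:
        parts.append(downsample)      # e.g. "1m-avg": one-minute buckets, averaged
    parts.append(metric)
    query = ":".join(parts)
    if tags:
        tag_filter = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
        query += "{" + tag_filter + "}"
    return query


query = build_metric_query(
    "stats.example.metric.errors",
    aggregator="sum",
    downsample="1m-avg",
    tags={"host": "web01"},
)
print(query)
```

Seeing the four inputs collapse into one string suggests why a form that asks users to remember each piece by heart is hard to use, and why sensible defaults for the aggregator and downsample fields help.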
  86. Amy Nguyen @amyngyn Monitorama PDX 2017

  87. Amy Nguyen @amyngyn Monitorama PDX 2017

  88. Amy Nguyen @amyngyn Monitorama PDX 2017

  89. Amy Nguyen @amyngyn Monitorama PDX 2017 available options you probably

    don't need to touch
  90. Amy Nguyen @amyngyn Monitorama PDX 2017 manual entry for power

    users
  91. Amy Nguyen @amyngyn Monitorama PDX 2017 relevant information needed to

    create a query!
  92. Amy Nguyen @amyngyn Monitorama PDX 2017 Designing what your users

    want 1. Performance: Do whatever it takes to make it fast. 2. Exploration: Make it easy to try things without fear. 3. Simplicity: Make it easy to figure out what to do.
  93. Amy Nguyen @amyngyn Monitorama PDX 2017 Agenda - Background -

    Motivation - Sharing what you know - Designing what your users want - Recap
  94. Amy Nguyen @amyngyn Monitorama PDX 2017 "The fastest way to

    become a 10x engineer is to help 10 other engineers do their jobs better. - Wayne Gretzky" - Michael Scott
  95. Amy Nguyen @amyngyn Monitorama PDX 2017 UX and Your Situation

    team documentation tools probably paying for vendors TBH? this talk things we can all control
  96. Amy Nguyen @amyngyn Monitorama PDX 2017 Sharing what you know

    1. Education vs intuition: Don't overload people with too much information. 2. Best practices: Use your expertise to determine the most helpful default behavior. 3. Potential pitfalls: Make it hard to do the wrong thing.
  97. Amy Nguyen @amyngyn Monitorama PDX 2017 Designing what your users

    want 1. Performance: Do whatever it takes to make it fast. 2. Exploration: Make it easy to try things without fear. 3. Simplicity: Make it easy to figure out what to do.
  98. Amy Nguyen @amyngyn Monitorama PDX 2017 Thanks!