UX Design and Education for Effective Monitoring Tools (TechSummit Amsterdam 2017)

I, like many of us, chose to work in infrastructure because I thought it meant I could avoid talking to other people as much as possible. I was wrong. It turns out that a huge amount of work in monitoring is about empowering the rest of your engineering organization to use the tools you develop correctly, quickly, and effectively. We spend so much time explaining how to interpret timeseries data and why averaging percentiles is a bad idea. After feedback from our engineers, we embarked on a journey to redesign our internal monitoring tools and understand where people were struggling with the existing system.

In this talk, I will explain how we approached the problem of making concepts like interpolation, aggregation, and alerting more intuitive and how we identified pain points for new users. I will outline common misconceptions our users have about monitoring and how we cleared up this confusion in our UI without forcing everyone to spend hours on documentation. Rather than copying and pasting existing UX design principles onto our monitoring problems, we will see how we can reinterpret these ideas and apply them to our unique situation to create a better experience for everyone.
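
As a small illustration of the remark above that averaging percentiles is a bad idea, the sketch below compares the average of per-host p99 latencies with the p99 of the combined data. The host names, the latency values, and the nearest-rank `percentile` helper are invented for this example and do not come from the talk.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: a busy, fast host and a quiet, slow one.
latencies_by_host = {
    "host-a": [10] * 990 + [50] * 10,
    "host-b": [500] * 10,
}

per_host_p99 = {host: percentile(vals, 99) for host, vals in latencies_by_host.items()}
average_of_p99s = sum(per_host_p99.values()) / len(per_host_p99)

all_latencies = [v for vals in latencies_by_host.values() for v in vals]
true_p99 = percentile(all_latencies, 99)

print(per_host_p99)     # {'host-a': 10, 'host-b': 500}
print(average_of_p99s)  # 255.0
print(true_p99)         # 50
```

With this made-up data the averaged figure (255 ms) is far from the p99 of all requests (50 ms), because the average gives the quiet host the same weight as the busy one.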

Amy Nguyen

June 01, 2017

Transcript

  1. Amy Nguyen @amyngyn TechSummit Amsterdam 2017
    UX Design and Education for
    Effective Monitoring Tools

  2. Agenda
    - Background
    - Motivation
    - Sharing what you know
    - Designing what your users want
    - Recap

  11. Hi!
    - Engineer at Pinterest since 2015
    - Recent projects:
      ○ tracing for performance
      ○ D3 data visualizations
      ○ cache for OpenTSDB data
      ○ so much documentation omg
    - amynguyen.net
      @amyngyn
      pinterest.com/amyngyn

  17. About Pinterest
    the world's first visual discovery engine
    - 175 million monthly active users
    - 100 billion pins
    - 2 billion boards
    - 2 billion searches every month
    - 150,000 requests served per second

  23. About Pinterest Monitoring
    the world's first greatest monitoring team
    - Graphite, OpenTSDB, Kafka, Storm, Spark, ELK, Sumo Logic, Zipkin
    - 100 terabytes logged per day
    - 2.5M metrics ingested per second
    - Over 400 engineers customers!

  24. Our Tools: Dashboards

  25. Our Tools: Graph Exploration

  26. Our Tools: Alerting

  28. Why should we care about user experience?

  34. Why should we care about user experience?
    - Prevent misunderstandings: not everyone is (or should have to be) an expert at interpreting monitoring data
    - Developer velocity: help people reach conclusions faster, help your company move faster
    - Data democracy: you don't know what questions people want to answer with their own data

  35. "The fastest way to become a 10x engineer is
    to help 10 other engineers do their jobs better.
    - Wayne Gretzky"
    - Michael Scott

  38. Why should we care about user experience?
    Because we can help engineers work correctly, quickly, and independently.

  45. UX and Your Situation
    team
    documentation
    tools
    probably paying for vendors TBH?
    this talk
    things we can all control

  48. Sharing what you know
    1. Education vs intuition: Don't overload people with too much information.

  49. stats.example.metric.errors > 5

  53. Sharing what you know
    2. Best practices: Use your expertise to determine the most helpful default behavior.

  55. Sharing what you know
    3. Potential pitfalls: Make it hard to do the wrong thing.

  57. It looks like you're trying to alert on the most recent data.
    Are you sure you want to do that?
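
One plausible reason for a warning like this (an assumption, since the slide text does not say) is that the newest points in a time series are often incomplete while some hosts are still reporting, so a rule evaluated on them can misfire. The sketch below evaluates a rule only on points older than a settling delay. The function names, the 60-second delay, the "traffic dropped" rule, and the sample data are all hypothetical.

```python
def settled_points(points, now, settle_seconds=60):
    """Keep only (timestamp, value) points at least settle_seconds old.

    Newer points are dropped because they may still receive late writes
    (an assumption of this sketch).
    """
    cutoff = now - settle_seconds
    return [(ts, value) for ts, value in points if ts <= cutoff]


def breaches(points, is_bad):
    """Return the points whose value violates the rule."""
    return [(ts, value) for ts, value in points if is_bad(value)]


def too_low(value):
    """Hypothetical 'traffic dropped' rule: fewer than 100 requests per bucket."""
    return value < 100


now = 1_000_000
# Hypothetical request counts per 30-second bucket. Only some hosts have
# reported for the newest bucket, so its total looks like a traffic drop.
requests = [(now - 120, 480), (now - 90, 510), (now - 60, 495), (now - 30, 470), (now, 35)]

naive_breaches = breaches(requests, too_low)
careful_breaches = breaches(settled_points(requests, now), too_low)

print(naive_breaches)    # [(1000000, 35)]
print(careful_breaches)  # []
```

The trade-off is that alerts fire later by roughly the settling delay, so the delay should be no longer than the data actually needs to fill in.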

  59. Designing what your users want
    1. Performance: Do whatever it takes to make it fast.

  65. Performance: Low Hanging Fruit!
    - Backend
      - Roll-up data over long time ranges
      - Store latest data in memory (e.g., Facebook's Gorilla paper and Beringei project)
      - Add a cache layer (e.g., Turn's Splicer project)
    yeah sure get back to us in 6 months
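
To make the roll-up item concrete, here is a minimal sketch of averaging raw points into coarser buckets so that queries over long time ranges return fewer points. The bucket width, the choice of averaging, and the sample data are assumptions for illustration and do not describe Pinterest's storage layer.

```python
from collections import defaultdict


def roll_up(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Returns a list of (bucket_start, average) pairs sorted by bucket_start.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - ts % bucket_seconds
        buckets[bucket_start].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Hypothetical raw data: one point every 10 seconds for two minutes.
raw = [(t, float(t // 10)) for t in range(0, 120, 10)]
rolled = roll_up(raw, bucket_seconds=60)

print(len(raw))  # 12
print(rolled)    # [(0, 2.5), (60, 8.5)]
```

Averaging suits gauge-like metrics. Counters would more naturally be summed, and stored percentiles should not be rolled up this way at all.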

  70. Performance: Low Hanging Fruit!
    - Backend
      - Roll-up data over long time ranges
      - Store latest data in memory (e.g., Facebook's Gorilla paper and Beringei project)
      - Add a cache layer (e.g., Turn's Splicer project)
    - Frontend
      - Don't reload existing data if the user changes the time window
      - Prevent the user from requesting the data incessantly
      - Lazy-load graphs on a dashboard
    - Disclaimer: We haven't done all of these things.
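
The first two frontend items can be sketched as follows: fetch only the part of a new time window that is not already loaded, and refuse refreshes that arrive too soon after the previous one. This is written in Python for brevity, whereas a real dashboard would do this in browser code. The names `missing_ranges` and `RefreshThrottle`, the single-cached-range simplification, and the 10-second interval are hypothetical.

```python
def missing_ranges(cached, requested):
    """Return the sub-ranges of `requested` not covered by `cached`.

    Both are (start, end) tuples with start < end; `cached` may be None.
    Assumes one contiguous cached range, which keeps the sketch short.
    """
    if cached is None:
        return [requested]
    c_start, c_end = cached
    r_start, r_end = requested
    if r_end <= c_start or r_start >= c_end:
        return [requested]  # no overlap, so everything must be fetched
    gaps = []
    if r_start < c_start:
        gaps.append((r_start, c_start))
    if r_end > c_end:
        gaps.append((c_end, r_end))
    return gaps


class RefreshThrottle:
    """Allow a refresh only if min_interval seconds have passed since the last one."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last_refresh = None

    def allow(self, now):
        if self.last_refresh is not None and now - self.last_refresh < self.min_interval:
            return False
        self.last_refresh = now
        return True


# The user widens a cached 100-200 window to 50-250: only the edges are fetched.
to_fetch = missing_ranges(cached=(100, 200), requested=(50, 250))

# Refresh clicks at t=0, 5, and 10 seconds with a 10-second minimum interval.
throttle = RefreshThrottle(min_interval=10)
decisions = [throttle.allow(t) for t in (0, 5, 10)]

print(to_fetch)   # [(50, 100), (200, 250)]
print(decisions)  # [True, False, True]
```

Zooming in to a window that is already fully loaded returns an empty list, so no request is made at all.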

  71. Designing what your users want
    2. Exploration: Make it easy to try things without fear.

  76. Designing what your users want
    3. Simplicity: Make it easy to figure out what to do.

  84. manually type in your metric if you know the name somehow
    aggregator
    secret bonus: you can downsample?!
    hope you remember what tags are available for this metric lol

  88. available options you probably don't need to touch

  89. manual entry for power users

  90. relevant information needed to create a query!
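
One way to read these annotations is that a query form should ask only for what is needed and supply defaults for the rest. The sketch below builds a query expression from explicit fields. The expression layout follows the shape of OpenTSDB's `m` query parameter (aggregator, optional downsampler, metric, optional tags). That the screenshots show an OpenTSDB-backed tool is an assumption, and the class name, the defaults, and the example tag are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetricQuery:
    metric: str              # the one field the user must supply
    aggregator: str = "sum"  # hypothetical default
    downsample: str = ""     # e.g. "5m-avg"; empty means no downsampling
    tags: dict = field(default_factory=dict)

    def to_expression(self):
        """Render the query as aggregator[:downsample]:metric[{tag=value,...}]."""
        parts = [self.aggregator]
        if self.downsample:
            parts.append(self.downsample)
        expr = ":".join(parts) + ":" + self.metric
        if self.tags:
            tag_text = ",".join(f"{k}={v}" for k, v in sorted(self.tags.items()))
            expr += "{" + tag_text + "}"
        return expr


# A new user supplies only the metric name; everything else is defaulted.
simple = MetricQuery("stats.example.metric.errors").to_expression()

# A power user overrides every field.
detailed = MetricQuery(
    "stats.example.metric.errors",
    aggregator="max",
    downsample="5m-avg",
    tags={"host": "web01"},
).to_expression()

print(simple)    # sum:stats.example.metric.errors
print(detailed)  # max:5m-avg:stats.example.metric.errors{host=web01}
```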

  95. Sharing what you know
    1. Education vs intuition: Don't overload people with too much information.
    2. Best practices: Use your expertise to determine the most helpful default behavior.
    3. Potential pitfalls: Make it hard to do the wrong thing.

  96. Designing what your users want
    1. Performance: Do whatever it takes to make it fast.
    2. Exploration: Make it easy to try things without fear.
    3. Simplicity: Make it easy to figure out what to do.

  97. Thanks!