Map & Territory: A story of visibility

Pierre-Yves Ritschard

April 19, 2013

Transcript

  1. Map & Territory
    a story of visibility

  2. Pierre-Yves
    @pyr
    https://github.com/pyr

  3. https://exoscale.ch

  4. How do we work?

  5. How do we
    improve?

  6. Avoid Shortcuts!

  7. We want lower
    defect rates

  8. We want to make
    informed decisions

  9. Design
    Build
    Live

  10. Extracting meaningful
    state data from
    heterogeneous event
    sources, over time

  11. Meaningful
    (relates to business value)

  12. State Data
    (structured payload)

  13. Heterogeneous
    (everyone is involved)

  14. Over time
    (tracking)

  15. How does it help
    my system's
    lifecycle?

  16. Map
    =/=
    Territory

  17. Break out of our
    mental model

  18. "I'll push this
    minor change, it
    cannot do any
    harm"

  19. "I'll just add this
    static route"

  20. Better lifecycle
    Informed decisions
    Better maps

  21. Systems are
    (increasingly)
    complex

  22. Web Infrastructure
    circa '00
    (2 servers)

  23. Visibility Circa '00

  24. Web Infrastructure
    circa '12
    (27 nodes)

  25. Visibility Circa '12

  26. Q: How is business
    doing today?
    A:

  27. Q: How is business
    doing today?
    A: Based on these
    key metrics, we're
    looking good

  28. Figure out those
    key metrics

  29. We need
    appropriate tooling

  30. Events across:
    system,
    components,
    software

  31. The event stream
    approach

  32. Plenty of small
    producers
    Few big consumers

  33. Production:
    Anything that
    happens or
    moves (logs
    too!):
    Normalize &
    Stream
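
As an illustration of "Normalize & Stream", here is a minimal sketch, not the speaker's implementation. It wraps a raw log line and a numeric measurement in one common event shape and writes them out as one JSON object per line. The `Event` fields and the helper names (`from_log_line`, `from_measurement`, `stream`) are assumptions made for this example.

```python
import io
import json
import time
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical common shape shared by every producer."""
    host: str
    service: str
    time: float
    metric: Optional[float] = None
    description: str = ""
    tags: List[str] = field(default_factory=list)


def from_log_line(host, line, now=None):
    """Wrap a raw log line as an event; logs carry no numeric metric."""
    ts = time.time() if now is None else now
    return Event(host=host, service="log", time=ts,
                 description=line.strip(), tags=["log"])


def from_measurement(host, service, value, now=None):
    """Wrap a numeric measurement (CPU, request count, ...) as an event."""
    ts = time.time() if now is None else now
    return Event(host=host, service=service, time=ts,
                 metric=float(value), tags=["metric"])


def stream(events, out):
    """Write events to a file-like object, one JSON object per line."""
    for event in events:
        out.write(json.dumps(asdict(event)) + "\n")


if __name__ == "__main__":
    buf = io.StringIO()  # stands in for a socket or message queue
    stream([from_log_line("web1", "GET / 200\n"),
            from_measurement("web1", "cpu", 42)], buf)
    print(buf.getvalue(), end="")
```

Because both kinds of source end up in the same structure, a consumer can treat logs and metrics uniformly.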

  34. Consumption:
    Aggregate
    Correlate
    Decide

  35. Aggregation
    compute compound
    metrics (ratios, sums)
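
A compound metric of the kind named on this slide could be computed as in the sketch below. It assumes events arrive as dictionaries with `service` and `metric` keys; the service names `http.requests` and `http.errors`, and the functions `sum_by_service` and `ratio`, are hypothetical.

```python
from collections import defaultdict


def sum_by_service(events):
    """Sum the metric of every event, grouped by service name."""
    totals = defaultdict(float)
    for event in events:
        totals[event["service"]] += event["metric"]
    return dict(totals)


def ratio(totals, numerator, denominator):
    """Return totals[numerator] / totals[denominator], or None if undefined."""
    denom = totals.get(denominator, 0.0)
    if denom == 0:
        return None  # no traffic yet, so the ratio has no meaning
    return totals.get(numerator, 0.0) / denom


# Events from two hosts, already normalized by the producers.
events = [
    {"host": "web1", "service": "http.requests", "metric": 120},
    {"host": "web2", "service": "http.requests", "metric": 80},
    {"host": "web1", "service": "http.errors", "metric": 3},
    {"host": "web2", "service": "http.errors", "metric": 1},
]

totals = sum_by_service(events)
error_rate = ratio(totals, "http.errors", "http.requests")
```

The sum spans hosts, and the ratio combines two sums into a single figure that is closer to a business-level metric than either raw counter.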

  36. Decision
    track, alert, ignore,
    scale
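
The four outcomes listed on this slide can be expressed as a small decision function. This is only a sketch: the inputs (`error_rate`, `load`) and the threshold values are assumptions chosen for illustration, not figures from the talk.

```python
from enum import Enum


class Action(Enum):
    TRACK = "track"
    ALERT = "alert"
    IGNORE = "ignore"
    SCALE = "scale"


def decide(error_rate, load, alert_error_rate=0.05, scale_load=0.8):
    """Map aggregated metrics to one action.

    error_rate: errors / requests, or None when there was no traffic.
    load: fraction of capacity in use, between 0.0 and 1.0.
    Both thresholds are hypothetical defaults.
    """
    if error_rate is None:
        return Action.IGNORE   # nothing to judge without traffic
    if error_rate >= alert_error_rate:
        return Action.ALERT    # quality problem: tell a human
    if load >= scale_load:
        return Action.SCALE    # healthy but busy: add capacity
    return Action.TRACK        # nominal: keep recording the trend
```

For example, `decide(0.02, 0.9)` yields `Action.SCALE`, while `decide(0.10, 0.9)` yields `Action.ALERT` because the error check runs first.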

  37. Implementing
    On premise, SaaS, or in
    between?

  38. SaaS
    Loggly, Papertrail,
    Librato, Datadog, ...

  39. On Premise
    collectd, Logstash,
    Graphite, StatsD,
    Riemann

  40. The path to visibility:
    Find key metrics
    Find the right tools
    Rely on an event stream
    Involve everyone
    Challenge your mental model
    Hopefully, improve quality and lower
    defect rates in the process!
