Monitoring and Debugging Containers

JBD
December 04, 2018

Transcript

  1. @rakyll
    monitoring and debugging
    containerized systems
    Jaana B. Dogan, Google

  2. @rakyll
    me
    overly frustrated engineer
    15+ years in networking systems
    making systems more reliable

  3. @rakyll
    the new old monitoring?
    (maybe)

  4. @rakyll
    systems are growing...
    and you are not in control

  5. @rakyll
    bare metal
    kernel
    network stack
    cloud stack
    libraries
    frameworks
    your code

  6. @rakyll

  7. @rakyll
    complexity is inevitable

  8. @rakyll
    container

  9. @rakyll
    container

  10. @rakyll
    container container

  11. @rakyll
    container container

  12. @rakyll
    container container
    message queue

  13. @rakyll
    container container
    storage/database

  14. @rakyll
    container container
    load balancer
    location=us-west location=europe-central

  15. @rakyll
    host
    host
    container container
    load balancer

  16. @rakyll
    container container
    container
    container
    container
    orchestrated hot mess

  17. @rakyll
    areas of issues:
    - lack of locality
    - networking
    - scheduling
    - dependencies

  18. @rakyll
    bare metal
    kernel
    network stack
    cloud stack
    libraries
    frameworks
    your code

  19. @rakyll
    “my job is done here”

  20. @rakyll
    after going to production...
    1. monitor
    2. alert
    3. troubleshoot
    4. fix

  21. @rakyll

  22. @rakyll
    load balancer

  23. @rakyll
    load balancer
    critical path

  24. @rakyll
    discovering critical paths
    making them reliable then fast
    making them debuggable
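
One way to make "discovering critical paths" concrete is to walk a trace from its root and, at each step, follow the child span that finishes last. The sketch below is a simplified illustration under stated assumptions: the `Span` type, the `critical_path` helper, the span names, and all timings are hypothetical, and span names are assumed to be unique within the trace.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    """Hypothetical span record: one timed operation in a trace."""
    name: str
    parent: Optional[str]  # name of the parent span, None for the root
    start_ms: float
    end_ms: float


def critical_path(spans: List[Span]) -> List[str]:
    """Follow, from the root down, the child span that ends last."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        children.setdefault(span.parent, []).append(span)

    roots = children.get(None, [])
    if not roots:
        return []

    path: List[str] = []
    current: Optional[Span] = max(roots, key=lambda s: s.end_ms)
    while current is not None:
        path.append(current.name)
        kids = children.get(current.name, [])
        current = max(kids, key=lambda s: s.end_ms) if kids else None
    return path


# Made-up timings for illustration only.
TRACE = [
    Span("load balancer", None, 0, 120),
    Span("frontend", "load balancer", 5, 115),
    Span("auth", "frontend", 10, 30),
    Span("storage/database", "frontend", 35, 110),
]

print(" -> ".join(critical_path(TRACE)))
```

With these made-up timings the storage call, not the auth call, sits on the critical path, so it is the first candidate to make reliable and then fast.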

  25. @rakyll

  26. @rakyll
    Latency Numbers Every Programmer Should Know by Jeff Dean
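
A short sketch of why these numbers matter once a request crosses containers. The figures are the commonly quoted approximations from that list and vary widely with hardware and network; the helper names and the 10 ms budget are assumptions made for illustration.

```python
# Commonly quoted approximate figures, in nanoseconds. Real values vary.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "main memory reference": 100,
    "round trip within same datacenter": 500_000,
    "disk seek": 10_000_000,
    "packet CA -> Netherlands -> CA": 150_000_000,
}


def times_slower(slow: str, fast: str) -> float:
    """How many times longer `slow` takes than `fast`."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


def sequential_ops_in_budget(budget_ms: float, op: str) -> int:
    """How many back-to-back operations of kind `op` fit in a latency budget."""
    return int(budget_ms * 1_000_000 // LATENCY_NS[op])


ratio = times_slower("round trip within same datacenter", "main memory reference")
hops = sequential_ops_in_budget(10, "round trip within same datacenter")
print(f"datacenter round trip vs. memory reference: {ratio:.0f}x")
print(f"sequential in-datacenter round trips in a 10 ms budget: {hops}")
```

Under these approximations a call that used to be a memory access becomes thousands of times slower once it crosses the network, and only a couple of dozen sequential hops fit in a 10 ms budget.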

  27. @rakyll

  28. @rakyll
    ping pong
    pongservice:6996
    project: ping the pong server.
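
A minimal sketch of what such a ping client might look like, using only the Python standard library. The line-based `ping`/`pong` wire protocol, the handler, and the helper names are assumptions; the slide only names the address `pongservice:6996`. To stay runnable anywhere, the example starts a stand-in pong server on localhost instead of contacting the real service, and it times the call by hand rather than through any tracing library.

```python
import socket
import socketserver
import threading
import time


class PongHandler(socketserver.StreamRequestHandler):
    """Stand-in pong server: replies "pong" to a "ping" line (assumed protocol)."""

    def handle(self):
        if self.rfile.readline().strip() == b"ping":
            self.wfile.write(b"pong\n")


def start_local_pong_server() -> socketserver.ThreadingTCPServer:
    """Start the stand-in server on an ephemeral localhost port."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), PongHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ping(host: str, port: int, timeout: float = 1.0):
    """Send one ping; return the reply and the round-trip time in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"ping\n")
        reply = conn.makefile("rb").readline().strip()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return reply.decode(), elapsed_ms


server = start_local_pong_server()
host, port = server.server_address
reply, elapsed_ms = ping(host, port)
print(f"{reply} in {elapsed_ms:.2f} ms")
server.shutdown()
server.server_close()
```

Against the service from the slide, the call would be `ping("pongservice", 6996)`, assuming it speaks the same line protocol.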

  29. @rakyll
    opencensus.io

  30. @rakyll
    not my team!

  31. @rakyll
    where is the source code?

  32. @rakyll
    who to page?

  33. @rakyll
    who to page?

  34. @rakyll
    give me the logs, runtime
    events, profiles...

  35. @rakyll

  36. @rakyll

  37. @rakyll

  38. @rakyll
    http://server:9999/tracez

  39. @rakyll
    challenges...

  40. @rakyll
    no wire standards

  41. @rakyll

  42. @rakyll
    traceparent: <version>-<trace-id>-<parent-id>-<trace-flags>
    Example:
    traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
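
A sketch of how a receiving service might parse this header, assuming the four-field layout defined by the W3C Trace Context specification (version, trace-id, parent-id, trace-flags, all lowercase hex). The `TraceParent` type and `parse_traceparent` helper are hypothetical names. The parser is deliberately strict: it handles only the four-field layout and does not attempt the lenient handling that the specification describes for future versions.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool  # lowest bit of trace-flags


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed."""
    match = _TRACEPARENT.match(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


EXAMPLE = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"
print(parse_traceparent(EXAMPLE))
```

Returning `None` for a malformed header lets the caller start a fresh trace instead of propagating bad identifiers downstream.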

  43. @rakyll
    no export standards

  44. @rakyll
    areas of issues:
    - locality
    - networking
    - scheduling
    - dependencies

  45. @rakyll
    fin
