Observability For Serverless Workloads

Ara
February 21, 2020

What is observability? How can we improve observability when it comes to serverless workloads?

Transcript

  1. Observability for Serverless Workloads Ara Pulido @arapulido

  2. About me: Developer Advocate at Datadog; @arapulido at Twitter; ara.pulido@datadoghq.com

  3. Observability

  4. Observability, Microservices, Containers, FaaS, Serverless, CI/CD, ML, Big Data

  5. A bit of history

  6. Monoliths

  7. Different user expectations

  8. Infrastructure monitoring

  9. Oops!

  10. Microservices

  11. Infrastructure monitoring is no longer enough

  12. Observability: A measure of how well internal states of a system can be inferred from knowledge of its external outputs

  13. Logs Metrics Tracing

  14. Logs, Metrics, Tracing, Events, User/Browser tests, Data visualization, ML, Queries

  15. Serverless

  16. Serverless: NO ACCESS TO THE UNDERLYING OS; CHARGED BY EXECUTION TIME / MEMORY ALLOCATED; IN MANY CLOUDS, RUNTIMES ARE A BIT OF A BLACKBOX

  17. Logs, Metrics, Tracing. Demo application: https://dtdg.co/faas-sample

  18. Demo application: 1 face - No duplicates; No faces; > 1 face; Duplicated

  19. Demo application: Search Faces, Web app, Index Faces, Persist Data, Detect Faces

  20. Demo application: Detect Faces, Search Faces, Web app, Index Faces, Persist Data

  21. Demo application: Search Faces, Web app, Index Faces, Persist Data, Detect Faces

  22. Demo application: Search Faces, Web app, Index Faces, Persist Data, Detect Faces

  23. Demo application: Search Faces, Web app, Index Faces, Persist Data, Detect Faces

  24. Demo application: Search Faces, Web app, Index Faces, Persist Data, Detect Faces

  25. Logs Metrics Tracing

  26. Logs: THE PLATFORM LOGS STDOUT, STDERR; LOG AS MUCH AS POSSIBLE (PART OF THE PLATFORM); USE LOG FORWARDERS TO COLLECT THEM ELSEWHERE; USE YOUR LANGUAGE LOGGING LIBRARY TO DO MORE COMPLEX STUFF

  27. Logs: Cloud Log System, Log Forwarder, Log system

  28. Logs

  29. Logs

  30. Logs Metrics Tracing

  31. Collecting Metrics

  32. Collecting metrics: Send batch, Send batch, Send batch

  33. Collecting metrics: Cold start, Cold start, Execution context

  34. Collecting metrics: Cold start, Cold start, Execution context. EXECUTION CONTEXT SHARES DISK ACROSS INVOCATIONS

  35. Collecting metrics: Cold start, Cold start, Send batch, Execution context. EXECUTION CONTEXT SHARES DISK ACROSS INVOCATIONS

  36. Collecting metrics: Cold start, Cold start, Send batch, Send batch, Execution context. EXECUTION CONTEXT SHARES DISK ACROSS INVOCATIONS; WE CANNOT KNOW WHEN IT IS GOING TO BE GARBAGE COLLECTED

  37. Metrics in Logs: Cloud Log System, Log Forwarder, Metrics

  38. Metrics in Logs: Cloud Log System, Log Forwarder, Metrics

  39. Infrastructure Metrics

  40. 4 golden signals: LATENCY, TRAFFIC, SATURATION, ERRORS

  41. 4 golden signals: DURATION, INVOCATIONS, THROTTLE, ERRORS

  42. Serverless specific: COLD STARTS, WARM STARTS

  43. Serverless specific: BILLED DURATION - DURATION

  44. Use metrics to save $$$ (and time): ESTIMATED COST PER FUNCTION

  45. Use metrics to save $$$ (and time): 128MB, 700ms = $ 0.000001465; 192MB, 500ms = $ 0.000001565; 320MB, 300ms = $ 0.000001563

  46. Business Metrics

  47. Business metrics: EACH FUNCTION IS A POTENTIAL BUSINESS METRIC; GATHER THOSE AND USE THEM TO IMPROVE YOUR BUSINESS

  48. Business metrics

  49. Logs Metrics Tracing

  50. None
  51. S1 S2 S3 S4 S5

  52. S1 S2 S3 S4 S5 TRACE SPANS

  53. Instrumentation with traces

  54. No OS / No agent

  55. No OS / No agent: Use cloud specific libraries for your functions

  56. X-Ray traces

  57. No OS / No agent: Use cloud specific libraries for your functions … but make sure you don’t break your current traces

  58. Demo application: Search Faces, Web app, Index Faces, Persist Data, Detect Faces

  59. Full app trace

  60. Full app trace

  61. Full app trace

  62. Full app trace

  63. Full app trace

  64. Full app trace

  65. No OS / No agent: Use cloud specific libraries for your functions … but make sure you don’t break your current traces or use logs!!

  66. Logs Metrics Tracing

  67. Takeaways

  68. USE LOGS FOR METRICS TO AVOID LOSING DATA; ADD YOUR FUNCTIONS TO YOUR CURRENT TRACES (if possible); LOGS ARE CHEAP. LOG A LOT; TRACK THOSE BUSINESS METRICS

  69. Thank you! (we are hiring!) Ara Pulido @arapulido. Demo app: https://dtdg.co/faas-sample
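
The "Metrics in Logs" idea from the transcript (write metrics to stdout and let a log forwarder turn them back into metrics, so nothing is lost when an execution context disappears) can be sketched roughly as follows. This is only an illustration: the JSON line format, the metric name `faces.detected`, the tag, and the functions `emit_metric`, `parse_metric` and `handler` are all hypothetical and are not taken from the talk or from any vendor's library.

```python
import json
import sys
import time


def emit_metric(name, value, tags=None, stream=sys.stdout, now=None):
    """Write one metric as a single JSON log line (hypothetical format)."""
    record = {
        "metric": name,
        "value": value,
        "timestamp": int(now if now is not None else time.time()),
        "tags": sorted(tags or []),
    }
    line = json.dumps(record, sort_keys=True)
    # The platform captures stdout, so the metric survives even if the
    # execution context is discarded right after this invocation.
    print(line, file=stream, flush=True)
    return line


def parse_metric(line):
    """What a log forwarder could do: recover a metric from a log line, or None."""
    try:
        record = json.loads(line)
    except ValueError:
        return None  # ordinary, non-JSON log line
    if not isinstance(record, dict) or "metric" not in record:
        return None  # JSON, but not one of our metric records
    return record


def handler(event, context):
    # Hypothetical function body: count faces and report it as a business metric.
    faces = len(event.get("faces", []))
    emit_metric("faces.detected", faces, tags=["function:detect-faces"])
    return {"faces": faces}
```

Because the metric is written synchronously during the invocation, there is no in-memory batch that could be lost; the trade-off is that the forwarder has to parse every log line.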
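
The cost comparison on the "Use metrics to save $$$" slide and the "BILLED DURATION - DURATION" metric can be approximated with a few lines of arithmetic. The sketch below assumes AWS Lambda's public pricing as of early 2020 (about $0.0000166667 per GB-second, billed in 100 ms increments) and ignores the per-request charge; these constants and the function names are assumptions added here, not figures stated in the talk.

```python
import math

# Assumption: on-demand compute price per GB-second (AWS Lambda, early 2020).
PRICE_PER_GB_SECOND = 0.0000166667
# Assumption: duration was billed in 100 ms increments at the time.
BILLING_INCREMENT_MS = 100


def billed_duration_ms(duration_ms):
    """Round the measured duration up to the next billing increment."""
    return math.ceil(duration_ms / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS


def wasted_ms(duration_ms):
    """The BILLED DURATION - DURATION metric: time paid for but not used."""
    return billed_duration_ms(duration_ms) - duration_ms


def invocation_cost(memory_mb, duration_ms):
    """Estimated compute cost in dollars for one invocation."""
    gigabytes = memory_mb / 1024
    seconds = billed_duration_ms(duration_ms) / 1000
    return gigabytes * seconds * PRICE_PER_GB_SECOND


if __name__ == "__main__":
    # The three memory/duration pairs from the slide.
    for memory_mb, duration_ms in [(128, 700), (192, 500), (320, 300)]:
        cost = invocation_cost(memory_mb, duration_ms)
        print(f"{memory_mb}MB @ {duration_ms}ms: ${cost:.9f}")
```

Under these assumptions the results land close to the slide's figures, which illustrates the point being made: more memory can shorten the run enough that the cost per invocation barely changes.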
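
The cold start / warm start distinction relies on the execution context being reused across invocations. A common way to observe it from inside a function is a module-level flag, sketched below; the `handler` signature follows the usual Python FaaS shape, and the `start_type=` log format is a hypothetical choice made for this example.

```python
# Module-level state is initialised once per execution context (the cold start)
# and is then reused by every warm invocation that lands on the same context.
_is_cold_start = True


def handler(event, context):
    """Hypothetical handler that reports whether this invocation was cold or warm."""
    global _is_cold_start
    start_type = "cold" if _is_cold_start else "warm"
    _is_cold_start = False
    # Logging the value lets a log forwarder count cold and warm starts.
    print(f"start_type={start_type}")
    return {"start_type": start_type}
```

The first call in a fresh context reports `cold`; later calls in the same context report `warm` until the platform discards that context.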