Logging in the age of Microservices and the Cloud

Axel Fontaine
September 13, 2018

The days of the statically partitioned datacenter are over. Welcome to the modern world of microservices and auto-scaling in the cloud. Requests flow through multiple services, individual services are auto-scaled and machines are short-lived.

This is a brave new world and it is time to change the way we design and architect our software to better deal with it. In this talk we'll look at logging and take a deep dive into the challenges involved in moving from the old "SSH and tail -f" world to a world of centralized and structured logs, consumable by both humans and machines.

This session is for developers and architects looking for battle-tested solutions to implement effective logging for microservices in an auto-scaling world.

Transcript

  1. Logging in the age of Microservices and the Cloud @axelfontaine

  2. Axel Fontaine @axelfontaine flywaydb.org boxfuse.com

  3. POLL: what type of infrastructure are you running on? • On Premise • Cloud

  4. The (good) old days of logging …

  5. LOG file ssh me@myserver tail -f server.log

  6. Looks great!

  7. Thanks ! @axelfontaine boxfuse.com

  8. LOG file ssh me@myserver tail -f server.log

  9. Times have changed …

  10. The new reality Cloud Microservices

  11. But first, back to the fundamental question...

  12. Why are we logging? Postmortem analysis of user activity and programming errors. Powerful debugging tool. Should contain answers to important questions: What? Who? Where? When?

  13. What? Who? Where? When?

  14. What? Message, Code, Severity Who? Where? When?

  15. What? Message, Code, Severity. Who? Account, User, Session, Request. Where? When?

  16. What? Message, Code, Severity. Who? Account, User, Session, Request. Where? App, Module, Class. When?

  17. What? Message, Code, Severity. Who? Account, User, Session, Request. Where? App, Module, Class. When? Timestamp, Hostname, PID, Thread. How can these questions be asked? How can all this information be captured?

  18. Capturing log info

  19. Logging framework architecture: Your Code → Logger → Appender A → Storage A, Appender B → Storage B

  20. logger.info("my log message"); What? Message, Code, Severity. Who? Account, User, Session, Request. Where? App, Module, Class. When? Timestamp, Hostname, PID, Thread

  21. logger.info("my log message"); What? Message, Code, Severity. Who? Account, User, Session, Request. Where? App, Module, Class. When? Timestamp, Hostname, PID, Thread

  22. logger.info("my log message"); What? Message, Code, Severity. Who? Account, User, Session, Request. Where? App, Module, Class. When? Timestamp, Hostname, PID, Thread

  23. MDC: Mapped Diagnostic Context (thread-local temporary key-value store). Your Code → Logger (+ MDC) → Appender A → Storage A, Appender B → Storage B

  24. MDC.put("account", "company ABC"); MDC.put("user", "user123"); What? Message, Code, Severity. Who? Account, User, Session, Request. Where? App, Module, Class. When? Timestamp, Hostname, PID, Thread

  25. MDC.put("account", "company ABC"); MDC.put("user", "user123"); … logger.info("my log message"); What? Message, Code, Severity. Who? Account, User, Session, Request. Where? App, Module, Class. When? Timestamp, Hostname, PID, Thread

  26. MDC.put("account", "company ABC"); MDC.put("user", "user123"); Populate when: ✓ a request enters the application ✓ a message is received from a queue ✓ a cron task starts ✓ making an async call to another thread. And don’t forget to clear when done! (Thread pools reuse threads!)

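
The populate-then-clear discipline for the MDC can be sketched as follows. This is a minimal illustration and not code from the talk: `SketchMdc` is a hypothetical stand-in for a logging framework's MDC, built on a plain `ThreadLocal` so that the example runs with the JDK alone, and `MdcLifecycleSketch`, `handleRequest` and `pooledThreadIsCleanAfterRequest` are names invented for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a logging framework's MDC:
// a thread-local, temporary key-value store.
final class SketchMdc {
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) { CONTEXT.get().put(key, value); }
    static String get(String key) { return CONTEXT.get().get(key); }
    static void clear() { CONTEXT.get().clear(); }
}

final class MdcLifecycleSketch {
    // Populate the context when a request enters, and always clear it on the way out.
    static String handleRequest(String account, String user) {
        SketchMdc.put("account", account);
        SketchMdc.put("user", user);
        try {
            // A real appender would attach these values to every log entry.
            return "account=" + SketchMdc.get("account") + " user=" + SketchMdc.get("user");
        } finally {
            SketchMdc.clear(); // thread pools reuse threads
        }
    }

    // Runs a request on a single pooled thread, then checks that the next task
    // on that same thread starts with an empty context.
    static boolean pooledThreadIsCleanAfterRequest() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> request = () -> handleRequest("company ABC", "user123");
            Callable<Boolean> nextTask = () -> SketchMdc.get("account") == null;
            pool.submit(request).get();
            return pool.submit(nextTask).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("company ABC", "user123"));
        System.out.println(pooledThreadIsCleanAfterRequest());
    }
}
```

Clearing in a `finally` block is the design choice that matters here: without it, the next task scheduled on the same pooled thread would inherit the previous request's account and user.
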
  27. Querying the logs

  28. grep?

  29. Truncation! Compression! Single line messages! No MDC info!

  30. Decoupling log storage from log representation: Your Code → Logger (+ MDC) → Appender (FORMAT) → Storage (formatted) → Log Viewer (READ)

  31. Structured logging: Your Code → Logger (+ MDC) → Appender → Storage (raw) → Log Viewer (READ & FORMAT)

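
Storing raw, structured records and leaving the formatting to the viewer can be sketched with an appender-side serializer that writes one JSON object per log entry. This is a minimal sketch under stated assumptions: `StructuredLogSketch` and `toJsonLine` are hypothetical names, the JSON is assembled by hand so the example needs only the JDK (a real setup would more likely use a JSON library or the logging framework's own encoder), all values are assumed to be non-null strings, and the timestamp is passed in to keep the output deterministic.

```java
import java.util.Map;
import java.util.TreeMap;

final class StructuredLogSketch {
    // Merges the context (for example the MDC contents) with the core fields
    // and renders them as a single-line JSON object with sorted keys.
    static String toJsonLine(String level, String logger, String message,
                             String timestamp, Map<String, String> context) {
        Map<String, String> fields = new TreeMap<>(context);
        fields.put("level", level);
        fields.put("logger", logger);
        fields.put("message", message);
        fields.put("timestamp", timestamp);

        StringBuilder json = new StringBuilder("{");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (json.length() > 1) {
                json.append(',');
            }
            json.append(quote(field.getKey())).append(':').append(quote(field.getValue()));
        }
        return json.append('}').toString();
    }

    // Wraps a value in double quotes and escapes the characters JSON requires.
    private static String quote(String value) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"': out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        Map<String, String> context = new TreeMap<>();
        context.put("account", "axelfontaine");
        System.out.println(toJsonLine("INFO", "com.myapp.task.TaskService",
                "Successfully killed axelfontaine/demo in prod",
                "2017-05-12T10:20:30.444", context));
    }
}
```

Because newlines inside a message are escaped, each record stays on one line, which sidesteps the multi-line problem that plain-text logs have with line-oriented tools.
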
  32. Structured logging: { "account": "axelfontaine", "image": "axelfontaine/xyz:543", "instance": "i-0d843d5af9b366a69", "level": "INFO", "logger": "com.myapp.task.TaskService", "message": "Successfully killed axelfontaine/demo in prod", "request": "crq-7R2CVPUMKREUFLMQUE3XB7JWCX", "session": "cli-CRFM2IPABRFUJD7KTDYVDVXABX", "thread": "Thread-18710", "timestamp": "2017-05-12T10:20:30.444" }

  33. Cloud

  34. Capacity Cost

  35. Spare Capacity (paying for something you don’t use) = Wasted Money. Image: https://www.flickr.com/photos/timothykrause/5677858694/

  36. Scaling = alarms + corrective actions (scaling in or out)

  37. Auto Scaling = automated alarms + automated corrective actions (scaling in or out)

  38. Load Balancer; ssh me@myserver1 tail -f server.log; ssh me@myserver2 tail -f server.log; ssh me@myserver3 tail -f server.log; three LOG files; CPU Load; Scale Out; Scale In

  39. Load Balancer; ssh me@myserver1 tail -f server.log; ssh me@myserver2 tail -f server.log; ssh me@myserver3 tail -f server.log; ssh me@myserver4 tail -f server.log; four LOG files; Scale Out; Scale In; CPU Load

  40. Load Balancer; ssh me@myserver1 tail -f server.log; ssh me@myserver2 tail -f server.log; ssh me@myserver3 tail -f server.log; ssh me@myserver4 tail -f server.log; four LOG files; Scale Out; Scale In; CPU Load

  41. Load Balancer; ssh me@myserver1 tail -f server.log; ssh me@myserver3 tail -f server.log; ssh me@myserver4 tail -f server.log; four LOG files; Scale Out; Scale In; CPU Load; ssh me@myserver2 tail -f server.log

  42. Load Balancer; ssh me@myserver1 tail -f server.log; DATA LOSS; ssh me@myserver3 tail -f server.log; ssh me@myserver4 tail -f server.log; four LOG files; Scale Out; Scale In; CPU Load

  43. Load Balancer; three LOG files feeding a log server where logs can be ✓ aggregated ✓ stored and backed up ✓ indexed ✓ searched

  44. log server where logs can be ✓ aggregated ✓ stored and backed up ✓ indexed ✓ searched. Many options: • Logstash (ELK) • AWS CloudWatch Logs • Loggly • Papertrail • … Build or Buy? Buying is almost always the better option, unless you have truly extreme requirements (you probably don't)

  45. Appender or stdout. Appender: ✓ tightly integrated with logging framework ✓ in-process ✓ direct MDC access ✓ best for homogeneous environments. stdout: ✓ universal ✓ separate process ✓ ingests serialized data with record separator ✓ best for heterogeneous environments

  46. Log Retention (chart labels: Time, Cost, Value, Best Deal)

  47. Log Levels: DEBUG, INFO, WARNING, ERROR (chart axes: Importance, Detail). You want both when an important failure occurs! What is missing: High water mark filtering!

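
The slide names high water mark filtering without spelling out a mechanism, so the following is only one possible reading, not the speaker's design: buffer every entry for a unit of work, track the highest severity seen, and keep the detailed entries only if that high water mark reaches a trigger level. All names here (`HighWaterMarkSketch`, `normalThreshold`, `trigger`, `flush`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class HighWaterMarkSketch {
    enum Level { DEBUG, INFO, WARNING, ERROR }

    private static final class Entry {
        final Level level;
        final String message;

        Entry(Level level, String message) {
            this.level = level;
            this.message = message;
        }
    }

    private final Level normalThreshold; // kept even when nothing goes wrong
    private final Level trigger;         // severity that unlocks full detail
    private final List<Entry> buffer = new ArrayList<>();
    private Level highWaterMark = Level.DEBUG;

    HighWaterMarkSketch(Level normalThreshold, Level trigger) {
        this.normalThreshold = normalThreshold;
        this.trigger = trigger;
    }

    // Buffers the entry and raises the high water mark if needed.
    void log(Level level, String message) {
        buffer.add(new Entry(level, message));
        if (level.compareTo(highWaterMark) > 0) {
            highWaterMark = level;
        }
    }

    // Called when the unit of work (for example a request) ends.
    // Returns the entries that should actually be shipped, then resets.
    List<String> flush() {
        boolean keepEverything = highWaterMark.compareTo(trigger) >= 0;
        List<String> kept = new ArrayList<>();
        for (Entry entry : buffer) {
            if (keepEverything || entry.level.compareTo(normalThreshold) >= 0) {
                kept.add(entry.level + " " + entry.message);
            }
        }
        buffer.clear();
        highWaterMark = Level.DEBUG;
        return kept;
    }
}
```

With a normal threshold of WARNING and a trigger of ERROR, a request that only logs DEBUG and INFO ships nothing, while a request that hits an ERROR ships its DEBUG and INFO entries as well.
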
  48. Microservices

  49. POLL: what type of architecture does your software have? • Integrated (Monolith) • Distributed (Microservices)

  50. log server

  51. Querying across systems

  52. None
  53. None
  54. Propagating MDC: A → B → C. A: Create MDC (based on session) and assign unique request ID; Copy MDC to HTTP(S) headers. B: Read MDC HTTP(S) headers; Copy MDC to HTTP(S) headers. C: Read MDC HTTP(S) headers

  55. Propagating MDC: A → B → C, with a Filter on incoming requests (A, B, C) and a Decorator on outgoing calls (A, B). Two Implementation Options: • library (manual, precise control) • agent (automatic, risk of overreaching)

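
Propagating the MDC between services (create it at the edge, copy it to HTTP(S) headers on outgoing calls, read it back on incoming requests) can be sketched with plain maps standing in for the MDC and the header set. This is an illustration under assumptions, not the talk's implementation: the `X-Mdc-` header prefix, the `req-` request ID format and all class and method names are invented for this sketch, and MDC values are assumed to be safe to place in a header.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;

final class MdcPropagationSketch {
    // Hypothetical prefix marking headers that carry MDC entries.
    static final String HEADER_PREFIX = "X-Mdc-";

    // Service A: create the context from the session and assign a unique request ID.
    static Map<String, String> createAtEdge(String session) {
        Map<String, String> mdc = new HashMap<>();
        mdc.put("session", session);
        mdc.put("request", "req-" + UUID.randomUUID());
        return mdc;
    }

    // Outgoing call (the decorator's job): copy every MDC entry to a header.
    static Map<String, String> toHeaders(Map<String, String> mdc) {
        Map<String, String> headers = new HashMap<>();
        for (Map.Entry<String, String> entry : mdc.entrySet()) {
            headers.put(HEADER_PREFIX + entry.getKey(), entry.getValue());
        }
        return headers;
    }

    // Incoming request (the filter's job): rebuild the MDC from the prefixed headers.
    // Header names are matched case-insensitively and unrelated headers are ignored.
    static Map<String, String> fromHeaders(Map<String, String> headers) {
        Map<String, String> mdc = new HashMap<>();
        for (Map.Entry<String, String> entry : headers.entrySet()) {
            String name = entry.getKey();
            if (name.regionMatches(true, 0, HEADER_PREFIX, 0, HEADER_PREFIX.length())) {
                String key = name.substring(HEADER_PREFIX.length()).toLowerCase(Locale.ROOT);
                mdc.put(key, entry.getValue());
            }
        }
        return mdc;
    }
}
```

In a library-based setup these two methods would be called explicitly from an inbound filter and an outbound client decorator; an agent would inject the same behavior automatically.
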
  56. Machine-readable logs

  57. Machine-readable logs, machine-queryable logs. What? Message, Code, Severity. Who? Account, User, Session, Request. Where? App, Module, Class. When? Timestamp, Hostname, PID, Thread

  58. AWS CloudWatch Logs

  59. { $.account = "axelfontaine" && $.request = "crq-12345678" }
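
What such a query does to structured records can be sketched in memory: keep only the records whose `account` and `request` fields both match. This is an illustration of the idea only, not how AWS CloudWatch Logs evaluates filter patterns; `LogQuerySketch` and `query` are hypothetical names, and each record is assumed to be a flat map of string fields.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class LogQuerySketch {
    // Returns the records whose "account" and "request" fields equal the given values.
    // Records that lack either field never match.
    static List<Map<String, String>> query(List<Map<String, String>> records,
                                           String account, String request) {
        return records.stream()
                .filter(record -> account.equals(record.get("account"))
                        && request.equals(record.get("request")))
                .collect(Collectors.toList());
    }
}
```

Filtering on the request ID is what makes it possible to follow one request across services once every service logs the same propagated value under the same key.
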

  60. Standardized keys: { "account": "axelfontaine", "image": "axelfontaine/xyz:543", "instance": "i-0d843d5af9b366a69", "level": "INFO", "logger": "com.myapp.task.TaskService", "message": "Successfully killed axelfontaine/demo in prod", "request": "crq-7R2CVPUMKREUFLMQUE3XB7JWCX", "session": "cli-CRFM2IPABRFUJD7KTDYVDVXABX", "thread": "Thread-18710", "timestamp": "2017-05-12T10:20:30.444" }

  61. Standardized keys: { "account": "axelfontaine", "image": "axelfontaine/xyz:543", "instance": "i-0d843d5af9b366a69", "level": "INFO", "logger": "com.myapp.task.TaskService", "message": "Successfully killed axelfontaine/demo in prod", "request": "crq-7R2CVPUMKREUFLMQUE3XB7JWCX", "session": "cli-CRFM2IPABRFUJD7KTDYVDVXABX", "thread": "Thread-18710", "timestamp": "2017-05-12T10:20:30.444" }

  62. Standardized values: { "account": "axelfontaine", "image": "axelfontaine/xyz:543", "instance": "i-0d843d5af9b366a69", "level": "INFO", "logger": "com.myapp.task.TaskService", "message": "Successfully killed axelfontaine/demo in prod", "request": "crq-7R2CVPUMKREUFLMQUE3XB7JWCX", "session": "cli-CRFM2IPABRFUJD7KTDYVDVXABX", "thread": "Thread-18710", "timestamp": "2017-05-12T10:20:30.444" }

  63. Standardized values: { "account": "axelfontaine", "image": "axelfontaine/xyz:543", "instance": "i-0d843d5af9b366a69", "level": "INFO", "logger": "com.myapp.task.TaskService", "message": "Successfully killed axelfontaine/demo in prod", "request": "crq-7R2CVPUMKREUFLMQUE3XB7JWCX", "session": "cli-CRFM2IPABRFUJD7KTDYVDVXABX", "thread": "Thread-18710", "timestamp": "2017-05-12T10:20:30.444" }

  64. Summary ✓ Send your logs to a centralized service ✓ Buy, don't build ✓ Ensure your logs are structured ✓ Standardize keys and values ✓ Query your logs to answer the what, who, where, when questions

  65. boxfuse.com: Continuous Deployment as a Service for JVM, Node.js and Go apps on AWS ✓ Up and running in minutes ✓ Deploy with 1 command ✓ Focus on development ✓ Immutable Infrastructure as Code ✓ Minimal images ✓ Zero downtime blue/green deployments. boxfuse run my-java-app.jar -env=prod

  66. flywaydb.org: Evolve your relational database schemas reliably across all your environments for each of your modules and services with pleasure and plain SQL ✓ Supports all popular RDBMS ✓ Millions of users ✓ Designed for Continuous Delivery ✓ Open-source Community Edition and commercial Pro and Enterprise Editions ✓ Highly focused and very easy to get started

  67. Thanks ! @axelfontaine boxfuse.com flywaydb.org