Logging in the age of Microservices and the Cloud

Axel Fontaine
September 13, 2018

The days of the statically partitioned datacenter are over. Welcome to the modern world of microservices and auto-scaling in the cloud. Requests flow through multiple services, individual services are auto-scaled and machines are short-lived.

This is a brave new world, and it is time to change the way we design and architect our software to deal with it. In this talk we'll look at logging and take a deep dive into the challenges involved in moving from the old "SSH and tail -f" world to a world of centralized and structured logs, consumable by both humans and machines.

This session is for developers and architects looking for battle-tested solutions to implement effective logging for microservices in an auto-scaling world.

Transcript

  1. Why are we logging?

    Postmortem analysis of user activity and programming errors. Powerful debugging tool. Should contain answers to important questions: What? Who? Where? When?
  2. What? Message, Code, Severity

    Who? Account, User, Session, Request
    Where? App, Module, Class
    When? Timestamp, Hostname, PID, Thread

    How can these questions be asked? How can all this information be captured?
  3. logger.info("my log message");

    What? Message, Code, Severity
    Who? Account, User, Session, Request
    Where? App, Module, Class
    When? Timestamp, Hostname, PID, Thread
  6. [Diagram: Your Code, Logger, Appender A, Appender B, Storage A, Storage B, MDC]

    MDC: Mapped Diagnostic Context (thread-local temporary key-value store)
  7. MDC.put("account", "company ABC"); MDC.put("user", "user123");

    What? Message, Code, Severity
    Who? Account, User, Session, Request
    Where? App, Module, Class
    When? Timestamp, Hostname, PID, Thread
  8. MDC.put("account", "company ABC"); MDC.put("user", "user123"); … logger.info("my log message");

    What? Message, Code, Severity
    Who? Account, User, Session, Request
    Where? App, Module, Class
    When? Timestamp, Hostname, PID, Thread
  9. MDC.put("account", "company ABC"); MDC.put("user", "user123");

    Populate when:
    ✓ a request enters the application
    ✓ a message is received from a queue
    ✓ a cron task starts
    ✓ making an async call to another thread

    And don’t forget to clear when done! (Thread pools reuse threads!)
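The populate-then-clear rule can be sketched without any logging framework. The `MdcSketch` class below is a hypothetical stand-in for a real MDC (it is not the SLF4J or Log4j API): it keeps one map per thread and clears it in a `finally` block, so a pooled thread never carries one request's context into the next.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MdcSketch {
    // Hypothetical stand-in for a logging framework's MDC: one map per thread.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    public static void put(String key, String value) { CONTEXT.get().put(key, value); }
    public static String get(String key) { return CONTEXT.get().get(key); }
    public static void clear() { CONTEXT.get().clear(); }

    // Stand-in for logger.info(...): prefixes the message with the current context.
    public static String log(String message) {
        return "account=" + get("account") + " user=" + get("user") + " " + message;
    }

    // Populate when the request enters, always clear when it is done.
    public static String handleRequest(String account, String user) {
        put("account", account);
        put("user", user);
        try {
            return log("my log message");
        } finally {
            clear(); // the pool will hand this thread to another request
        }
    }

    public static void main(String[] args) throws Exception {
        // A single-thread pool guarantees both tasks run on the same thread.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            String first = pool.submit(() -> handleRequest("company ABC", "user123")).get();
            String second = pool.submit(() -> log("no request context")).get();
            System.out.println(first);
            System.out.println(second);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the first task clears its context, the second task on the same thread sees no account or user. Without the `finally` block it would log "company ABC" for a request that has nothing to do with that account.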
  10. Decoupling log storage from log representation

    [Diagram: Your Code, Logger, MDC, Appender, Storage (formatted), Log Viewer; the labels FORMAT and READ mark the two steps]
  11. Structured logging

    {
      "account": "axelfontaine",
      "image": "axelfontaine/xyz:543",
      "instance": "i-0d843d5af9b366a69",
      "level": "INFO",
      "logger": "com.myapp.task.TaskService",
      "message": "Successfully killed axelfontaine/demo in prod",
      "request": "crq-7R2CVPUMKREUFLMQUE3XB7JWCX",
      "session": "cli-CRFM2IPABRFUJD7KTDYVDVXABX",
      "thread": "Thread-18710",
      "timestamp": "2017-05-12T10:20:30.444"
    }
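A structured record like the one above can be produced by serializing the log event and its context as one JSON object per line. The following is a minimal sketch under stated assumptions: `JsonLogLine` and its methods are hypothetical names, all values are treated as non-null strings, and keys are sorted alphabetically only to mirror the slide. A real setup would more likely use the JSON encoder of the logging framework in use.

```java
import java.util.Map;
import java.util.TreeMap;

public class JsonLogLine {
    // Escapes the characters that must not appear raw inside a JSON string.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the four-digit hex form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Renders the fields as a single-line JSON object with sorted keys.
    public static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        String separator = "";
        for (Map.Entry<String, String> e : new TreeMap<>(fields).entrySet()) {
            sb.append(separator)
              .append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            separator = ",";
        }
        return sb.append('}').toString();
    }
}
```

Escaping newlines keeps every record on one line, which matters once records are shipped through line-oriented tooling.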
  12. Spare Capacity (paying for something you don’t use) = Wasted Money

    https://www.flickr.com/photos/timothykrause/5677858694/
  13. [Diagram] A Load Balancer in front of three servers, each writing its own LOG file. CPU Load drives Scale Out and Scale In. Each log is followed by hand:

    ssh me@myserver1 tail -f server.log
    ssh me@myserver2 tail -f server.log
    ssh me@myserver3 tail -f server.log
  14. [Diagram] Scale Out: a fourth server with its own LOG file joins, and with it a fourth session:

    ssh me@myserver4 tail -f server.log
  16. [Diagram] Scale In: the ssh me@myserver2 tail -f server.log session is shown apart from the sessions for myserver1, myserver3 and myserver4.
  17. [Diagram] DATA LOSS: the myserver2 session is gone. Sessions remain for myserver1, myserver3 and myserver4.
  18. [Diagram: Load Balancer, three LOG files, log server]

    log server where logs can be
    ✓ aggregated
    ✓ stored and backed up
    ✓ indexed
    ✓ searched
  19. log server where logs can be
    ✓ aggregated
    ✓ stored and backed up
    ✓ indexed
    ✓ searched

    Many options:
    • Logstash (ELK)
    • AWS CloudWatch Logs
    • Loggly
    • Papertrail
    • …

    Build or Buy? Buying is almost always the better option, unless you have truly extreme requirements (you probably don't)
  20. Appender or stdout

    Appender:
    ✓ tightly integrated with logging framework
    ✓ in-process
    ✓ direct MDC access
    ✓ best for homogeneous environments

    stdout:
    ✓ universal
    ✓ separate process
    ✓ ingests serialized data with record separator
    ✓ best for heterogeneous environments
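For the stdout route, the separate shipping process has to know where one record ends and the next begins. The sketch below is one possible framing, not something the slides prescribe: it assumes the ASCII record separator character (0x1E) terminates each record, so multi-line payloads such as stack traces survive intact. `RecordSplitter`, `frame` and `split` are hypothetical names.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class RecordSplitter {
    // Assumption: records are terminated by the ASCII record separator (0x1E).
    public static final char RS = (char) 0x1E;

    // Application side: terminate a serialized record before writing it to stdout.
    public static String frame(String record) {
        return record + RS;
    }

    // Shipper side: split the incoming stream back into individual records.
    public static List<String> split(Reader in) {
        List<String> records = new ArrayList<>();
        try (Scanner scanner = new Scanner(in)) {
            scanner.useDelimiter(Pattern.quote(String.valueOf(RS)));
            while (scanner.hasNext()) {
                String record = scanner.next();
                if (!record.isEmpty()) {
                    records.add(record);
                }
            }
        }
        return records;
    }
}
```

A plain newline works just as well as a separator when every record is serialized onto a single line, which is the more common convention for JSON logs.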
  21. Log Levels: DEBUG, INFO, WARNING, ERROR

    [Diagram axes: Importance, Detail]

    You want both when an important failure occurs!

    What is missing: High water mark filtering!
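The slide names high water mark filtering without defining it. One plausible reading, offered here purely as an assumption, is: buffer every event of a request, track the highest level seen, and release the low-level detail only if that high water mark reaches a trigger level. `HighWaterMarkBuffer` below is a hypothetical sketch of that reading, not a feature of any particular logging framework.

```java
import java.util.ArrayList;
import java.util.List;

public class HighWaterMarkBuffer {
    public enum Level { DEBUG, INFO, WARNING, ERROR }

    private static final class Entry {
        final Level level;
        final String message;
        Entry(Level level, String message) {
            this.level = level;
            this.message = message;
        }
    }

    private final Level threshold; // filter applied when nothing went wrong
    private final Level trigger;   // reaching this level unlocks full detail
    private final List<Entry> entries = new ArrayList<>();
    private Level highWaterMark = Level.DEBUG;

    public HighWaterMarkBuffer(Level threshold, Level trigger) {
        this.threshold = threshold;
        this.trigger = trigger;
    }

    // Buffers the event and raises the high water mark if needed.
    public void log(Level level, String message) {
        entries.add(new Entry(level, message));
        if (level.compareTo(highWaterMark) > 0) {
            highWaterMark = level;
        }
    }

    // Call when the request ends: returns the lines that should be emitted.
    public List<String> flush() {
        boolean keepEverything = highWaterMark.compareTo(trigger) >= 0;
        List<String> out = new ArrayList<>();
        for (Entry e : entries) {
            if (keepEverything || e.level.compareTo(threshold) >= 0) {
                out.add(e.level + " " + e.message);
            }
        }
        entries.clear();
        highWaterMark = Level.DEBUG;
        return out;
    }
}
```

With a threshold of INFO and a trigger of ERROR, a successful request emits only its INFO and WARNING lines, while a failing request also emits the DEBUG lines that led up to the error. The trade-off is memory: events are held until the request completes.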
  22. POLL: what type of architecture does your software have?

    • Integrated (Monolith)
    • Distributed (Microservices)
  23. Propagating MDC

    [Diagram: services A, B, C]
    A: Create MDC (based on session) and assign unique request ID. Copy MDC to HTTP(S) headers.
    B: Read MDC HTTP(S) headers. Copy MDC to HTTP(S) headers.
    C: Read MDC HTTP(S) headers.
  24. Propagating MDC

    [Diagram: services A, B, C connected through Filters and Decorators]

    Two Implementation Options:
    • library (manual, precise control)
    • agent (automatic, risk of overreaching)
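The library option can be sketched as two pure functions: one copies the context into outgoing headers, the other rebuilds it from incoming ones. Everything named here is an assumption made for illustration: the `X-MDC-` header prefix, the `MdcHeaders` class, the use of a random UUID as request ID, and the premise that MDC keys are lowercase and values are safe to send as header values.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;

public class MdcHeaders {
    // Hypothetical prefix; any name agreed on by all services would do.
    public static final String PREFIX = "X-MDC-";

    // First service: create the context and assign a unique request ID.
    public static Map<String, String> newRequestContext(String session) {
        Map<String, String> mdc = new LinkedHashMap<>();
        mdc.put("session", session);
        mdc.put("request", UUID.randomUUID().toString());
        return mdc;
    }

    // Outgoing call: copy every context entry into a prefixed header.
    public static Map<String, String> toHeaders(Map<String, String> mdc) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : mdc.entrySet()) {
            headers.put(PREFIX + e.getKey(), e.getValue());
        }
        return headers;
    }

    // Incoming call: rebuild the context from the prefixed headers only.
    public static Map<String, String> fromHeaders(Map<String, String> headers) {
        Map<String, String> mdc = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            String name = e.getKey();
            // Header names are case-insensitive, so match the prefix accordingly.
            if (name.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
                String key = name.substring(PREFIX.length()).toLowerCase(Locale.ROOT);
                mdc.put(key, e.getValue());
            }
        }
        return mdc;
    }
}
```

In a servlet-style application, `fromHeaders` would typically sit in an inbound filter and `toHeaders` in a decorator around the HTTP client, which matches the Filter and Decorator boxes on the slide.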
  25. Machine-readable logs, machine-queryable logs

    What? Message, Code, Severity
    Who? Account, User, Session, Request
    Where? App, Module, Class
    When? Timestamp, Hostname, PID, Thread
  26. Standardized keys

    { "account": "axelfontaine", "image": "axelfontaine/xyz:543", "instance": "i-0d843d5af9b366a69", "level": "INFO", "logger": "com.myapp.task.TaskService", "message": "Successfully killed axelfontaine/demo in prod", "request": "crq-7R2CVPUMKREUFLMQUE3XB7JWCX", "session": "cli-CRFM2IPABRFUJD7KTDYVDVXABX", "thread": "Thread-18710", "timestamp": "2017-05-12T10:20:30.444" }
  28. Standardized values

    { "account": "axelfontaine", "image": "axelfontaine/xyz:543", "instance": "i-0d843d5af9b366a69", "level": "INFO", "logger": "com.myapp.task.TaskService", "message": "Successfully killed axelfontaine/demo in prod", "request": "crq-7R2CVPUMKREUFLMQUE3XB7JWCX", "session": "cli-CRFM2IPABRFUJD7KTDYVDVXABX", "thread": "Thread-18710", "timestamp": "2017-05-12T10:20:30.444" }
  30. Summary

    ✓ Send your logs to a centralized service
    ✓ Buy, don't build
    ✓ Ensure your logs are structured
    ✓ Standardize keys and values
    ✓ Query your logs to answer the what, who, where, when questions
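Once keys and values are standardized, answering a "who" or "what" question becomes a plain filter over records. The sketch below is illustrative only: `LogQuery.where` is a hypothetical helper working on records already parsed into maps, whereas a hosted log service would expose its own query language for the same purpose.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LogQuery {
    // Returns the records whose value for the given key equals the given value.
    public static List<Map<String, String>> where(
            List<Map<String, String>> records, String key, String value) {
        return records.stream()
                .filter(record -> value.equals(record.get(key)))
                .collect(Collectors.toList());
    }
}
```

Calls can be chained, for example `where(where(records, "account", "axelfontaine"), "level", "ERROR")`. This only works across services when every service writes the same key names and the same level spellings.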
  31. boxfuse.com

    Continuous Deployment as a Service for JVM, Node.js and Go apps on AWS
    ✓ Up and running in minutes
    ✓ Deploy with 1 command
    ✓ Focus on development
    ✓ Immutable Infrastructure as Code
    ✓ Minimal images
    ✓ Zero downtime blue/green deployments

    boxfuse run my-java-app.jar -env=prod
  32. flywaydb.org

    Evolve your relational database schemas reliably across all your environments for each of your modules and services with pleasure and plain SQL
    ✓ Supports all popular RDBMS
    ✓ Millions of users
    ✓ Designed for Continuous Delivery
    ✓ Open-source Community Edition and commercial Pro and Enterprise Editions
    ✓ Highly focused and very easy to get started