For The Love of Logs

Logs: they're just simple lines of text, right? How is something so small and seemingly uncomplicated actually so important and...complicated? Whether your logs are still hanging out on your hosts without centralization or you currently manage an ELK stack, this talk will explain why you should be as obsessed with logs as I am.

Aly Fulton

August 22, 2019
Transcript

  1. FOR THE LOVE OF LOGS // @SINTHETIX ACCESS LOGS

    ▸ Example: nginx and access.log. ▸ 127.0.0.1 - Aly [22/Aug/2019:12:34:56 +0000] "GET / HTTP/1.1" 404 123 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0" ▸ List of requests made and the response. ▸ Can tell information about who or what is accessing your services.
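
    To make the access log example concrete, here is a minimal sketch of pulling fields out of a line in that shape. The regular expression and the `parse_access_line` helper are assumptions written for this illustration, not part of nginx; they assume the line follows the layout of the example on the slide.

```python
import re

# Pattern for an access log line laid out like the slide's example:
# address, identity, user, [time], "request", status, bytes, "referer", "agent".
ACCESS_LINE = re.compile(
    r'(?P<remote_addr>\S+) \S+ (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = ACCESS_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])  # numeric status is easier to filter on
    return fields


line = (
    '127.0.0.1 - Aly [22/Aug/2019:12:34:56 +0000] "GET / HTTP/1.1" 404 123 '
    '"-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) '
    'Gecko/20100101 Firefox/47.0"'
)
entry = parse_access_line(line)
print(entry["remote_user"], entry["request"], entry["status"])
```

    Once the line is a dictionary, questions such as "who requested what, and how did the service respond" become simple lookups instead of string searches.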
  2. FOR THE LOVE OF LOGS // @SINTHETIX ERROR LOGS

    ▸ Filled with errors and warnings when services encounter trouble. ▸ Error and Warning are also log levels (we’ll get into that later).
  3. FOR THE LOVE OF LOGS // @SINTHETIX DEBUG LOGS

    ▸ Information about events, state, errors useful for debugging. ▸ Can be user generated temporarily. ▸ console.log("yolo") ▸ Debug is also a log level.
  4. FOR THE LOVE OF LOGS // @SINTHETIX API LOGS

    ▸ Information about request and response for API calls. Source: https://docs.apigee.com/api-monitoring/logs-api
  5. FOR THE LOVE OF LOGS // @SINTHETIX SYSTEM LOGS

    ▸ Operating system events like system changes, startups, errors, etc.
  6. FOR THE LOVE OF LOGS // @SINTHETIX AUDIT LOGS

    ▸ Change log consisting of who, what, and when.
  7. FOR THE LOVE OF LOGS // @SINTHETIX FATAL OR CRITICAL

    ▸ The worst of the worst. ▸ Broke something majorly. ▸ Needs human intervention.
  8. FOR THE LOVE OF LOGS // @SINTHETIX ERROR

    ▸ Broke something pretty bad. ▸ Might need human intervention.
  9. FOR THE LOVE OF LOGS // @SINTHETIX WARNING

    ▸ Something that might break in the near future. ▸ Doesn’t need a human right now.
  10. FOR THE LOVE OF LOGS // @SINTHETIX INFO

    ▸ Normal application and infrastructure behavior. ▸ Normal user events and system interactions. ▸ Humans might want to analyze this information.
  11. FOR THE LOVE OF LOGS // @SINTHETIX DEBUG

    ▸ Extra details and diagnostic information.
  12. FOR THE LOVE OF LOGS // @SINTHETIX TRACE

    ▸ Start and end points to follow requests throughout services.
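
    The levels above form an ordering, and most loggers let you set a threshold so that anything less severe is dropped. A minimal sketch with Python's standard `logging` module follows. The logger name, the `ListHandler` class, and the messages are made up for this illustration. Note that the standard module ships CRITICAL, ERROR, WARNING, INFO, and DEBUG but has no built-in TRACE level.

```python
import logging


class ListHandler(logging.Handler):
    """Collect records in memory so the effect of the threshold is easy to see."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(f"{record.levelname}: {record.getMessage()}")


handler = ListHandler()
levels_logger = logging.getLogger("levels_demo")  # hypothetical logger name
levels_logger.addHandler(handler)
levels_logger.propagate = False
levels_logger.setLevel(logging.WARNING)  # keep WARNING and anything more severe

levels_logger.critical("database unreachable, service is down")
levels_logger.error("payment request failed")
levels_logger.warning("disk is 85% full")
levels_logger.info("user logged in")        # below the threshold, dropped
levels_logger.debug("cache key computed")   # below the threshold, dropped

for line in handler.lines:
    print(line)
```

    Lowering the threshold to `logging.DEBUG` while investigating a problem, then raising it again afterwards, is one common way to get extra detail without paying for it all the time.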
  13. LOG

  14. FOR THE LOVE OF LOGS // @SINTHETIX SOFTWARE ENGINEERS

    ▸ Identify and replicate problem behavior. ▸ Debug software. ▸ Understand system behavior.
  15. FOR THE LOVE OF LOGS // @SINTHETIX SITE RELIABILITY/INFRASTRUCTURE ENGINEERS

    ▸ Audit changes to environments. ▸ Debug server failures. ▸ Discover location of service failures in distributed systems. ▸ Utilize data for planning capacity or caches.
  16. FOR THE LOVE OF LOGS // @SINTHETIX SUPPORT ENGINEERS

    ▸ Understand customer behavior. ▸ Determine if user or service error.
  17. FOR THE LOVE OF LOGS // @SINTHETIX SECURITY AND COMPLIANCE

    ▸ Understand malicious behavior. ▸ Monitor both internal and external users. ▸ Proof of compliance.
  18. FOR THE LOVE OF LOGS // @SINTHETIX AND MORE

    ▸ Log data can be used for sales, marketing, product, and other business areas. ▸ Understand users and how they use the services.
  19. FOR THE LOVE OF LOGS // @SINTHETIX CENTRALIZATION

    ▸ One location houses entirety of logs. ▸ Much easier than tailing individual log files on different servers. ▸ Utilize context (hosts, time) to discern patterns.
  20. FOR THE LOVE OF LOGS // @SINTHETIX STANDARDS AND CONSISTENCY

    ▸ Determine formatting standard to use such as Common Log Format. ▸ Transform all logs to match, giving consistent format and data.

  21. FOR THE LOVE OF LOGS // @SINTHETIX FORMATTING

    ▸ JSON easier for software to parse but takes more space. ▸ Strings easier for humans to read. ▸ Use key/value pairs.

    STRING:
    ts=2018-02-20T22:48:11.291815Z lvl=info msg="InfluxDB starting" version=unknown branch=unknown commit=unknown
    ts=2018-02-20T22:48:11.291858Z lvl=info msg="Go runtime" version=go1.10 maxprocs=8
    ts=2018-02-20T22:48:11.291875Z lvl=info msg="Loading configuration file" path=/Users/user_name/.influxdb/influxdb.conf

    JSON:
    {"lvl":"info","ts":"2018-02-20T22:46:35Z","msg":"InfluxDB starting, version unknown, branch unknown, commit unknown"}
    {"lvl":"info","ts":"2018-02-20T22:46:35Z","msg":"Go version go1.10, GOMAXPROCS set to 8"}
    {"lvl":"info","ts":"2018-02-20T22:46:35Z","msg":"Using configuration at: /Users/user_name/.influxdb/influxdb.conf"}

    source: https://docs.influxdata.com/influxdb/v1.7/administration/logs/
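
    As a rough illustration of the trade-off, the sketch below renders one event both as a key=value string and as JSON. The `to_key_value` and `to_json` helpers are hypothetical names written for this example, and the quoting rule (quote any value containing a space or a double quote) is a simplifying assumption.

```python
import json


def to_key_value(fields):
    """Render a dict as key=value pairs, quoting values that need it."""
    parts = []
    for key, value in fields.items():
        text = str(value)
        if text == "" or " " in text or '"' in text:
            text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
        parts.append(f"{key}={text}")
    return " ".join(parts)


def to_json(fields):
    """Render the same dict as one compact JSON object per line."""
    return json.dumps(fields, separators=(",", ":"))


event = {
    "ts": "2018-02-20T22:48:11.291858Z",
    "lvl": "info",
    "msg": "Go runtime",
    "version": "go1.10",
    "maxprocs": 8,
}

print(to_key_value(event))
print(to_json(event))
print(len(to_key_value(event)), "characters as string,",
      len(to_json(event)), "characters as JSON")
```

    Both forms keep every value attached to a key, which is what makes either of them far easier to search than free-form sentences.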
  22. FOR THE LOVE OF LOGS // @SINTHETIX LOG LEVELS

    ▸ Use appropriate log levels. ▸ Your application failing to start is not “info.” ▸ Your application responding with a 200 OK is not an error.
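
    One way to keep levels honest is to derive them from what actually happened rather than picking one level for a whole code path. The mapping below is a hypothetical policy for HTTP responses, not a standard: server errors log as ERROR, rejected client requests as WARNING, and everything else as INFO.

```python
import logging


def level_for_status(status):
    """Pick a log level for an HTTP response status (hypothetical policy)."""
    if status >= 500:
        return logging.ERROR    # the service itself failed
    if status >= 400:
        return logging.WARNING  # the request was rejected
    return logging.INFO         # normal behavior, such as 200 OK


logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
status_logger = logging.getLogger("status_demo")  # hypothetical logger name

for status in (200, 404, 503):
    status_logger.log(level_for_status(status), "responded with status %d", status)
```

    Whether a 404 deserves WARNING or INFO depends on the service; the point is that the decision lives in one place and is applied consistently.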
  23. FOR THE LOVE OF LOGS // @SINTHETIX IDENTIFICATION

    ▸ Thread logs with a request identifier key. ▸ Helps follow flow of one request through large system. ▸ Use user identifiers.
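
    Threading a request identifier through every log line can be sketched with Python's `contextvars` and a `logging.Filter`. All names here (`request_id_var`, `RequestIdFilter`, `handle_request`, `charge_card`) are hypothetical, and the sketch covers a single process only; carrying the ID between services (for example in a request header) is left out.

```python
import contextvars
import logging
import uuid

# Holds the identifier of the request currently being handled.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Attach the current request ID to every record that passes through."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


stream = logging.StreamHandler()
stream.setFormatter(logging.Formatter(
    'lvl=%(levelname)s request_id=%(request_id)s msg="%(message)s"'))
stream.addFilter(RequestIdFilter())

log = logging.getLogger("request_demo")  # hypothetical logger name
log.addHandler(stream)
log.setLevel(logging.INFO)
log.propagate = False


def charge_card():
    # A deeper call: the ID is not passed in, yet it still appears in the log.
    log.info("charging card")


def handle_request(incoming_id=None):
    """Reuse the caller's ID if one was sent, otherwise mint a new one."""
    request_id = incoming_id or uuid.uuid4().hex
    token = request_id_var.set(request_id)
    try:
        log.info("request received")
        charge_card()
    finally:
        request_id_var.reset(token)  # do not leak the ID into the next request
    return request_id


handle_request("req-123")
```

    Searching the centralized logs for `request_id=req-123` then returns every line for that one request, in order, regardless of which function wrote it.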
  24. FOR THE LOVE OF LOGS // @SINTHETIX CONTEXT

    Images from the Kibana guide: https://www.elastic.co/guide/en/kibana/current/index.html
  25. FOR THE LOVE OF LOGS // @SINTHETIX COVERAGE

    ▸ If you notice missing log coverage, add it. ▸ Be mindful of logging when adding new features.
  26. FOR THE LOVE OF LOGS // @SINTHETIX USEFUL

    ▸ Be mindful of noise. ▸ If it serves no purpose, fix it or remove it.
  27. FOR THE LOVE OF LOGS // @SINTHETIX SECURITY AND COMPLIANCE

    ▸ Log actions for audits. ▸ Redact sensitive information (passwords, credit card numbers). ▸ Pay attention to regulations (HIPAA, GDPR).
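
    Redaction can happen before a line ever leaves the process. The sketch below shows one possible approach with a `logging.Filter`. The two patterns are deliberately simple assumptions for illustration; real rules should come from your own security and compliance requirements, and pattern matching alone will miss things.

```python
import logging
import re

# Hypothetical patterns: key=value secrets and 13 to 16 digit card-like numbers.
SECRET_PAIR = re.compile(r"\b(password|token|secret)=(\S+)", re.IGNORECASE)
CARD_NUMBER = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact(message):
    """Replace sensitive-looking values with placeholders."""
    message = SECRET_PAIR.sub(r"\1=[REDACTED]", message)
    message = CARD_NUMBER.sub("[REDACTED CARD]", message)
    return message


class RedactingFilter(logging.Filter):
    """Redact the fully formatted message of every record."""

    def filter(self, record):
        record.msg = redact(record.getMessage())
        record.args = ()  # the message is already formatted
        return True


print(redact("login user=aly password=hunter2"))
print(redact("charged card 4111 1111 1111 1111 for order 42"))
```

    Attaching `RedactingFilter` to the handler that ships logs off the host means individual call sites do not each have to remember the rules.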
  28. FOR THE LOVE OF LOGS // @SINTHETIX STORAGE, ROTATION, AND RETENTION

    ▸ Logs take up space. ▸ Log rotation necessary to relieve space on servers. ▸ Retention important for analysis of trends. ▸ Different log levels might have different retention policies. ▸ Retention and storage may depend on compliance regulations.
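
    Size-based rotation with a fixed number of backups is one simple way to cap disk usage on a host. The sketch below uses `logging.handlers.RotatingFileHandler` from the Python standard library. The tiny 1 KB limit, the backup count of 3, and the temporary directory are assumptions chosen so the behavior is visible in a short run, not recommended values.

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()  # stand-in for a real log directory
log_path = os.path.join(log_dir, "app.log")

# Roll over before the file would exceed 1024 bytes and keep the 3 newest
# backups, so at most app.log, app.log.1, app.log.2, and app.log.3 exist.
rotating = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=1024, backupCount=3)
rotating.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

rotation_logger = logging.getLogger("rotation_demo")  # hypothetical logger name
rotation_logger.addHandler(rotating)
rotation_logger.setLevel(logging.INFO)
rotation_logger.propagate = False

for i in range(200):
    rotation_logger.info("event number %d", i)

rotating.close()
print(sorted(os.listdir(log_dir)))
```

    Anything older than the last backup is deleted, which is why rotation on the host and retention in the central store are separate decisions: the host only needs enough history to survive a shipping outage.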
  29. FOR THE LOVE OF LOGS // @SINTHETIX AN IMPORTANT THING TO NOTE

    ▸ Log centralization is internal tooling. ▸ Internal tooling needs to match internal users’ needs so they actually use it. ▸ Successful solutions look different for different teams.
  30. FOR THE LOVE OF LOGS // @SINTHETIX COMPANY ONE

    ▸ Had logs just sitting on server. ▸ Needed to log in to server and tail/grep individually. ▸ Not really operations/“DevOps” minded.
  31. FOR THE LOVE OF LOGS // @SINTHETIX COMPANY ONE

    ▸ Used agent-based, third party logging system (LogDNA). ▸ Easy to use for everyone. ▸ Graphed nginx response codes from logs. ▸ Detected denial of service attacks based on spikes in 429s and 500s.
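
    The idea of spotting attacks from spikes in 429s and 500s can be sketched as counting response codes per minute and flagging minutes that cross a threshold. This is not how LogDNA works internally; the function names, the threshold of 100, and the synthetic sample data below are all assumptions made for illustration.

```python
from collections import Counter


def status_counts_per_minute(entries):
    """Count responses per (minute, status) from (timestamp, status) pairs."""
    counts = Counter()
    for timestamp, status in entries:
        minute = timestamp[:16]  # "2019-08-22T12:34:56" -> "2019-08-22T12:34"
        counts[(minute, status)] += 1
    return counts


def spiking_minutes(counts, watched=(429, 500), threshold=100):
    """Return minutes in which the watched statuses together exceed the threshold."""
    totals = Counter()
    for (minute, status), n in counts.items():
        if status in watched:
            totals[minute] += n
    return sorted(minute for minute, n in totals.items() if n > threshold)


# Synthetic data: a quiet minute followed by a burst of 429s and 500s.
entries = (
    [("2019-08-22T12:34:%02d" % (i % 60), 200) for i in range(50)]
    + [("2019-08-22T12:35:%02d" % (i % 60), 429) for i in range(150)]
    + [("2019-08-22T12:35:%02d" % (i % 60), 500) for i in range(20)]
)

counts = status_counts_per_minute(entries)
print(spiking_minutes(counts))
```

    A fixed threshold is the simplest possible rule; comparing each minute against a recent baseline would adapt better to normal traffic swings.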
  32. FOR THE LOVE OF LOGS // @SINTHETIX COMPANY TWO

    ▸ Had log centralization already (Papertrail). ▸ Didn’t have much for audit logs.
  33. FOR THE LOVE OF LOGS // @SINTHETIX COMPANY TWO

    ▸ Added auditing around user actions. ▸ Bug discovered that could have led to information breach. ▸ Access/audit logs revealed no information was breached.
  34. FOR THE LOVE OF LOGS // @SINTHETIX COMPANY THREE

    ▸ Elastic stack with Redis buffer and S3 archiving. ▸ Self-hosted and part of resource-strained Kubernetes clusters. ▸ High log volume.
  35. FOR THE LOVE OF LOGS // @SINTHETIX COMPANY THREE

    ▸ Took longer to send logs to internal Elasticsearch than external S3. ▸ Short retention accessible to Kibana. ▸ Logstash wasn’t completely configured.
  36. FOR THE LOVE OF LOGS // @SINTHETIX COMPANY THREE

    ▸ Missing log coverage. ▸ Unhelpful/incorrect log lines. ▸ Incorrect logging levels. ▸ Request identifiers not threaded throughout entire request.
  37. FOR THE LOVE OF LOGS // @SINTHETIX COMPANY THREE

    ▸ Logging infrastructure needed to be better scaled. ▸ Internally or through third party, depends on ROI.
  38. FOR THE LOVE OF LOGS // @SINTHETIX COMPANY THREE

    ▸ Talked to support, security, engineers about their logging needs. ▸ Wanted to develop a standard to work towards gradually.
  39. FOR THE LOVE OF LOGS // @SINTHETIX COMPANY THREE

    ▸ Implementation: application logger, logging infrastructure, retention time. ▸ Formatting: defined expected keys, including request ID. ▸ Parsing: standardizing key names.
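
    Standardizing key names at parse time can be as small as a lookup table plus a check for the expected keys. The sketch below is an assumption about what such a step might look like, not the setup described in the talk; the alias table and the list of required keys are invented for illustration.

```python
# Hypothetical mapping from keys seen in different services to one standard name.
KEY_ALIASES = {
    "lvl": "level",
    "severity": "level",
    "ts": "timestamp",
    "time": "timestamp",
    "msg": "message",
    "req_id": "request_id",
    "requestId": "request_id",
}

# Hypothetical set of keys every event is expected to carry.
REQUIRED_KEYS = ("timestamp", "level", "message", "request_id")


def normalize(event):
    """Rename known aliases and report which expected keys are missing."""
    normalized = {KEY_ALIASES.get(key, key): value for key, value in event.items()}
    missing = [key for key in REQUIRED_KEYS if key not in normalized]
    return normalized, missing


event = {"ts": "2018-02-20T22:46:35Z", "lvl": "info", "msg": "InfluxDB starting"}
normalized, missing = normalize(event)
print(normalized)
print("missing:", missing)
```

    Reporting missing keys rather than rejecting the event lets teams move toward the standard gradually while still seeing which services have not caught up.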