How to monitor your Symfony application

Alexandre Salomé

September 16, 2016

Transcript

  1. How to monitor your Symfony applications
    Alexandre Salomé

  2. About me
    ● The “Poney” guy
    ● Architect on the back-office software, Auchan Retail France
    ● 7 years of experience on Symfony
    ● Developer since my childhood

  3. About you

  4. Summary
    ● A little theory
    ● Overview of existing solutions
    ● What to monitor
    ● Alerting
    ● Our solution at Auchan Retail France

  5. A little theory

  6. Metrics vs Events

  7. Metrics vs Events

  8. Metrics
    ● Numbers that change over time
    ● Formally: time-series data
    ● name + time = value

  9. Metrics: Examples
    ● System metrics
    ○ Load average
    ○ RAM usage
    ○ Disk I/O
    ● Service metrics
    ○ Number of SQL queries
    ○ Cache hits and misses
    ● Application metrics
    ○ Number of registrations
    ○ Page generation duration

  10. Metrics: Aggregation
    ● Important when querying
    ● Different aggregations
    ○ Sum
    ○ Average
    ○ Max
    ○ Min
    ○ 90th percentile
    ● Can be used to reduce storage size
    ○ Every minute from now to 1 week ago
    ○ Every 15 minutes from 1 week ago to 1 month ago
    ○ Every hour from 1 month ago to 6 months ago
    ○ Every day from 6 months ago to …
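
The aggregations and the storage-reducing rollup listed above can be sketched in Python. This is only an illustration, not how Graphite or any other store implements it. The `percentile` and `rollup` helpers and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of points at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# The aggregations named on the slide.
AGGREGATIONS = {
    "sum": sum,
    "average": lambda values: sum(values) / len(values),
    "max": max,
    "min": min,
    "p90": lambda values: percentile(values, 90),
}


def rollup(points, bucket_seconds, aggregation):
    """Group (timestamp, value) points into fixed-size buckets and aggregate each bucket."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - timestamp % bucket_seconds
        buckets[bucket_start].append(value)
    aggregate = AGGREGATIONS[aggregation]
    return [(start, aggregate(values)) for start, values in sorted(buckets.items())]


# Hypothetical data: one point per minute for 30 minutes (page generation time in ms).
minute_points = [(60 * i, 100 + i) for i in range(30)]

# Downsample to one point every 15 minutes: 30 stored points become 2.
quarter_hour_avg = rollup(minute_points, 15 * 60, "average")
print(quarter_hour_avg)
```

The choice of aggregation matters: an average hides spikes that a max or a 90th percentile would keep.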

  11. Metrics: Deviation
    Some metrics are pushed as growing numbers:
    ● MySQL query count
    ● Network transfer
    To get the rate, compute the derivative: the change between two successive values divided by the elapsed time.
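
That computation can be sketched in Python as follows. The `rates` helper and the sample values are hypothetical, and skipping an interval when the counter goes backwards (for example after a service restart) is one possible policy, not the only one.

```python
def rates(samples):
    """Turn (timestamp_seconds, counter_value) samples into per-second rates."""
    result = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        delta = v1 - v0
        if elapsed <= 0 or delta < 0:
            # Out-of-order sample or counter reset: skip this interval.
            continue
        result.append((t1, delta / elapsed))
    return result


# Hypothetical MySQL query counter sampled every 10 seconds.
# The drop at t=30 simulates a server restart.
query_count = [(0, 1000), (10, 1200), (20, 1500), (30, 40), (40, 140)]
print(rates(query_count))
```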

  12. Metrics vs Events

  13. Events
    An event is a message from an application, your system, or service.
    It’s in a text format:
    2016-09-06 20:47:13 - Alexandre is preparing slides for the conference

  14. Events: examples
    ● Linux logs
    ● Apache or Nginx logs
    ● Symfony logs
    ● MySQL logs
    ● Slow query logs

  15. Events: field extraction
    Parse messages with a regex:
    2016-01-14 12:34:32 boston-01: User “alice” connected to the application from IP 12.34.56.78
    Get a data table:
    Date 2016-01-14
    Time 12:34:32
    Server boston-01
    Event type login
    Username alice
    IP 12.34.56.78
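
A minimal Python sketch of this extraction, assuming the message uses straight double quotes around the username. The pattern, the `extract_fields` helper, and the hard-coded `event_type` are illustrative assumptions, not part of any tool mentioned in this talk.

```python
import re

# One named group per column of the data table.
LOGIN_PATTERN = re.compile(
    r'(?P<date>\d{4}-\d{2}-\d{2}) '
    r'(?P<time>\d{2}:\d{2}:\d{2}) '
    r'(?P<server>[\w-]+): '
    r'User "(?P<username>[^"]+)" connected to the application '
    r'from IP (?P<ip>\d{1,3}(?:\.\d{1,3}){3})'
)


def extract_fields(message):
    """Return the extracted fields as a dict, or None if the message does not match."""
    match = LOGIN_PATTERN.match(message)
    if match is None:
        return None
    fields = match.groupdict()
    # The pattern only matches login messages, so the event type is implied.
    fields["event_type"] = "login"
    return fields


line = '2016-01-14 12:34:32 boston-01: User "alice" connected to the application from IP 12.34.56.78'
fields = extract_fields(line)
print(fields)
```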

  16. Events: field extraction
    ● Logstash provides many built-in regular expressions (grok patterns).
    Example: parsing Apache/Nginx logs
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }

  17. Metrics and Events > Comparison
    Metrics (numbers) are good at:
    ● Time series data
    ● Consolidation over time
    ● Mathematics
    ● Storage size
    Events (text) are good at:
    ● Storing any message in any format
    ● Extracting fields from messages for indexing and querying

  18. Overview of existing solutions

  19. Metrics storage
    ● ++ Graphite: aggregation
    ● + InfluxDB: clustering
    ● OpenTSDB: scalable
    ● Prometheus

  20. Metrics from your system and services
    A good solution: collectd
    ● Plugins for system and service metrics: CPU usage, RAM, load average, network, MySQL, Apache, AMQP, Carbon, CPU Temperature, Filesystem, Disk, IRQ, NFS, PostgreSQL, Syslog, MongoDB, Redis, File count, …
    ● You can add custom metrics by using the Exec plugin
    Sends all metrics to your storage

  21. Metrics from your system and services
    A complete solution: Zabbix
    ● Agents to collect metrics
    ● Web UI to get realtime alerting
    ● Alert by Mail/SMS/Anything
    ● Complete metrics extraction
    ○ System metrics
    ○ Service metrics
    ○ Remote calls

  22. Metrics buffering with StatsD
    ● A Node.js application to buffer your metrics flow
    ● Many available backends
    ● Handles different metric types
    ○ Counters (+1, +3, +2)
    ○ Sampling (
    ○ Gauges (200, +3, -2)
    ● A very simple UDP protocol
    ● Flush metrics every X seconds
    ● Optimize performance
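
The UDP protocol is simple enough to sketch in a few lines of Python. The line format (`name:value|type`, with an optional `|@rate` suffix for sampling) and the default port 8125 come from StatsD. The metric names, the `format_metric` and `send_metric` helpers, and the local address are assumptions made for this illustration.

```python
import socket


def format_metric(name, value, metric_type, sample_rate=None):
    """Build one StatsD line: <name>:<value>|<type>[|@<sample_rate>]."""
    line = f"{name}:{value}|{metric_type}"
    if sample_rate is not None:
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget: one UDP datagram, no reply expected."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Counter: add 1 to the running total.
counter = format_metric("mysite.forum.read", 1, "c")
# Sampling: the client sends only 1 in 10 increments and tells StatsD so.
sampled = format_metric("mysite.forum.read", 1, "c", sample_rate=0.1)
# Gauge: set an absolute value, then adjust it with a signed delta.
gauge = format_metric("mysite.workers.busy", 200, "g")
gauge_delta = format_metric("mysite.workers.busy", "+3", "g")

for line in (counter, sampled, gauge, gauge_delta):
    print(line)
```

Calling `send_metric(counter)` would emit the datagram. Because UDP needs no connection or acknowledgement, the application is not slowed down when the StatsD daemon is unavailable.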

  23. Metrics from your Symfony application
    ● <3 m6web/statsd-bundle <3
    ● algatux/influxdb-bundle
    ● https://packagist.org/search/?q=-bundle

  24. Events: the famous ELK stack
    ● Elasticsearch is the storage
    ● Logstash is the log processing tool
    ● Kibana is the dashboard

  25. See also
    Event storage:
    ● Graylog
    ● Fluentd
    Awesome Sysadmin:
    https://github.com/kahun/awesome-sysadmin

  26. A simple start

  27. A simple start: metrics

  28. A simple start: events

  29. A good fail
    ● Metrics are events
    ● 1 hour = 10 MB
    ● 1 day = 200 MB
    ● 1 week = 1.5 GB

  30. What to monitor

  31. Anything that changes can be measured
    ● Measure anything and everything
    ● 3 levels:
    ○ System: the Debian/Archlinux/whatever system you are using
    ○ Services: Apache, MySQL, Docker, Nginx, Redis, …
    ○ Applications: your Symfony application
    How to Measure Anything: Finding the Value of Intangibles in Business
    By Douglas W. Hubbard

  32. System
    Metrics
    ● Load average
    ● RAM
    ● Free disk
    ● IOWait
    ● Network usage
    ● Inodes
    Events
    ● System logs

  33. Services
    Metrics
    ● MySQL
    ○ Query count
    ○ Cache hit/miss
    ● Apache
    ○ Query count
    ○ Busy/idle workers
    ● HAProxy
    ● Redis
    ● ….
    Events
    ● Apache|Nginx access logs
    ● Apache|Nginx error logs
    ● MySQL logs
    ● Elasticsearch logs
    ● ...

  34. Applications
    Metrics
    ● Memory/duration per route
    ● Feature usage
    ● Custom metrics
    ○ Registration
    ○ Checkout process
    Events
    ● Symfony logs
    ● Custom logs
    ○ Registration GeoIP
    ○ Checkout details
    ○ Feature details

  35. Application: generic measures for your application
    http://bit.ly/2ciZDLI

  36. Application: generic measures for your application
    http://bit.ly/2ciZDLI

  37. Application: generic measures for your application
    http://bit.ly/2ciZDLI

  38. Application metrics
    ● Use application events
    ○ Don’t couple your application code to your monitoring
    ● M6Web/StatsdBundle provides a smart way to achieve this:
    m6_statsd:
        clients:
            default:
                events:
                    forum.read:
                        increment: mysite.forum.read
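
The decoupling idea behind this configuration can be sketched in Python. This is not the bundle's implementation: the `EventDispatcher` class, the in-memory `counters` standing in for a StatsD client, and `read_forum_thread` are all hypothetical. The point is that application code only dispatches an event, and a separately registered listener turns it into a metric.

```python
from collections import Counter, defaultdict


class EventDispatcher:
    """Minimal dispatcher: listeners subscribe to event names."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def add_listener(self, event_name, listener):
        self._listeners[event_name].append(listener)

    def dispatch(self, event_name, payload=None):
        for listener in self._listeners[event_name]:
            listener(event_name, payload)


# Stand-in for a StatsD client: counts increments in memory.
counters = Counter()

# Mirrors the configuration above: event name -> counter to increment.
METRICS_CONFIG = {"forum.read": "mysite.forum.read"}


def register_metrics(dispatcher, config):
    """Attach one listener per configured event; each increments its counter."""
    for event_name, metric_name in config.items():
        dispatcher.add_listener(
            event_name,
            lambda _event, _payload, metric=metric_name: counters.update([metric]),
        )


dispatcher = EventDispatcher()
register_metrics(dispatcher, METRICS_CONFIG)


def read_forum_thread(thread_id):
    # Application code knows about the event, not about monitoring.
    dispatcher.dispatch("forum.read", {"thread_id": thread_id})


read_forum_thread(42)
read_forum_thread(43)
print(counters["mysite.forum.read"])
```

Removing or changing the monitoring then only touches the configuration, never `read_forum_thread`.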

  39. Application events
    ● Use Symfony Monolog channels to route your messages and create powerful dashboards
    ● Example: the deprecated channel for deprecation messages

  40. Deprecated channel
    http://bit.ly/2c8oWpk

  41. Deprecated channel
    http://bit.ly/2c8oWpk

  42. Deprecated channel

  43. What to measure
    ● System and service performance
    ○ Load average
    ○ Free disk
    ● System and service errors
    ○ Syslog errors
    ○ HTTP codes >= 500
    ● User behavior
    ○ Feature usage
    ○ Registration count
    ○ Page views

  44. Alerting

  45. A little note on alerting
    ● It’s nice to measure, it’s better to be alerted
    ● Define rules and get notified when a rule is violated
    ● Don’t put thresholds at 95%: if your filesystem is 95% full, your system is probably already suffering
    ○ Prefer 60%
    ● Handling the problem before it happens avoids having to recover from a crash
    ● The alerting rules can be complex
    ○ During work hours, send a mail to the team
    ○ Otherwise, send an SMS to the IT manager’s phone
    ○ If the IT manager is on holiday, send it to their backup
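
The rules above can be expressed as a small Python sketch. The definition of work hours (Monday to Friday, 9:00 to 18:00), the channel and recipient names, and the helper functions are assumptions made for illustration. A real setup would configure this in the alerting tool itself.

```python
from datetime import datetime

# Alert early, as recommended above: 60%, not 95%.
DISK_THRESHOLD_PERCENT = 60


def is_work_hours(now):
    # Assumption: Monday to Friday, 9:00 to 18:00.
    return now.weekday() < 5 and 9 <= now.hour < 18


def route_alert(now, manager_on_holiday):
    """Return (channel, recipient) for an alert raised at the given time."""
    if is_work_hours(now):
        return ("mail", "team")
    if manager_on_holiday:
        return ("sms", "backup")
    return ("sms", "it-manager")


def check_disk(used_percent, now, manager_on_holiday=False):
    """Return where to send the alert, or None if the rule is not violated."""
    if used_percent < DISK_THRESHOLD_PERCENT:
        return None
    return route_alert(now, manager_on_holiday)


friday_morning = datetime(2016, 9, 16, 10, 0)
saturday_night = datetime(2016, 9, 17, 3, 0)

print(check_disk(45, friday_morning))
print(check_disk(72, friday_morning))
print(check_disk(72, saturday_night))
print(check_disk(72, saturday_night, manager_on_holiday=True))
```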

  46. Grafana alerting
    ● Since version 3.1.0
    ● For now, it only supports the Graphite backend

  47. Our experience at Auchan Retail France

  48. Our stack
    ● Splunk for events
    ● Zabbix for metrics

  49. Zabbix
    ● Used for monitoring and alerting of system/service metrics

  50. Splunk
    ● ELK + Cash effect = Splunk
    ● The whole company can use it
    ● On-the-fly field extraction
    ○ Beautiful interface to configure them
    ● Powerful expression language:
    ○ index=apache sourcetype=frontend | timechart count BY host
    ○ index=apache sourcetype=frontend host=auchan.fr | stats avg(response_time) BY path
    ● Powerful graph constructor
    ● Data models → Pivot tables for business
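
For readers unfamiliar with the expression language, here is a rough Python equivalent of what the second query above computes: filter events by host, then average `response_time` per `path`. It is only an analogy, not how Splunk evaluates searches. The sample events and the `avg_response_time_by_path` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical events with fields already extracted.
events = [
    {"host": "auchan.fr", "path": "/", "response_time": 120},
    {"host": "auchan.fr", "path": "/", "response_time": 80},
    {"host": "auchan.fr", "path": "/cart", "response_time": 300},
    {"host": "other.example", "path": "/", "response_time": 999},
]


def avg_response_time_by_path(events, host):
    """Roughly: host=<host> | stats avg(response_time) BY path"""
    totals = defaultdict(lambda: [0, 0])  # path -> [sum, count]
    for event in events:
        if event["host"] != host:
            continue
        entry = totals[event["path"]]
        entry[0] += event["response_time"]
        entry[1] += 1
    return {path: total / count for path, (total, count) in totals.items()}


averages = avg_response_time_by_path(events, "auchan.fr")
print(averages)
```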

  51. Conclusion
    ● Track everything that changes
    ● Instrument your application
    ● Track your critical business features
    ● Create decisional dashboards
    ● Alert at 60%, not at 95%
    ● If you have (a lot of) money, choose Splunk

  52. The end
    Thank you!

  53. Questions & Answers

  54. Photo credits
    ● Andrew Malone - Measuring - https://flic.kr/p/aqhCH8
    ● Sebastian Schulze - SymfonyLive 2010 - https://flic.kr/p/7Ef7vx
    ● KimManleyOrt - At the Math Grad House - https://flic.kr/p/m2UBWH
    ● Usehung - Chemistry - https://flic.kr/p/4uT7Er
    ● Cybjorg - Gauges - https://flic.kr/p/5r3LuJ
    ● Shan Ambrose - alert - https://flic.kr/p/cAk4KC
    ● Nicolas Buffler - Projet 365 - 209/365 - https://flic.kr/p/mkHfLF
    ● Derek Bridges - Questions - https://flic.kr/p/5DeuzB
    ● Poneys - Internet
