How to monitor your Symfony application

How to monitor your Symfony applications Alexandre Salomé

About me
• The “Poney” guy
• Architect on the back-office software, Auchan Retail France
• 7 years of experience with Symfony
• Developer since my childhood

About you

Summary
• A little theory
• Overview of existing solutions
• What to monitor
• Alerting
• Our solution at Auchan Retail France

A little theory

Metrics vs Events

Metrics
• Numbers that change over time
• Formally: time-series data
• name + time = value

Metrics: Examples
• System metrics
  ◦ Load average
  ◦ RAM usage
  ◦ Disk I/O
• Service metrics
  ◦ Number of SQL queries
  ◦ Cache hits and misses
• Application metrics
  ◦ Number of registrations
  ◦ Page generation duration

Metrics: Aggregation
• Important when querying
• Different aggregations
  ◦ Sum
  ◦ Average
  ◦ Max
  ◦ Min
  ◦ 90th percentile
• Can be used to reduce storage size
  ◦ Every minute from now to 1 week ago
  ◦ Every 15 minutes from 1 week ago to 1 month ago
  ◦ Every hour from 1 month ago to 6 months ago
  ◦ Every day from 6 months ago to …
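
The aggregations listed above can be sketched in a few lines of Python. This is only an illustration, not how Graphite or any other storage implements it: the `downsample` and `percentile` helpers, the nearest-rank percentile rule, and the sample points are all assumptions made for the example.

```python
from math import ceil
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (one of several conventions)."""
    ordered = sorted(values)
    rank = max(1, ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Aggregation functions named on the slide.
AGGREGATIONS = {
    "sum": sum,
    "avg": mean,
    "max": max,
    "min": min,
    "p90": lambda values: percentile(values, 90),
}


def downsample(points, bucket_seconds, how="avg"):
    """Group (timestamp, value) points into fixed-size buckets and aggregate each."""
    buckets = {}
    for timestamp, value in points:
        start = timestamp - timestamp % bucket_seconds
        buckets.setdefault(start, []).append(value)
    aggregate = AGGREGATIONS[how]
    return [(start, aggregate(values)) for start, values in sorted(buckets.items())]


# Hypothetical data: one point per minute for 30 minutes,
# rolled up into two 15-minute buckets.
points = [(60 * i, float(i)) for i in range(30)]
print(downsample(points, 900, "avg"))
print(downsample(points, 900, "max"))
```

The choice of aggregation matters: rolling up with an average hides spikes that a max or a 90th percentile would keep.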

Metrics: Derivative
Some metrics are pushed as ever-growing counters:
• MySQL query count
• Network transfer
To get the rate, compute the derivative: the difference between two consecutive values divided by the time elapsed between them.
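
A minimal sketch of that computation in Python, assuming samples arrive as (timestamp in seconds, counter value) pairs. The `rates` helper and the sample values are hypothetical, and treating a drop in the counter as a reset is an assumption, not something the slides specify.

```python
def rates(samples):
    """Turn cumulative counter samples into per-second rates.

    samples: list of (timestamp_seconds, counter_value), ordered by time.
    """
    result = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicated or out-of-order samples
        delta = v1 - v0
        if delta < 0:
            # Assumption: a drop means the counter was reset (service restart),
            # so the new value is everything counted since the reset.
            delta = v1
        result.append((t1, delta / elapsed))
    return result


# Hypothetical MySQL query counter sampled every 10 seconds, with one reset.
samples = [(0, 1000), (10, 1500), (20, 2100), (30, 50)]
print(rates(samples))
```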

Metrics vs Events

Events
An event is a message from an application, your system, or a service. It is in text format:
2016-09-06 20:47:13 - Alexandre is preparing slides for the conference

Events: examples
• Linux logs
• Apache or Nginx logs
• Symfony logs
• MySQL logs
• Slow query logs

Events: field extraction
Parse messages with a regex:
2016-01-14 12:34:32 boston-01: User “alice” connected to the application from IP 12.34.56.78
Get a data table:
Date        2016-01-14
Time        12:34:32
Server      boston-01
Event type  login
Username    alice
IP          12.34.56.78
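
As an illustration, the extraction above could look like this in Python. The pattern and the `extract_fields` helper are assumptions written for this one message format. The "login" event type is set by hand because it is implied by which pattern matched, not present in the text.

```python
import re

# Named groups become the columns of the data table.
# Both straight and typographic quotes are accepted around the username.
LOGIN_PATTERN = re.compile(
    r'(?P<date>\d{4}-\d{2}-\d{2}) '
    r'(?P<time>\d{2}:\d{2}:\d{2}) '
    r'(?P<server>[\w-]+): '
    r'User [“"](?P<username>[^”"]+)[”"] connected to the application '
    r'from IP (?P<ip>\d{1,3}(?:\.\d{1,3}){3})'
)


def extract_fields(message):
    """Return the extracted fields as a dict, or None if the message does not match."""
    match = LOGIN_PATTERN.match(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["event_type"] = "login"
    return fields


line = ('2016-01-14 12:34:32 boston-01: User "alice" connected '
        'to the application from IP 12.34.56.78')
print(extract_fields(line))
```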

Events: field extraction
• Logstash provides a lot of built-in regular expressions
Example: parsing of Apache/Nginx logs
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }

Metrics and Events > Comparison
Metrics (numbers) are good at:
• Time series data
• Consolidation over time
• Mathematics
• Storage size
Events (text) are good at:
• Storing any message in any format
• Extracting fields from messages for indexing and queries

Overview of existing solutions

Metrics storage
• ++ Graphite: aggregation
• + InfluxDB: clustering
• OpenTSDB: scalable
• Prometheus

Metrics from your system and services
A good solution: collectd
• Plugins for system and service metrics: CPU usage, RAM, load average, network, MySQL, Apache, AMQP, Carbon, CPU Temperature, Filesystem, Disk, IRQ, NFS, PostgreSQL, Syslog, MongoDB, Redis, File count, …
• You can add custom metrics by using the Exec plugin
Sends all metrics to your storage

Metrics from your system and services
A complete solution: Zabbix
• Agents to collect metrics
• Web UI to get real-time alerting
• Alert by Mail/SMS/Anything
• Complete metrics extraction
  ◦ System metrics
  ◦ Service metrics
  ◦ Remote calls

Metrics buffering with StatsD
• A Node.js application to buffer your metrics flow
• Lots of available backends
• Handles different metric types
  ◦ Counters (+1, +3, +2)
  ◦ Sampling
  ◦ Gauges (200, +3, -2)
• A very simple UDP protocol
• Flushes metrics every X seconds
• Optimizes performance
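
To show how simple the UDP protocol is, here is a small Python sketch that builds StatsD packets in the usual `name:value|type` text format and can send them over UDP. The `format_metric` and `send_metric` helpers and the metric names are hypothetical. The host and port are assumptions (8125 is the usual StatsD default), so adjust them to your setup.

```python
import socket


def format_metric(name, value, metric_type, sample_rate=None):
    """Build one StatsD packet: name:value|type, optionally followed by |@rate."""
    packet = f"{name}:{value}|{metric_type}"
    if sample_rate is not None:
        packet += f"|@{sample_rate}"
    return packet


def send_metric(packet, host="127.0.0.1", port=8125):
    """Fire-and-forget: UDP gives no delivery guarantee and no reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet.encode("ascii"), (host, port))


# Counter increment, sampled counter, absolute gauge, relative gauge.
print(format_metric("mysite.forum.read", 1, "c"))
print(format_metric("mysite.forum.read", 1, "c", sample_rate=0.1))
print(format_metric("mysite.queue.size", 200, "g"))
print(format_metric("mysite.queue.size", "+3", "g"))
```

Calling `send_metric(format_metric("mysite.forum.read", 1, "c"))` would push the counter to a local StatsD daemon. Because UDP is connectionless, the application does not block or fail when the daemon is down.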

Metrics from your Symfony application
• <3 m6web/statsd-bundle <3
• algatux/influxdb-bundle
• https://packagist.org/search/?q=<your-solution>-bundle

Events: the famous ELK stack
• Elasticsearch is the storage
• Logstash is the log processing tool
• Kibana is the dashboard

See also
Events storage:
• Graylog
• Fluentd
Awesome Sysadmin: https://github.com/kahun/awesome-sysadmin

A simple start

A simple start: metrics

A simple start: events

A good fail
• Metrics are events
• 1 hour = 10 MB
• 1 day = 200 MB
• 1 week = 1.5 GB

What to monitor

Anything that changes can be measured
• Measure anything and everything
• 3 levels:
  ◦ System: the Debian/Archlinux/whatever system you are using
  ◦ Services: Apache, MySQL, Docker, Nginx, Redis, …
  ◦ Applications: your Symfony application
Book: How to Measure Anything: Finding the Value of Intangibles in Business, by Douglas W. Hubbard

System
Metrics
• Load average
• RAM
• Free disk
• IOWait
• Network usage
• Inodes
Events
• System logs

Services
Metrics
• MySQL
  ◦ Query count
  ◦ Cache hit/miss
• Apache
  ◦ Query count
  ◦ Busy/idle workers
• HAProxy
• Redis
• …
Events
• Apache|Nginx access logs
• Apache|Nginx error logs
• MySQL logs
• Elasticsearch logs
• …

Applications
Metrics
• Memory/duration per route
• Feature usage
• Custom metrics
  ◦ Registration
  ◦ Checkout process
Events
• Symfony logs
• Custom logs
  ◦ Registration GeoIP
  ◦ Checkout details
  ◦ Feature details

Application: generic measures for your application http://bit.ly/2ciZDLI

Application metrics
• Use application events
  ◦ Don’t couple your application code to your monitoring
• M6Web/StatsdBundle provides a smart way to achieve this:
    m6_statsd:
        clients:
            default:
                events:
                    forum.read:
                        increment: mysite.forum.read
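
The decoupling idea can be illustrated outside Symfony. The Python sketch below is not the bundle's implementation: `EventDispatcher`, `MetricsListener`, and `read_forum` are hypothetical names, and the in-memory counter stands in for a real StatsD client. The application only announces that `forum.read` happened, and a separate mapping (the role played by the configuration above) decides which metric to increment.

```python
from collections import Counter, defaultdict


class EventDispatcher:
    """Minimal stand-in for an application event dispatcher."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def add_listener(self, event_name, listener):
        self._listeners[event_name].append(listener)

    def dispatch(self, event_name, payload=None):
        for listener in self._listeners[event_name]:
            listener(event_name, payload)


class MetricsListener:
    """Maps event names to counter names, like the events section of the config."""

    def __init__(self, mapping):
        self.mapping = mapping
        self.counters = Counter()  # stand-in for a real metrics client

    def __call__(self, event_name, payload):
        self.counters[self.mapping[event_name]] += 1

    def subscribe(self, dispatcher):
        for event_name in self.mapping:
            dispatcher.add_listener(event_name, self)


def read_forum(dispatcher):
    # Application code knows nothing about monitoring: it only emits an event.
    dispatcher.dispatch("forum.read")


dispatcher = EventDispatcher()
metrics = MetricsListener({"forum.read": "mysite.forum.read"})
metrics.subscribe(dispatcher)

read_forum(dispatcher)
read_forum(dispatcher)
print(dict(metrics.counters))
```

Removing or replacing the monitoring then only means changing the listener, never the application code.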

Application events
• Use Symfony monolog channels to route your messages and create powerful dashboards
• Example: the deprecated channel for deprecation messages

Deprecated channel http://bit.ly/2c8oWpk

What to measure
• System and service performance
  ◦ Load average
  ◦ Free disk
• System and service errors
  ◦ Syslog errors
  ◦ HTTP codes >= 500
• User behavior
  ◦ Feature usage
  ◦ Registration count
  ◦ Page views

Alerting

A little note on alerting
• It’s nice to measure, it’s better to be alerted
• Define rules and get notified when a rule is violated
• Don’t put thresholds at 95%: if your filesystem is 95% full, your system is probably already suffering
  ◦ Prefer 60%
• Handling the problem before it happens avoids having to recover from a crash
• The alerting rules can be complex
  ◦ During work hours, send a mail to the team
  ◦ Otherwise, send an SMS to the IT manager’s phone
  ◦ If the IT manager is on holiday, send it to their backup
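
The rules above can be written down as a small decision function. This Python sketch is only an illustration: the `route_alert` helper, the recipient names, and the definition of work hours (Monday to Friday, 09:00 to 18:00) are assumptions, and a real setup would express this in the alerting tool's own configuration.

```python
from datetime import datetime, time

THRESHOLD = 0.60  # alert early, as recommended above, not at 95%


def is_work_hours(now):
    # Assumption: work hours are Monday to Friday, 09:00 to 18:00.
    return now.weekday() < 5 and time(9) <= now.time() < time(18)


def route_alert(disk_usage, now, manager_on_holiday=False):
    """Return (channel, recipient) for an alert, or None if the rule is not violated."""
    if disk_usage < THRESHOLD:
        return None
    if is_work_hours(now):
        return ("mail", "team")
    if manager_on_holiday:
        return ("sms", "backup")
    return ("sms", "it-manager")


# A Tuesday afternoon, then a Saturday night.
print(route_alert(0.72, datetime(2016, 9, 6, 14, 0)))
print(route_alert(0.72, datetime(2016, 9, 10, 23, 0)))
print(route_alert(0.72, datetime(2016, 9, 10, 23, 0), manager_on_holiday=True))
```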

Grafana alerting
• Since version 3.1.0
• For now, it only supports the Graphite backend

Our experience at Auchan Retail France

Our stack
• Splunk for events
• Zabbix for metrics

Zabbix
• Used for monitoring and alerting of system/service metrics

Splunk
• ELK + Cash effect = Splunk
• The whole company can use it
• On-the-fly field extraction
  ◦ Beautiful interface to configure them
• Powerful expression language:
  ◦ index=apache sourcetype=frontend | timechart count BY host
  ◦ index=apache sourcetype=frontend host=auchan.fr | stats avg(response_time) BY path
• Powerful graph constructor
• Data models → Pivot tables for business

Conclusion
• Track everything that changes
• Instrument your application
• Track your critical business features
• Create decision-making dashboards
• Alert at 60%, not at 95%
• If you have (a lot of) money, choose Splunk

The end Thank you!

Questions & Answers

Photo credits
• Andrew Malone - Measuring - https://flic.kr/p/aqhCH8
• Sebastian Schulze - SymfonyLive 2010 - https://flic.kr/p/7Ef7vx
• KimManleyOrt - At the Math Grad House - https://flic.kr/p/m2UBWH
• Usehung - Chemistry - https://flic.kr/p/4uT7Er
• Cybjorg - Gauges - https://flic.kr/p/5r3LuJ
• Shan Ambrose - alert - https://flic.kr/p/cAk4KC
• Nicolas Buffler - Projet 365 - 209/365 - https://flic.kr/p/mkHfLF
• Derek Bridges - Questions - https://flic.kr/p/5DeuzB
• Poneys - Internet
