Solve your problems with monitoring

Listen to the heart of your application, WTK Izmir 2014

Osman

May 17, 2014

Transcript

  1. Fixing problems is difficult without logs and monitoring
    Sleep better by automating monitoring for your app
  2. Problems
    • Can users reach my pages?
    • What is the page load time?
    • Are there browser problems or client-side errors?
    • Are third-party integrations running smoothly?
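
The first two questions on this slide can be automated with a small probe. The sketch below is one possible approach, assuming a plain HTTP GET is a fair proxy for "can users reach my pages"; it times the server response only, not the full page load in a browser. The names `fetch_status` and `check_page` and the 2-second threshold are hypothetical and do not come from the talk.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, NamedTuple, Optional


class CheckResult(NamedTuple):
    ok: bool                 # True when the page answered in time with a 2xx/3xx status
    status: Optional[int]    # HTTP status code, or None if the request failed outright
    seconds: float           # wall-clock time spent on the request


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Issue a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx; the code is still useful to report.
        return error.code


def check_page(
    url: str,
    fetch: Callable[[str], int] = fetch_status,
    max_seconds: float = 2.0,  # hypothetical threshold
) -> CheckResult:
    """Time one request and decide whether the page looks healthy."""
    start = time.perf_counter()
    try:
        status: Optional[int] = fetch(url)
    except OSError:
        # URLError and socket timeouts are OSError subclasses.
        status = None
    elapsed = time.perf_counter() - start
    ok = status is not None and 200 <= status < 400 and elapsed <= max_seconds
    return CheckResult(ok, status, elapsed)
```

Passing `fetch` as a parameter keeps the health decision testable without network access; a scheduler such as cron could call `check_page` for each important URL and alert when `ok` is false.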
  3. Problems
    • How is my audience doing? Has it grown?
    • How is my revenue?
    • Are user orders shipping?
    • Does the application work at all?
  4. Problems
    • I didn't change the code; is something wonky?
    • Which part of the system is not working?
    • Is my server over capacity?
    • Do I need to scale my servers up or down?
  5. Problems
    • Is my (RDBMS|Caching|MQ) healthy?
    • What is the (Memory|Disk|CPU) usage of my servers?
    • Is the (Application Server|Web server|LoadBalancer) up?
    • What is the (Bandwidth|Network) usage? How does it compare with previous weeks?
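
As a rough illustration of the disk and CPU questions, the sketch below reads two host metrics with the Python standard library and turns them into alert messages. It assumes a Unix-like host (`os.getloadavg` is not available on Windows); the function names and the 90% and 1.0 limits are hypothetical. Memory usage is left out because the standard library has no portable call for it.

```python
import os
import shutil
from typing import List


def disk_usage_percent(path: str = "/") -> float:
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def load_per_cpu() -> float:
    """One-minute load average divided by the CPU count (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    return one_minute / (os.cpu_count() or 1)


def collect_alerts(
    disk_percent: float,
    load: float,
    disk_limit: float = 90.0,  # hypothetical threshold
    load_limit: float = 1.0,   # hypothetical threshold
) -> List[str]:
    """Return one message per metric that exceeds its limit."""
    alerts = []
    if disk_percent > disk_limit:
        alerts.append(f"disk usage {disk_percent:.1f}% exceeds {disk_limit:.1f}%")
    if load > load_limit:
        alerts.append(f"load per CPU {load:.2f} exceeds {load_limit:.2f}")
    return alerts
```

On a Unix host, `collect_alerts(disk_usage_percent("/"), load_per_cpu())` returns an empty list when both metrics are within their limits.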
  6. if failed port 80 for 3 times within 5 cycles then alert
    if cpu is greater than 50% for 5 cycles then restart
    check file mydb with path /data/mydatabase.db
      if size > 1 GB then alert
    check filesystem rootfs with path /
      if space usage > 90% then alert
    check process nginx with pidfile /var/run/nginx.pid
      start program = "/etc/init.d/nginx start"
      stop program = "/etc/init.d/nginx stop"
    check host xyz with address xyz.org
      if failed icmp type echo count 5 with timeout 15 seconds then alert
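
The rule "if failed port 80 for 3 times within 5 cycles then alert" amounts to counting failures over a sliding window of recent checks. The sketch below shows only that counting logic; it is not how the monitoring tool implements it, and the class name `FailureWindow` is hypothetical.

```python
from collections import deque


class FailureWindow:
    """Trigger when at least `failures` of the last `cycles` checks failed."""

    def __init__(self, failures: int, cycles: int) -> None:
        self.failures = failures
        # A bounded deque silently drops the oldest result once it is full.
        self.results = deque(maxlen=cycles)

    def record(self, ok: bool) -> bool:
        """Store one check result and report whether the rule fires now."""
        self.results.append(ok)
        failed = sum(1 for result in self.results if not result)
        return failed >= self.failures


# Usage: 3 failures within 5 cycles, fed with one result per polling cycle.
window = FailureWindow(failures=3, cycles=5)
for check_ok in [True, False, False, True, False]:
    should_alert = window.record(check_ok)
# After the fifth cycle the window holds three failures, so should_alert is True.
```

Requiring several failures within a window, rather than alerting on the first one, avoids waking someone up for a single dropped packet.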
  7. collectd
    http://collectd.org
    collectd is a daemon which collects system performance statistics periodically and provides mechanisms to store
  8. riemann
    http://riemann.io
    Riemann aggregates events from your servers and applications with a powerful stream processing
  9. munin
    http://munin-monitoring.org
    Easily monitor the performance of your computers, networks, and applications. Comes with a dashboard
  10. statsd
    https://github.com/etsy/statsd/
    A network daemon that runs on the Node.js platform and listens for statistics, like counters and timers, sent over
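
To make "counters and timers" concrete, the sketch below formats metrics in the StatsD line protocol (`name:value|c` for counters, `name:value|ms` for timers) and sends them as UDP datagrams. The helper names and the metric names in the usage note are hypothetical; the host and port assume a StatsD daemon listening on its default port, 8125, on the local machine.

```python
import socket


def format_counter(name: str, value: int = 1) -> bytes:
    """Build a counter packet, e.g. b'orders.shipped:1|c'."""
    return f"{name}:{value}|c".encode("ascii")


def format_timer(name: str, milliseconds: int) -> bytes:
    """Build a timer packet, e.g. b'page.load:320|ms'."""
    return f"{name}:{milliseconds}|ms".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one packet over UDP; assumes a daemon listens on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

An application would call, for example, `send_metric(format_counter("orders.shipped"))` after each shipped order. Because UDP is fire-and-forget, a missing or slow daemon does not block the application, at the cost of possibly losing some samples.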
  11. metrics
    http://metrics.codahale.com
    Metrics provides a powerful toolkit of ways to measure the behavior of critical components in your production
  12. logstash
    http://logstash.net
    Tool for managing events and logs. You can use it to collect logs, parse them, and store, comes with
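
The "parse them" step means turning a raw log line into named fields that can be stored and queried. The sketch below does this for a web access log in Common Log Format using a regular expression. It only illustrates the idea and is not how Logstash itself is configured; the function name and the sample line in the usage note are made up.

```python
import re
from typing import Dict, Optional, Union

# Common Log Format: host ident user [time] "method path protocol" status size
ACCESS_LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_line(line: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return the fields of one access-log line, or None if it does not match."""
    match = ACCESS_LOG_PATTERN.match(line)
    if match is None:
        return None
    fields: Dict[str, Union[str, int]] = dict(match.groupdict())
    fields["status"] = int(match.group("status"))
    # A size of "-" means no body was sent; store it as 0.
    size = match.group("size")
    fields["size"] = 0 if size == "-" else int(size)
    return fields
```

For a line such as `127.0.0.1 - - [17/May/2014:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512`, the function returns a dictionary with the host, time, method, path, and protocol as strings and the status and size as integers. Once the fields are structured, questions like "how many 5xx responses in the last hour" become simple queries.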
  13. What we learned
    • Availability is very important
    • Monitoring is a must
    • It makes troubleshooting easier
    • There are many solutions available