Monit - unix monitoring utility

Mann
May 09, 2013


Transcript

  1. Problem

    - You don't know when your system is going to fail.
    - When it fails, you need to get it back up and running ASAP.
  2. Monit: What & Why

    - Automatically monitors your system
    - Lightweight
    - Easy to install
    - Very stable
    - Does not require a separate management server
    - Has a great configuration syntax
  3. What to monitor

    - Disk space
    - Memory usage
    - CPU consumption
    - Process
    - Host
    - Port
    - File / Directory
  4. Check system

    check system localhost.localdomain
      if loadavg (1min) > 8 then alert
      if loadavg (5min) > 5 then alert
      if memory usage > 75% then alert
      if swap usage > 25% then alert
      if cpu usage (user) > 70% then alert
      if cpu usage (system) > 30% then alert
      if cpu usage (wait) > 20% then alert
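
The `alert` action here, and the `for N cycles` conditions on the later slides, depend on global settings that the deck does not show. A minimal sketch, assuming a 60-second poll interval, a local mail relay, and a placeholder recipient address (all three values are illustrative, not taken from the slides):

```monit
# Poll every 60 seconds; one poll is one "cycle"
set daemon 60

# Mail server used to deliver alerts (assumed: an SMTP relay on this host)
set mailserver localhost

# Recipient of alert mails (placeholder address)
set alert admin@example.com
```

With this interval, `for 2 cycles` means the condition must be true on two consecutive polls, which are about a minute apart.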
  5. Check process

    check process apache with pidfile /var/run/apache2.pid
      group www
      start "/etc/init.d/apache2 start"
      stop "/etc/init.d/apache2 stop"
      if cpu > 60% for 2 cycles then alert
      if cpu > 80% for 25 cycles then restart
      if totalmem > 2100.0 MB for 5 cycles then restart
      if loadavg (5min) > 10 for 8 cycles then stop
      if failed port 80 then restart
      if 5 restarts within 5 cycles then timeout
  6. Check port

    check process mysql with pidfile /var/run/mysqld/mysqld.pid
      group db
      start "/etc/init.d/mysql start"
      stop "/etc/init.d/mysql stop"
      if failed port 3306 protocol mysql then restart
      if 5 restarts within 5 cycles then timeout
  7. Check host, ping & port

    check host my_example with address example.com
      group network
      if failed icmp type echo with timeout 15 seconds then alert
      if failed port 5060 type udp protocol sip then alert
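
The same kind of host check can also test a web endpoint by speaking HTTP instead of SIP. A hedged sketch that reuses example.com from the slide; the check name `my_website`, the `/` request path, and the 10-second timeout are assumptions:

```monit
check host my_website with address example.com
  group network
  # Fail if an HTTP request for "/" on port 80 gets no successful response in time
  if failed port 80 protocol http request "/" with timeout 10 seconds then alert
```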
  8. Matching process name

    check process verboice with matching "ruby lib/services/broker.rb"
      group app
      start "/sbin/start verboice"
      stop "/sbin/stop verboice"
      if cpu > 60% for 2 cycles then alert
      if cpu > 80% for 15 cycles then restart
      if totalmem > 1800.0 MB for 5 cycles then restart
      if loadavg (5min) > 8 for 8 cycles then restart
      if 5 restarts within 5 cycles then timeout
  9. Check file & directory

    check file access_log with path /var/log/access_log
      if size > 100 Mb then alert

    check directory sbin with path /sbin
      if changed timestamp then alert
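
Slide 3 lists disk space among the things to monitor, but no slide shows a check for it. A minimal sketch using Monit's filesystem check; the check name `rootfs` and the 80% thresholds are assumptions:

```monit
check filesystem rootfs with path /
  # Alert when used space or used inodes pass the assumed thresholds
  if space usage > 80% then alert
  if inode usage > 80% then alert
```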
  10. Monit command-line

    $> monit summary
    $> monit status
    $> monit reload
    $> monit start [all | name]
    $> monit stop [all | name]
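
Most of these commands talk to the running Monit daemon through its built-in HTTP interface, so the control file needs an httpd section for them to work fully. A minimal sketch, assuming the default port 2812 and local-only access:

```monit
# Embedded HTTP interface used by the monit command-line client
set httpd port 2812
  use address localhost   # listen on the loopback interface only
  allow localhost         # accept connections from this machine only
```

After editing the control file, `monit -t` checks its syntax before `monit reload` applies it.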