
God


Covers usage of God, the Ruby process monitoring tool. Presented at ATLRUG in 2008.

Jesse Newland

October 02, 2011

Transcript

  1. god: process and task monitoring done right. Jesse Newland, jnewland.com, jesse@railsmachine.com
  2. None
  3. FAILWHALE NEEDS NO INTRODUCTION

  4. Like it or not, the web is 24/7/365

  5. But who wants to be online 24/7/365?

  6. Sometimes, you’ve just gotta take a walk

  7. ZOMG WHAT NOW?

  8. Process monitoring

  9. sudo gem install god

  10. Written by: Tom Preston-Werner

  11. Follow along at home: git clone git://github.com/jnewland/god_examples.git

  12. The Basics

  13. $ ruby scripts/crashy.rb
      Wed Jul 09 13:53:13 -0400 2008
      Wed Jul 09 13:53:14 -0400 2008
      Wed Jul 09 13:53:15 -0400 2008
      /Users/jnewland/src/god_examples/lib/god_test.rb:28:in `crash': Crash! (RuntimeError)
        from /Users/jnewland/src/god_examples/lib/god_test.rb:20:in `run'
        from /Users/jnewland/src/god_examples/lib/god_test.rb:19:in `loop'
        from /Users/jnewland/src/god_examples/lib/god_test.rb:19:in `run'
        from /Users/jnewland/src/god_examples/lib/god_test.rb:15:in `initialize'
        from scripts/crashy.rb:4:in `new'
        from scripts/crashy.rb:4
  14. #simple.god
      #The simplest possible watch
      God.watch do |w|
        w.name = 'crashy'
        w.interval = 1.seconds
        w.start = 'ruby scripts/crashy.rb'
        w.start_if do |start|
          start.condition(:process_running) do |c|
            c.running = false
          end
        end
      end
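The watch above amounts to a polling loop: at every interval, check whether the process is alive and start it if it is not. The sketch below imitates that idea in plain Ruby, without the god gem. `process_running?` and `supervise` are hypothetical names made up for illustration; this is not how god is implemented internally.

```ruby
require 'rbconfig'

# Hypothetical helper (not part of god): true while the child with this pid
# is still running. Reaps the child as a side effect once it has exited.
def process_running?(pid)
  return false if pid.nil?
  # With WNOHANG, waitpid returns nil while the child is still alive.
  Process.waitpid(pid, Process::WNOHANG).nil?
rescue Errno::ECHILD
  false # no such child: already reaped or never started
end

# Hypothetical poll loop: every `interval` seconds, start `command` if it is
# not running. Stops after `checks` polls and returns the number of starts.
def supervise(command, interval:, checks:)
  pid = nil
  starts = 0
  checks.times do
    unless process_running?(pid)
      pid = Process.spawn(*command)
      starts += 1
    end
    sleep interval
  end
  starts
ensure
  # Clean up the last child so the sketch leaves nothing behind.
  if process_running?(pid)
    Process.kill('TERM', pid)
    Process.wait(pid)
  end
end
```

Calling `supervise([RbConfig.ruby, 'scripts/crashy.rb'], interval: 1, checks: 10)` would restart the crashing script whenever a poll finds it dead, which is roughly the behaviour the `:process_running` condition gives you declaratively.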
  15. $ god -h
      ...
      Options:
        -c, --config-file CONFIG    Configuration file
        -p, --port PORT             Communications port (default 17165)
        -b, --auto-bind             Auto-bind to an unused port number
        -P, --pid FILE              Where to write the PID file
        -l, --log FILE              Where to write the log file
        -D, --no-daemonize          Don't daemonize
        -v, --version               Print the version number and exit
  16. $ god -c simple.god -D
      [... 20:19:33 #10897] INFO: Using pid file directory: /Users/jnewland/.god/pids
      [... 20:19:34 #10897] INFO: Started on drbunix:///tmp/god.17165.sock
      [... 20:19:34 #10897] INFO: crashy move 'unmonitored' to 'up'
      [... 20:19:34 #10897] INFO: crashy moved 'unmonitored' to 'up'
      [... 20:19:34 #10897] INFO: crashy [trigger] process is not running (ProcessRunning)
      [... 20:19:34 #10897] INFO: crashy move 'up' to 'start'
      [... 20:19:34 #10897] INFO: crashy start: ruby scripts/crashy.rb
      [... 20:19:34 #10897] INFO: crashy moved 'up' to 'up'
      [... 20:19:34 #10897] INFO: crashy [ok] process is running (ProcessRunning)
      [... 20:19:35 #10897] INFO: crashy [ok] process is running (ProcessRunning)
      [... 20:19:36 #10897] INFO: crashy [ok] process is running (ProcessRunning)
      [... 20:19:37 #10897] INFO: crashy [ok] process is running (ProcessRunning)
      [... 20:19:38 #10897] INFO: crashy [ok] process is running (ProcessRunning)
      [... 20:19:39 #10897] INFO: crashy [ok] process is running (ProcessRunning)
      [... 20:19:40 #10897] INFO: crashy [trigger] process is not running (ProcessRunning)
      [... 20:19:40 #10897] INFO: crashy move 'up' to 'start'
      [... 20:19:40 #10897] INFO: crashy start: ruby scripts/crashy.rb
      [... 20:19:40 #10897] INFO: crashy moved 'up' to 'up'
      [... 20:19:40 #10897] INFO: crashy [ok] process is running (ProcessRunning)
      [... 20:19:41 #10897] INFO: crashy [ok] process is running (ProcessRunning)
  21. $ god -c simple.god
      $

  22. $ god -c simple.god
      $ ps ax | grep ruby
      12512 ?? Ss 0:00.03 ruby /Users/jnewland/src/god_examples/scripts/crashy.rb
      12484 s001 S 0:00.36 /usr/bin/ruby /usr/bin/god -c simple.god
  23. $ god -c simple.god
      $ ps ax | grep ruby
      12512 ?? Ss 0:00.03 ruby /Users/jnewland/src/god_examples/scripts/crashy.rb
      12484 s001 S 0:00.36 /usr/bin/ruby /usr/bin/god -c simple.god
      $ god -h
      ...
      Commands:
        start <task or group name>       start task or group
        restart <task or group name>     restart task or group
        stop <task or group name>        stop task or group
        monitor <task or group name>     monitor task or group
        unmonitor <task or group name>   unmonitor task or group
        remove <task or group name>      remove task or group from god
        load <file>                      load a config into a running god
        log <task name>                  show realtime log for given task
        status                           show status of each task
        quit                             stop god
        terminate                        stop god and all tasks
        check                            run self diagnostic
  24. $ god status
      crashy: up
      $ god restart crashy
      Sending 'restart' command
      The following watches were affected:
        crashy
      $ god stop crashy
      Sending 'stop' command
      The following watches were affected:
        crashy
      $ god status
      crashy: unmonitored
      $ god start crashy
      Sending 'start' command
      The following watches were affected:
        crashy
      $ god status
      crashy: up
  25. Controlling Leaky Processes

  26. #leaky.god
      God.watch do |w|
        w.name = "leaky"
        w.interval = 5.seconds
        w.start = 'ruby scripts/leaky.rb'
        w.start_if do |start|
          start.condition(:process_running) do |c|
            c.running = false
          end
        end
        w.restart_if do |restart|
          restart.condition(:memory_usage) do |c|
            c.above = 2.megabytes
          end
        end
      end
  27. CPU Usage

  28. w.restart_if do |restart|
        restart.condition(:cpu_usage) do |c|
          c.above = 50.percent
          c.times = [3, 5]
        end
      end
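`c.times = [3, 5]` tells god to act only when the condition held in 3 of the last 5 checks, so a single CPU spike does not cause a restart. A minimal plain-Ruby sketch of that windowing idea follows. The `SlidingTrigger` class and the sample values are hypothetical and only illustrate the concept; this is not god's own implementation.

```ruby
# Hypothetical illustration (not god's code): remembers the outcome of the
# last `window` checks and fires once at least `required` of them were true.
class SlidingTrigger
  def initialize(required, window)
    @required = required
    @window = window
    @history = []
  end

  # Record one check result; returns true when the trigger should fire.
  def record(exceeded)
    @history << exceeded
    @history.shift if @history.size > @window
    @history.count(true) >= @required
  end
end

trigger = SlidingTrigger.new(3, 5)
cpu_samples = [60, 40, 70, 30, 80] # made-up CPU percentages
fired = cpu_samples.map { |cpu| trigger.record(cpu > 50) }
# fired == [false, false, false, false, true]
```

Only the fifth sample fires, because that is the first point at which three of the last five readings were above 50 percent.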
  29. HTTP Status Codes

  30. w.restart_if do |restart|
        restart.condition(:http_response_code) do |c|
          c.host = 'localhost'
          c.port = '80'
          c.path = '/heartbeat'
          c.code_is_not = %w(200 304)
        end
      end
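To try the same check by hand, something like the following plain-Ruby sketch could be used. It requests the heartbeat path and treats any status outside 200 and 304 as a failure. `heartbeat_code` and `unhealthy?` are hypothetical helpers, and counting an unreachable server as a failure is an assumption of this sketch, not a statement about how god's condition behaves.

```ruby
require 'net/http'

OK_CODES = %w(200 304).freeze # same codes as on the slide

# Hypothetical helper: GET the path and return the status code as a string,
# or nil if the server could not be reached in time.
def heartbeat_code(host, port, path, timeout: 2)
  Net::HTTP.start(host, port, open_timeout: timeout, read_timeout: timeout) do |http|
    http.get(path).code
  end
rescue SystemCallError, SocketError, Timeout::Error, EOFError
  nil
end

# True when the response (or the lack of one) should count as a failure.
def unhealthy?(code, ok_codes = OK_CODES)
  !ok_codes.include?(code)
end
```

For example, `unhealthy?(heartbeat_code('localhost', 80, '/heartbeat'))` returns true when the application answers with an error status or does not answer at all.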
  31. Notifications

  32. #email_contacts.god
      God::Contacts::Email.message_settings = {
        :from => 'god@jnewland.com'
      }
      God::Contacts::Email.server_settings = {
        :address => "smtp.jnewland.com",
        :port => 25,
        :domain => "jnewland.com",
        :authentication => :plain,
        :user_name => "god",
        :password => ""
      }
      God.contact(:email) do |c|
        c.name = 'jesse'
        c.email = 'jnewland@gmail.com'
      end
  33. #http://github.com/mojombo/god/tree/master/lib/god/contacts/jabber.rb
      require 'jabber'
      God::Contacts::Jabber.settings = {
        :jabber_id => 'bot@jnewland.com',
        :password => ' '
      }
      God.contact(:jabber) do |c|
        c.name = 'jesse'
        c.jabber_id = 'jnewland@gmail.com'
      end
  34. w.restart_if do |restart|
        restart.condition(:cpu_usage) do |c|
          c.above = 50.percent
          c.times = [3, 5]
          c.notify = "jesse"
        end
      end
  35. Monitoring Mongrels

  36. Putting it all together
      • Process Running
      • Memory Usage
      • CPU Usage
      • HTTP Response Code
      • Notifications
      • Capistrano?
      • Web Interface?
  37. #rails/config/god/app.god
      RAILS_ROOT = ENV['RAILS_ROOT'] ||= "/var/www/apps/test/current"
      RUBY = `which ruby`.chomp
      MONGREL_RAILS = `which mongrel_rails`.chomp
      RAILS_ENV = ENV['RAILS_ENV'] ||= 'production'
      MONGRELS = 2
      MONGREL_START_PORT = 3000
      USER = GROUP = 'deploy'
      0.upto(MONGRELS-1) do |n|
        port = MONGREL_START_PORT+n
        God.watch do |w|
          w.group = 'mongrels'
          w.name = "mongrel_#{port}"
          w.uid = USER
          w.gid = GROUP
          w.interval = 30.seconds
          w.start = "#{RUBY} #{MONGREL_RAILS} start --environment #{RAILS_ENV} --chdir #{RAILS_ROOT} --port #{port}"
          w.start_grace = 90.seconds
          w.restart_grace = 90.seconds
          w.log = File.join(RAILS_ROOT, "log/mongrel_#{port}.log")
          #process running
          #memory usage
          #cpu usage
          #http response code
        end
      end
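To see what the loop in app.god expands to, the plain-Ruby sketch below builds the name, start command, and log path for each mongrel without needing god or mongrel installed. `mongrel_watch_specs` is a hypothetical helper, and it uses a bare `mongrel_rails` command where the slide resolves absolute paths with `which`.

```ruby
# Hypothetical helper: returns the name, start command, and log path that
# the loop in app.god would produce for each mongrel.
def mongrel_watch_specs(count:, start_port:, rails_root:, rails_env:)
  (0...count).map do |n|
    port = start_port + n
    {
      name: "mongrel_#{port}",
      start: "mongrel_rails start --environment #{rails_env} " \
             "--chdir #{rails_root} --port #{port}",
      log: File.join(rails_root, "log/mongrel_#{port}.log")
    }
  end
end

specs = mongrel_watch_specs(count: 2, start_port: 3000,
                            rails_root: '/var/www/apps/test/current',
                            rails_env: 'production')
specs.each { |spec| puts "#{spec[:name]}: #{spec[:start]}" }
```

With the values from the slide this yields two watches, `mongrel_3000` and `mongrel_3001`, one per port, all belonging to the same `mongrels` group in the real config.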
  38. Pulse Controller
      class PulseController < ApplicationController
        session :off
        def pulse
          if (ActiveRecord::Base.connection.execute("select 1").num_rows rescue 0) == 1
            render :text => "OK #{Time.now.utc.to_s(:db)}"
          else
            render :text => 'ERROR', :status => :internal_server_error
          end
        end
      end
  39. Capistrano

  40. #rails/config/deploy.rb
      role :app, "test.jnewland.com"
      require 'san_juan'
      san_juan.role :app, %w(mongrels)
      #overwrite the default start, stop, and restart tasks to use god
      namespace :deploy do
        desc "Use god to restart the app"
        task :restart do
          god.all.reload
          god.app.mongrels.restart
        end
        desc "Use god to start the app"
        task :start do
          god.all.start
        end
        desc "Use god to stop the app"
        task :stop do
          god.all.terminate
        end
      end
  41. $ cap -T
      ...
      cap god:all:quit                 # Quit god, but not the processes it's monitoring
      cap god:all:reload               # Reloading God Config
      cap god:all:start                # Start god
      cap god:all:start_interactive    # Start god interactively
      cap god:all:status               # Describe the status of the running tasks on ...
      cap god:all:terminate            # Terminate god and all monitored processes
      cap god:app:mongrels:log         # Log mongrels
      cap god:app:mongrels:remove      # Remove mongrels
      cap god:app:mongrels:restart     # Restart mongrels
      cap god:app:mongrels:start       # Start mongrels
      cap god:app:mongrels:stop        # Stop mongrels
      cap god:app:mongrels:unmonitor   # Unmonitor mongrels
      cap god:app:quit                 # Quit god, but not the processes it's monitoring
      cap god:app:reload               # Reload the god config file
      cap god:app:start                # Start god
      cap god:app:start_interactive    # Start god interactively
      cap god:app:status               # Describe the status of the running tasks
      cap god:app:terminate            # Terminate god and all monitored processes
      ...
  42. http://github.com/jnewland/san_juan

  43. ZOMG WHAT NOW?

  44. #rails/config/god/app.god
      ...
      require 'god_web'
      GodWeb.watch(:port => 3003)
      ...

  45. None
  46. None
  47. http://github.com/jnewland/god_web

  48. Advanced Features

  49. Lambda Conditions
      #jabber_bot.god
      w.restart_if do |restart|
        restart.condition(:lambda) do |c|
          c.interval = 15.seconds
          c.lambda = lambda do
            require 'xmpp4r-simple'
            im = Jabber::Simple.new(
              'god@jnewland.com',
              PASSWORDS['god@jnewland.com']
            )
            im.deliver('bot@jnewland.com', 'ping')
            sleep(5)
            return true unless im.received_messages?
            chat = im.received_messages.find { |msg| msg.type == :chat }
            #restart if no chat reply arrived or the reply was not a pong
            chat.nil? || chat.body !~ /pong/
          end
        end
      end
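The restart decision inside the lambda can be pulled out into a small function and exercised without a Jabber server. The sketch below does that in plain Ruby. `Message` is a hypothetical stand-in for the XMPP message objects and `ping_failed?` is a made-up helper name; only the `type` and `body` fields the slide reads are modelled.

```ruby
# Stand-in for an incoming message; only the two fields the check reads.
Message = Struct.new(:type, :body)

# Hypothetical helper: true (meaning "restart") unless a chat message
# containing "pong" is among the replies. A missing chat message counts
# as a failed ping rather than raising an error.
def ping_failed?(messages)
  chat = messages.find { |msg| msg.type == :chat }
  chat.nil? || chat.body !~ /pong/
end

ping_failed?([Message.new(:chat, 'pong')]) # false: the bot answered
ping_failed?([])                           # true: no reply at all
```

A lambda condition fires its action when the lambda returns a truthy value, so the lambda body could end by calling a helper like this on the received messages.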
  50. Behaviors
      #custom_behavior.god
      module God
        module Behaviors
          class Speak < Behavior
            def before_start
              `say "Starting now"`
              'announced start'
            end
            def before_stop
              `say "Stopping now"`
              'announced stop'
            end
          end
        end
      end
      God.watch do |w|
        ...
        w.behavior(:speak)
        ...
      end
  51. mongrel_cluster
      #mongrel_cluster.god
      require 'lib/god_mongrel_cluster'
      Dir.glob('/etc/mongrel_cluster/*.conf').each do |mongrel_cluster|
        cluster = GodMongrelCluster.new(mongrel_cluster)
        cluster.watch
      end
  52. Questions?

  53. Hooray Flickr! (And Creative Commons)
      http://www.flickr.com/photos/stuckincustoms/522313332/
      http://www.flickr.com/photos/91499534@N00/2335651912/
      http://www.flickr.com/photos/code_martial/1411893703/
      http://www.flickr.com/photos/extranoise/163847669/
      http://www.flickr.com/photos/vanz/2480741207/
      http://www.flickr.com/photos/smartjunco/281071006/
      http://www.flickr.com/photos/davesag/8312984/
      http://www.flickr.com/photos/gaetanlee/298178764/
      http://www.flickr.com/photos/vrogy/511644410/
      http://www.flickr.com/photos/jeffsmallwood/299208539/
      http://www.flickr.com/photos/cjdaniel/2240123159/
      http://www.flickr.com/photos/bobbygreg/139080175/
      http://www.flickr.com/photos/lordelo/12958772/
  54. http://creativecommons.org/licenses/by-sa/2.0/deed.en