God

Covers usage of God, the Ruby process monitoring tool. Presented at ATLRUG in 2008.

Jesse Newland

October 02, 2011

Transcript

  1. god
    process and task monitoring
    done right
    Jesse Newland
    jnewland.com
    [email protected]

  3. FAILWHALE NEEDS
    NO INTRODUCTION

  4. Like it or not, the web is 24/7/365

  5. But who wants to be
    online 24/7/365?

  6. Sometimes, you’ve
    just gotta take a walk

  7. ZOMG WHAT NOW?

  8. Process monitoring

  9. sudo gem install god

  10. Written by: Tom Preston-Werner

  11. Follow along at home:
    git clone git://github.com/jnewland/god_examples.git

  12. The Basics

  13. $ ruby scripts/crashy.rb
    Wed Jul 09 13:53:13 -0400 2008
    Wed Jul 09 13:53:14 -0400 2008
    Wed Jul 09 13:53:15 -0400 2008
    /Users/jnewland/src/god_examples/lib/god_test.rb:28:in `crash': Crash! (RuntimeError)
        from /Users/jnewland/src/god_examples/lib/god_test.rb:20:in `run'
        from /Users/jnewland/src/god_examples/lib/god_test.rb:19:in `loop'
        from /Users/jnewland/src/god_examples/lib/god_test.rb:19:in `run'
        from /Users/jnewland/src/god_examples/lib/god_test.rb:15:in `initialize'
        from scripts/crashy.rb:4:in `new'
        from scripts/crashy.rb:4
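
For readers without the god_examples checkout, the behaviour in the output above (a timestamp every second, then an unhandled RuntimeError) can be approximated with a few lines of Ruby. This is only a hypothetical stand-in: the class name `Crashy` and its parameters are assumptions, and the real scripts/crashy.rb and lib/god_test.rb may be written differently.

```ruby
# Hypothetical stand-in for a crash-prone script: print the time once per
# tick, then raise. Not the actual code from the god_examples repository.
class Crashy
  def initialize(ticks_before_crash: 3, delay: 1, out: $stdout)
    @ticks_before_crash = ticks_before_crash
    @delay = delay
    @out = out
  end

  # Prints one timestamp per tick, sleeping between ticks, then crashes.
  def run
    @ticks_before_crash.times do
      @out.puts Time.now
      sleep @delay
    end
    raise 'Crash!' # a bare string raises RuntimeError
  end
end

# To behave like the script, call: Crashy.new.run
```

The `out` and `delay` parameters exist only so the sketch can be exercised quickly without writing to the terminal.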

    View Slide

  14. #simple.god
    #The simplest possible watch
    God.watch do |w|
      w.name = 'crashy'
      w.interval = 1.seconds
      w.start = 'ruby scripts/crashy.rb'
      w.start_if do |start|
        start.condition(:process_running) do |c|
          c.running = false
        end
      end
    end

  15. $ god -h
    ...
    Options:
      -c, --config-file CONFIG  Configuration file
      -p, --port PORT           Communications port (default 17165)
      -b, --auto-bind           Auto-bind to an unused port number
      -P, --pid FILE            Where to write the PID file
      -l, --log FILE            Where to write the log file
      -D, --no-daemonize        Don't daemonize
      -v, --version             Print the version number and exit

  16. $ god -c simple.god -D
    [... 20:19:33 #10897] INFO: Using pid file directory: /Users/jnewland/.god/pids
    [... 20:19:34 #10897] INFO: Started on drbunix:///tmp/god.17165.sock
    [... 20:19:34 #10897] INFO: crashy move 'unmonitored' to 'up'
    [... 20:19:34 #10897] INFO: crashy moved 'unmonitored' to 'up'
    [... 20:19:34 #10897] INFO: crashy [trigger] process is not running (ProcessRunning)
    [... 20:19:34 #10897] INFO: crashy move 'up' to 'start'
    [... 20:19:34 #10897] INFO: crashy start: ruby scripts/crashy.rb
    [... 20:19:34 #10897] INFO: crashy moved 'up' to 'up'
    [... 20:19:34 #10897] INFO: crashy [ok] process is running (ProcessRunning)
    [... 20:19:35 #10897] INFO: crashy [ok] process is running (ProcessRunning)
    [... 20:19:36 #10897] INFO: crashy [ok] process is running (ProcessRunning)
    [... 20:19:37 #10897] INFO: crashy [ok] process is running (ProcessRunning)
    [... 20:19:38 #10897] INFO: crashy [ok] process is running (ProcessRunning)
    [... 20:19:39 #10897] INFO: crashy [ok] process is running (ProcessRunning)
    [... 20:19:40 #10897] INFO: crashy [trigger] process is not running (ProcessRunning)
    [... 20:19:40 #10897] INFO: crashy move 'up' to 'start'
    [... 20:19:40 #10897] INFO: crashy start: ruby scripts/crashy.rb
    [... 20:19:40 #10897] INFO: crashy moved 'up' to 'up'
    [... 20:19:40 #10897] INFO: crashy [ok] process is running (ProcessRunning)
    [... 20:19:41 #10897] INFO: crashy [ok] process is running (ProcessRunning)
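
The log above shows god checking once per interval and starting crashy again whenever the process_running check fails. A minimal sketch of that poll-and-restart idea in plain Ruby follows. It is not how god itself is implemented (god daemonizes, tracks PID files, and drives a state machine); `child_running?` and `supervise` are hypothetical helper names, and the sketch only handles direct child processes.

```ruby
require 'rbconfig'

# True while the child process we spawned is still alive. waitpid with
# WNOHANG returns nil for a running child and reaps one that has exited.
def child_running?(pid)
  Process.waitpid(pid, Process::WNOHANG).nil?
rescue Errno::ECHILD
  false # already reaped, so it is definitely not running
end

# Poll every `interval` seconds for `checks` rounds, starting the command
# whenever it is not running. Returns how many times it was started.
def supervise(command, interval:, checks:)
  pid = nil
  starts = 0
  checks.times do
    unless pid && child_running?(pid)
      pid = Process.spawn(*command)
      starts += 1
    end
    sleep interval
  end
  starts
ensure
  # Do not leave a child behind when the sketch finishes.
  if pid && child_running?(pid)
    Process.kill('TERM', pid)
    Process.wait(pid)
  end
end
```

For example, `supervise(['ruby', 'scripts/crashy.rb'], interval: 1, checks: 10)` would restart the script each time a check finds it dead during roughly ten seconds.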

  21. $ god -c simple.god
    $

  22. $ god -c simple.god
    $ ps ax | grep ruby
    12512 ?? Ss 0:00.03 ruby /Users/jnewland/src/god_examples/scripts/crashy.rb
    12484 s001 S 0:00.36 /usr/bin/ruby /usr/bin/god -c simple.god

  23. $ god -c simple.god
    $ ps ax | grep ruby
    12512 ?? Ss 0:00.03 ruby /Users/jnewland/src/god_examples/scripts/crashy.rb
    12484 s001 S 0:00.36 /usr/bin/ruby /usr/bin/god -c simple.god
    $ god -h
    ...
    Commands:
      start      start task or group
      restart    restart task or group
      stop       stop task or group
      monitor    monitor task or group
      unmonitor  unmonitor task or group
      remove     remove task or group from god
      load       load a config into a running god
      log        show realtime log for given task
      status     show status of each task
      quit       stop god
      terminate  stop god and all tasks
      check      run self diagnostic

  24. $ god status
    crashy: up
    $ god restart crashy
    Sending 'restart' command
    The following watches were affected:
    crashy
    $ god stop crashy
    Sending 'stop' command
    The following watches were affected:
    crashy
    $ god status
    crashy: unmonitored
    $ god start crashy
    Sending 'start' command
    The following watches were affected:
    crashy
    $ god status
    crashy: up

  25. Controlling
    Leaky Processes

  26. #leaky.god
    God.watch do |w|
      w.name = "leaky"
      w.interval = 5.seconds
      w.start = 'ruby scripts/leaky.rb'
      w.start_if do |start|
        start.condition(:process_running) do |c|
          c.running = false
        end
      end
      w.restart_if do |restart|
        restart.condition(:memory_usage) do |c|
          c.above = 2.megabytes
        end
      end
    end
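
To make the memory_usage condition concrete, here is a rough sketch of the kind of check it implies: read a process's resident memory and compare it with a limit. The sketch shells out to `ps`, which is an assumption about the platform (a Unix-like system with a POSIX-style `ps`), and the helper names `rss_kilobytes` and `over_memory_limit?` are hypothetical. It also assumes the limit is expressed in kilobytes, so the 2 megabyte limit above becomes 2 * 1024.

```ruby
# Resident set size of a process in kilobytes, or nil if it cannot be read
# (unknown pid, or no ps binary on this machine).
def rss_kilobytes(pid)
  output = `ps -o rss= -p #{Integer(pid)}`
  output.strip.empty? ? nil : output.to_i
rescue Errno::ENOENT
  nil
end

# True when the process is known to use more than limit_kb of memory.
def over_memory_limit?(pid, limit_kb)
  rss = rss_kilobytes(pid)
  !rss.nil? && rss > limit_kb
end

limit_kb = 2 * 1024 # 2 megabytes, expressed in kilobytes
puts "restart needed: #{over_memory_limit?(Process.pid, limit_kb)}"
```

A monitor would run a check like this once per interval and restart the watched process when it returns true.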

  27. CPU
    Usage

  28. w.restart_if do |restart|
      restart.condition(:cpu_usage) do |c|
        c.above = 50.percent
        c.times = [3, 5]
      end
    end
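
The `c.times = [3, 5]` setting means the condition has to be true in 3 of the last 5 checks before the restart fires, which keeps a single CPU spike from bouncing the process. The sketch below illustrates that sliding-window idea in isolation; `TimesWindow` is a hypothetical class written for this example, not god's internal implementation, and the CPU readings are made up.

```ruby
# Sliding-window trigger: fires when at least `required` of the last
# `window` checks were over the limit.
class TimesWindow
  def initialize(required, window)
    @required = required
    @window = window
    @history = []
  end

  # Record one check result and report whether the condition now triggers.
  def record(over_limit)
    @history << (over_limit ? true : false)
    @history.shift if @history.size > @window
    @history.count(true) >= @required
  end
end

cpu_samples = [62, 40, 71, 45, 55] # percent, hypothetical readings
trigger = TimesWindow.new(3, 5)
results = cpu_samples.map { |cpu| trigger.record(cpu > 50) }
p results # => [false, false, false, false, true]
```

Only the fifth reading triggers, because it is the third reading above 50 percent within the five-check window.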

  29. HTTP Status Codes

  30. w.restart_if do |restart|
      restart.condition(:http_response_code) do |c|
        c.host = 'localhost'
        c.port = '80'
        c.path = '/heartbeat'
        c.code_is_not = %w(200 304)
      end
    end
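
The same kind of check can be tried by hand with Ruby's standard Net::HTTP library. The sketch below requests a path and reports whether the status code falls outside an allowed list, mirroring the intent of `code_is_not`. The method name `restart_needed?` is hypothetical, and treating an unreachable server as a failure is a choice made for this sketch, not a statement about god's behaviour.

```ruby
require 'net/http'

# True when the endpoint is unreachable or answers with a status code
# outside ok_codes.
def restart_needed?(host:, port:, path:, ok_codes: %w[200 304], timeout: 2)
  http = Net::HTTP.new(host, port.to_i, nil) # nil: ignore proxy settings from the environment
  http.open_timeout = timeout
  http.read_timeout = timeout
  response = http.start { |conn| conn.get(path) }
  !ok_codes.include?(response.code) # response.code is a String such as "200"
rescue SystemCallError, SocketError, Timeout::Error, EOFError
  true # connection refused, DNS failure, timeout, or dropped connection
end
```

With the values from the slide, the call would be `restart_needed?(host: 'localhost', port: '80', path: '/heartbeat')`.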

  31. Notifications

  32. #email_contacts.god
    God::Contacts::Email.message_settings = {
      :from => '[email protected]'
    }
    God::Contacts::Email.server_settings = {
      :address => "smtp.jnewland.com",
      :port => 25,
      :domain => "jnewland.com",
      :authentication => :plain,
      :user_name => "god",
      :password => ""
    }
    God.contact(:email) do |c|
      c.name = 'jesse'
      c.email = '[email protected]'
    end

  33. #http://github.com/mojombo/god/tree/master/lib/god/contacts/jabber.rb
    require 'jabber'
    God::Contacts::Jabber.settings = {
      :jabber_id => '[email protected]',
      :password => ' '
    }
    God.contact(:jabber) do |c|
      c.name = 'jesse'
      c.jabber_id = '[email protected]'
    end

  34. w.restart_if do |restart|
      restart.condition(:cpu_usage) do |c|
        c.above = 50.percent
        c.times = [3, 5]
        c.notify = "jesse"
      end
    end

  35. Monitoring
    Mongrels

  36. Putting it all together
    • Process Running
    • Memory Usage
    • CPU Usage
    • HTTP Response Code
    • Notifications
    • Capistrano?
    • Web Interface?

  37. #rails/config/god/app.god
    RAILS_ROOT = ENV['RAILS_ROOT'] ||= "/var/www/apps/test/current"
    RUBY = `which ruby`.chomp
    MONGREL_RAILS = `which mongrel_rails`.chomp
    RAILS_ENV = ENV['RAILS_ENV'] ||= 'production'
    MONGRELS = 2
    MONGREL_START_PORT = 3000
    USER = GROUP = 'deploy'
    0.upto(MONGRELS - 1) do |n|
      port = MONGREL_START_PORT + n
      God.watch do |w|
        w.group = 'mongrels'
        w.name = "mongrel_#{port}"
        w.uid = USER
        w.gid = GROUP
        w.interval = 30.seconds
        w.start = "#{RUBY} #{MONGREL_RAILS} start --environment #{RAILS_ENV} " \
                  "--chdir #{RAILS_ROOT} --port #{port}"
        w.start_grace = 90.seconds
        w.restart_grace = 90.seconds
        w.log = File.join(RAILS_ROOT, "log/mongrel_#{port}.log")
        #process running
        #memory usage
        #cpu usage
        #http response code
      end
    end
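
The loop in this config produces one watch per Mongrel port. The sketch below isolates just that enumeration so the resulting names, commands, and log paths are easy to see. `mongrel_watches` is a hypothetical helper that returns plain hashes; it does not call god, and the bare `mongrel_rails` command stands in for the `#{RUBY} #{MONGREL_RAILS}` pair used in the config.

```ruby
# Build one hypothetical watch description per Mongrel port.
def mongrel_watches(count:, start_port:, rails_env:, rails_root:)
  (0...count).map do |n|
    port = start_port + n
    {
      name: "mongrel_#{port}",
      start: "mongrel_rails start --environment #{rails_env} " \
             "--chdir #{rails_root} --port #{port}",
      log: File.join(rails_root, "log/mongrel_#{port}.log")
    }
  end
end

watches = mongrel_watches(count: 2, start_port: 3000,
                          rails_env: 'production',
                          rails_root: '/var/www/apps/test/current')
watches.each { |watch| puts "#{watch[:name]}: #{watch[:start]}" }
```

With two Mongrels starting at port 3000, this yields mongrel_3000 and mongrel_3001, matching the watch names the config registers under the 'mongrels' group.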

  38. Pulse Controller
    class PulseController < ApplicationController
      session :off
      def pulse
        if (ActiveRecord::Base.connection.execute("select 1").num_rows rescue 0) == 1
          render :text => "OK #{Time.now.utc.to_s(:db)}"
        else
          render :text => 'ERROR', :status => :internal_server_error
        end
      end
    end

  39. Capistrano

  40. #rails/config/deploy.rb
    role :app, "test.jnewland.com"
    require 'san_juan'
    san_juan.role :app, %w(mongrels)
    #overwrite the default start, stop, and restart tasks to use god
    namespace :deploy do
      desc "Use god to restart the app"
      task :restart do
        god.all.reload
        god.app.mongrels.restart
      end
      desc "Use god to start the app"
      task :start do
        god.all.start
      end
      desc "Use god to stop the app"
      task :stop do
        god.all.terminate
      end
    end

  41. $ cap -T
    ...
    cap god:all:quit               # Quit god, but not the processes it's monitoring
    cap god:all:reload             # Reloading God Config
    cap god:all:start              # Start god
    cap god:all:start_interactive  # Start god interactively
    cap god:all:status             # Describe the status of the running tasks on ...
    cap god:all:terminate          # Terminate god and all monitored processes
    cap god:app:mongrels:log       # Log mongrels
    cap god:app:mongrels:remove    # Remove mongrels
    cap god:app:mongrels:restart   # Restart mongrels
    cap god:app:mongrels:start     # Start mongrels
    cap god:app:mongrels:stop      # Stop mongrels
    cap god:app:mongrels:unmonitor # Unmonitor mongrels
    cap god:app:quit               # Quit god, but not the processes it's monitoring
    cap god:app:reload             # Reload the god config file
    cap god:app:start              # Start god
    cap god:app:start_interactive  # Start god interactively
    cap god:app:status             # Describe the status of the running tasks
    cap god:app:terminate          # Terminate god and all monitored processes
    ...

  42. http://github.com/jnewland/san_juan

  43. ZOMG WHAT NOW?

  44. #rails/config/god/app.god
    ...
    require 'god_web'
    GodWeb.watch(:port => 3003)
    ...

  47. http://github.com/jnewland/god_web

  48. Advanced
    Features

  49. Lambda Conditions
    #jabber_bot.god
    w.restart_if do |restart|
      restart.condition(:lambda) do |c|
        c.interval = 15.seconds
        c.lambda = lambda do
          require 'xmpp4r-simple'
          im = Jabber::Simple.new(
            '[email protected]',
            PASSWORDS['[email protected]']
          )
          im.deliver('[email protected]', 'ping')
          sleep(5)
          return true unless im.received_messages?
          chat = im.received_messages.find { |msg| msg.type == :chat }
          return true unless chat.body =~ /pong/
        end
      end
    end
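
The decision at the heart of this lambda (restart unless a chat reply containing "pong" came back) can be tested without a Jabber connection. The sketch below models received messages as plain hashes, which is an assumption made for illustration, and `bot_unresponsive?` is a hypothetical name. Unlike the slide, it also treats "messages arrived, but none of them was a chat message" as a failure instead of calling `body` on nil.

```ruby
# True (meaning: restart) unless some chat reply contains "pong".
# `replies` is an array of hashes with :type and :body keys.
def bot_unresponsive?(replies)
  chat = replies.find { |msg| msg[:type] == :chat }
  chat.nil? || chat[:body] !~ /pong/
end

p bot_unresponsive?([{ type: :chat, body: 'pong' }]) # => false
p bot_unresponsive?([])                               # => true
```

Keeping the decision in a small pure method like this makes the condition easy to check before it is wired into a lambda.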

  50. Behaviors
    #custom_behavior.god
    module God
      module Behaviors
        class Speak < Behavior
          def before_start
            `say "Starting now"`
            'announced start'
          end
          def before_stop
            `say "Stopping now"`
            'announced stop'
          end
        end
      end
    end
    God.watch do |w|
      ...
      w.behavior(:speak)
      ...
    end

  51. mongrel_cluster
    #mongrel_cluster.god
    require 'lib/god_mongrel_cluster'
    Dir.glob('/etc/mongrel_cluster/*.conf').each do |mongrel_cluster|
      cluster = GodMongrelCluster.new(mongrel_cluster)
      cluster.watch
    end

  52. Questions?

  53. http://www.flickr.com/photos/stuckincustoms/522313332/
    http://www.flickr.com/photos/[email protected]/2335651912/
    http://www.flickr.com/photos/code_martial/1411893703/
    http://www.flickr.com/photos/extranoise/163847669/
    http://www.flickr.com/photos/vanz/2480741207/
    http://www.flickr.com/photos/smartjunco/281071006/
    http://www.flickr.com/photos/davesag/8312984/
    http://www.flickr.com/photos/gaetanlee/298178764/
    http://www.flickr.com/photos/vrogy/511644410/
    http://www.flickr.com/photos/jeffsmallwood/299208539/
    http://www.flickr.com/photos/cjdaniel/2240123159/
    http://www.flickr.com/photos/bobbygreg/139080175/
    http://www.flickr.com/photos/lordelo/12958772/
    Hooray Flickr! (And Creative Commons)

  54. http://creativecommons.org/licenses/by-sa/2.0/deed.en
