RailsConf 2012 - Stack Smashing (Cornflower Blue)

David Czarnecki

April 25, 2012

Transcript

  1. stack smashing railsconf 2012 http://speakerdeck.com/u/czarneckid/

  2. david czarnecki

  3. twitter @czarneckid

  4. github/czarneckid

  5. work @agoragames

  6. github/agoragames

  7. infrastructure insanity

  8. CEO priority

  9. 1 month tour of duty

  10. simplify (allthethings)

  11. document (allthethings)

  12. network overview

  13. 15 applications intertwined

  14. SSO, Profile Service, Community, Pro Circuit, Live Experience, Store, Photo Tool, Carousel Tool, League Tool, Entitlements, Redemption, Starcraft Arena, MLG.tv, Progamer, Pro Stats

  15. app * capacity

  16. 15 * 2 = 30 VMs

  17. VM profile: 1 GB RAM, 40 GB disk

  18. MLG traffic #s: 4MM views, 1.2MM uniques, 35MM page views

  19. quickly need a lot of VMs

  20. more servers, more problems

  21. we <3 hardware: 4 processors, 6 cores/processor, 64 GB RAM, 146 GB disks

  22. chef recipes

  23. mostly stock

  24. application migration

  25. start internal

  26. end external

  27. iteration breeds abstraction

  28. application upgrading

  29. embrace the pipeline

  30. git checkout -b rails32

  31. gem 'rails', '3.2.0'

  32. asset pipeline is OPTIONAL

  33. Gemfile

        group :assets do
          gem 'sass-rails', '~> 3.2.3'
          gem 'coffee-rails', '~> 3.2.1'
          gem 'compass', '= 0.12.alpha.4'
          gem 'uglifier', '~> 1.0.3'
        end

        gem 'jquery-rails'

  34. application.rb

        if defined?(Bundler)
          # If you precompile assets before deploying to production, use this line
          Bundler.require(*Rails.groups(:assets => %w(development test)))
          # If you want your assets lazily compiled in production, use this line
          # Bundler.require(:default, :assets, Rails.env)
        end
        ...
        # Enable the asset pipeline
        config.assets.enabled = true

  35. production.rb

        # Compress assets and add digests
        config.assets.compress = true
        config.assets.js_compressor = Uglifier.new(:copyright => false) if defined?(Uglifier)
        config.assets.digest = true

        # Precompile additional assets (application.js, application.css, and all non-JS/CSS are already added)
        config.assets.precompile += %w( home.css home.js admin.css admin.js custom-application.js )
        config.assets.precompile += [/plugins\/jquery\.ui\.selectmenu\.(css|js)$/, /plugins\/jquery\.gwfselect\.(css|js)$/]

  36. terminal

        $ mkdir app/assets
        $ mv public/images/ app/assets/
        $ mv public/javascripts/ app/assets/
        $ mv public/stylesheets/ app/assets/

  37. the unicorn

  38. wicked fast

  39. kernel load-balancing

  40. can do rolling restarts

  41. signal for capacity

  42. sv-rails-run.erb

        #!/bin/bash
        exec 2>&1
        <% unicorn_command = @options[:unicorn_command] || 'unicorn_rails' -%>
        test -f /var/rails/.rvm/scripts/rvm || exit 1
        exec /usr/bin/sudo -u rails -i <<END
        export HOME=/var/rails
        source /var/rails/.rvm/scripts/rvm || exit 1
        cd <%= @options[:root] %> || exit 1
        exec bundle exec <%= unicorn_command %> -c config/unicorn.rb -E <%= @options[:environment] %>
        END

  43. unicorn.rb

        rails_env = ENV['RAILS_ENV'] || 'production'

        worker_processes (rails_env == 'production' ? 4 : 1)
        preload_app true

        # Restart any workers that haven't responded in 30 seconds
        timeout 30

        # Listen on a Unix data socket
        case rails_env
        when 'production', 'staging' # comma-separated so that staging matches too
          listen "/var/rails/application/tmp/sockets/#{rails_env}.sock", :backlog => 2048
        else
          listen "#{`pwd`.strip}/tmp/sockets/#{rails_env}.sock"
        end

  44. service configuration

  45. if a server fails ...

  46. does it make a sound?

  47. no, you get a phone call

  48. the usual suspect

  49. database.yml

        production:
          adapter: mysql2
          host: machine-name
          reconnect: true
          pool: 5
          database: appname_production
          username: secret
          password: sup3rs3cr3t
          encoding: utf8

  50. spot a problem?

  51. database.yml

        production:
          adapter: mysql2
          host: machine-name
          reconnect: true
          pool: 5
          database: appname_production
          username: secret
          password: sup3rs3cr3t
          encoding: utf8

  52. databases never fail!?

  53. how about an alias?

  54. /etc/bind/db.int

        ...
        mysql.yourcompany        IN CNAME machine-name
        mysql-slave.yourcompany  IN CNAME another-machine-name
        ...

  55. database.yml

        production:
          adapter: mysql2
          host: mysql.yourcompany.int
          reconnect: true
          pool: 5
          database: appname_production
          username: secret
          password: sup3rs3cr3t
          encoding: utf8

  56. no re-deploys for failure!

  57. simply update DNS

  58. do this for redis

  59. and maybe memcached

  60. or for any services

  61. offline processing

  62. who doesn’t use resque?

  63. rails recipe resque-aware

  64. Individual application

        ...
        'application' => {
          :root => '/var/rails/application/current',
          :environment => 'production',
          :queues => {
            'application_queue' => 4,
            'application_mailer' => 1,
            'application_checkin_expiration' => 1
          },
          :queue_intervals => {},
          :resque_log => '/var/rails/application/shared/log/resque.log'
        },
        ...

  65.

        node[:ruby][:sites].each_pair do |site, opts|
          runit_service site do
            owner 'rails'
            group 'rails'
            template_name 'rails'
            options opts
          end

          if opts[:resque_scheduler] == `hostname`.strip
            runit_service "resque-scheduler-#{site}" do
              owner 'rails'
              group 'rails'
              template_name 'resque-scheduler'
              options opts
            end
          end

          if opts[:queues]
            opts[:queues].each do |queue, workers|
              options_for_template = opts.dup
              options_for_template[:queue] = queue
              1.upto(workers) do |index|
                runit_service "resque-#{site}-#{queue}-worker-#{index}" do
                  owner 'rails'
                  group 'rails'
                  template_name 'resque'
                  options options_for_template
                end
              end
            end
          end
        end

  66. sv-resque-run.erb

        #!/bin/bash
        exec 2>&1
        test -f /var/rails/.rvm/scripts/rvm || exit 1
        exec /usr/bin/sudo -u rails -i <<END
        export HOME=/var/rails
        source /var/rails/.rvm/scripts/rvm || exit 1
        cd <%= @options[:root] %> || exit 1
        RAILS_ENV=<%= @options[:environment] %> QUEUES=<%= @options[:queue] %> INTERVAL=<%= @options[:queue_intervals][@options[:queue]] || '5' %> exec bundle exec rake environment resque:work >><%= @options[:resque_log] %> 2>&1
        END

  67. sexy capistrano

  68. sexy == DRY

  69. deploy.rb

  70. deploy.rb

        set :application, 'some-mlg-app'
        set :application_server, 'unicorn'

        require 'capistrano/agora/base'
        load 'deploy' if respond_to?(:namespace)

        require 'capistrano/agora/airbrake'
        require 'capistrano/agora/assets'
        require 'capistrano/agora/rvm'
        require 'capistrano/agora/hipchat'
        set :hipchat_room_name, 'Some MLG Application'
        require 'capistrano/agora/logging'
        require 'capistrano/agora/resque'
        require 'capistrano/agora/symlinks'
        require 'capistrano/agora/sv'
        require 'capistrano/agora/unicorn'

        set :resque_queues, {
          'some-mlg-app.retrieve_stuff' => 4,
          'some-mlg-app.update_stuff' => 1,
          'some-mlg-app.email_stuff' => 1
        }

        set :asset_directory, 'public/players'

        before 'deploy:restart', 'deploy:assets:precompile_with_skip'

  71. gem capistrano-agora

  72. common functionality

  73. Gemfile

        ...
        group :deploy do
          gem 'capistrano'
          gem 'capistrano-ext'
          gem 'capistrano-agora'
          gem 'hipchat'
        end
        ...

  74. sensible for > 1 app

  75. capistrano-agora/

        airbrake
        assets
        base
        helpers
        hipchat
        logging
        resque
        rvm
        sv
        symlinks
        unicorn
        version

  76. base.rb

        Capistrano::Configuration.instance.load do
          default_run_options[:pty] = true
          ssh_options[:forward_agent] = true

          set :scm, :git
          set :deploy_via, :remote_cache
          set :keep_releases, 7
          set :use_sudo, false

          set :branch, fetch(:branch, "master") unless exists?(:branch)
          set :gateway, "#{fetch(:user, `whoami`.strip)}@your.dmz.com" unless exists?(:gateway)
          set :repository, "[email protected]:agoragames/#{application}.git" unless exists?(:repository)
          set :deploy_to, "/var/rails/#{application}" unless exists?(:deploy_to)
          set :shared_nfs_dir, "/var/shared/rails/#{application}"
        end

  77. assets.rb

        Capistrano::Configuration.instance.load do
          set :asset_directory, 'public/assets'
          set :assets_dependencies, %w(app/assets vendor/assets Gemfile.lock config/routes.rb)

          namespace :deploy do
            namespace :assets do
              task :precompile_with_skip, :roles => :web, :except => { :no_release => true } do
                from = source.next_revision(current_revision)
                if capture("cd #{latest_release} && #{source.local.log(previous_revision, current_revision)} #{assets_dependencies.join(' ')} | wc -l").to_i > 0
                  run "cd #{fetch(:current_path)} && bundle exec rake assets:precompile RAILS_ENV=#{rails_env}"
                else
                  logger.info "Skipping asset pre-compilation because there were no asset changes. Copying assets from #{previous_release}."
                  run "cp -R #{previous_release}/#{asset_directory} #{latest_release}/#{asset_directory}"
                end
              end
            end
          end
        end

  78. resque.rb

        Capistrano::Configuration.instance.load do
          namespace :resque do
            desc <<-DESC
              Restart the Resque workers for an application after a deploy or deploy:migrations
            DESC
            task :restart_workers, :roles => :app, :except => { :no_release => true } do
              if exists?(:resque_queues)
                fetch(:resque_queues).each do |queue_name, worker_count|
                  1.upto(worker_count) do |worker_index|
                    run "sv restart resque-#{fetch(:application)}-#{queue_name}-worker-#{worker_index}"
                  end
                end
              else
                logger.info('You must define the :resque_queues variable for the resque:restart_workers task to work')
              end
            end
          end

          after "deploy", "resque:restart_workers"
          after "deploy:migrations", "resque:restart_workers"
        end

  79. unicorn.rb

        require 'capistrano/agora/helpers'

        Capistrano::Configuration.instance.load do
          namespace :unicorn do
            desc 'Increase number of unicorn workers'
            task :increase_workers, :roles => :app do
              num_workers = fetch(:num_workers, 1)
              unicorn_hosts = fetch(:unicorn_hosts, ['host1', 'host2'])
              unicorn_hosts.each do |host|
                worker_process_id = capture("cat /etc/sv/#{fetch(:application)}/supervise/pid", :hosts => host).chomp
                1.upto(num_workers.to_i) do
                  run("kill -TTIN #{worker_process_id}", :hosts => host)
                end
              end
            end

            desc 'Decrease number of unicorn workers'
            task :decrease_workers, :roles => :app do
              num_workers = fetch(:num_workers, 1)
              unicorn_hosts = fetch(:unicorn_hosts, ['host1', 'host2'])
              unicorn_hosts.each do |host|
                worker_process_id = capture("cat /etc/sv/#{fetch(:application)}/supervise/pid", :hosts => host).chomp
                1.upto(num_workers.to_i) do
                  run("kill -TTOU #{worker_process_id}", :hosts => host)
                end
              end
            end
          end
        end

  80. application monitoring

  81. it must be visual

  82. it must be historical

  83. it must be accessible

  84. we are using Munin

  85. spot problems

  86. spot opportunity

  87. (image-only slide, no transcribed text)

  88. infrastructure monitoring

  89. system engineering project

  90. they chose cucumber

  91. rackspace-validations

  92. runs every 5 minutes

  93. step definitions

  94. step_definitions/

        command_steps.rb
        dns_steps.rb
        file_steps.rb
        ping_steps.rb

  95. command_steps.rb

        def retry_times(times)
          begin
            yield
          rescue
            case times -= 1
            when 0
              raise
            else
              retry
            end
          end
        end

        When /^I go to "(https?:\/\/[^"]*)"$/ do |uri|
          begin
            @host = uri.host
            retry_times(3) { http(uri) }
          rescue Errno::ECONNREFUSED
            @output = 'connection refused'
            @status = 127
          rescue Timeout::Error
            @output = 'execution expired'
            @status = 127
          end
        end

  96. command_steps.rb

        Then /^the (?:output|response) should (?:include|contain) "([^"]*)"$/ do |string|
          string = eval("\"#{string}\"")
          assert @output.include?(string), "expected to find \"#{string}\" in the output from #{@host}, but did not"
        end

        Then /^the (?:output|response) should not (?:include|contain) "([^"]*)"$/ do |string|
          string = eval("\"#{string}\"")
          assert !@output.include?(string), "expected not to find \"#{string}\" in the output from #{@host}, but did"
        end

        Then /^the (?:output|response) should (?:include|contain) "([^"]*X[^"]*)", where X is less than (\d+)$/ do |string, value|
          string = eval("\"#{string}\"")
          regex = Regexp.new(string.sub('X', '(\d+)'))
          assert @output =~ regex
          x = $1.to_i
          assert x < value, "expected \"#{x}\" to be less than \"#{value}\" in the output from #{@host}"
        end

  97. dns_steps.rb

        When /^I do a DNS lookup for ([\w\-\.]+)$/ do |name|
          @host = name
          begin
            @alias = Socket.gethostbyname(name).first
          rescue SocketError
            @alias = nil
          end
        end

        Then /^it should point (?:at|to) ([\w\-\.]+)$/ do |name|
          # The message names the expected target (name), not the alias that was actually found
          assert_equal name, @alias, "expected #{@host} to CNAME to #{name}, but it didn't"
        end

  98. file_steps.rb

        Before do
          @stats = []
        end

        When /^I stat (\S+)$/ do |glob|
          Dir[glob].each do |path|
            @stats << File.stat(path)
          end
        end

        Then /^there should be at least (\d+) files?$/ do |count|
          assert @stats.length >= count
        end

        Then /^the most recently modified file should be less than (\d+) (\w+)s? old$/ do |count, unit|
          assert @stats.collect { |stat| stat.mtime }.max > time_ago(count, unit)
        end

  99. ping_steps.rb

        When /^I ping ([\w\-\.]+)$/ do |host|
          @host = host
          body = nil
          IO.popen(['/bin/ping', '-c', '1', '-n', host]) { |io| body = io.read }
          if $?.to_i == 0
            body =~ /^rtt min\/avg\/max\/mdev = (\d\.\d{3}+)\/(\d\.\d{3}+)\/(\d\.\d{3}+)\/(\d\.\d{3}+) ms$/
            @status = true
            @response = $2.to_f
          else
            @status = false
            @response = 0.0
          end
        end

        Then /^I receive a response$/ do
          assert @status, "did not receive a response from #{@host}"
        end

        Then /^I receive a response within (\d+)(?: ?ms| milliseconds)$/ do |ms|
          assert @status, "did not receive a response from #{@host}"
          assert @response < ms, "response time from #{@host} of #{@response} was slower than #{ms} milliseconds"
        end

  100. <component>.feature

  101. features/

        applications.feature
        haproxy.feature
        memcache.feature
        mongodb.feature
        mysql.feature
        nginx.feature
        redis.feature
        varnish.feature

  102. applications.feature

  103.

        Feature: Applications

          @critical
          Scenario: Ensure the company site is available
            When I go to "http://www.yourcompany.com/"
            Then the response should either be a 200 or a 302 to "http://www.yourcompany.com/whateva"

  104. mongodb.feature

  105.

        Feature: MongoDB
          In order to support our environment
          And avoid lost data resulting from system failure or incompetence
          As a responsible system administrator
          I want to ensure that our MongoDB databases are running as expected
          And that we have good slaves of the masters (well, secondaries to our primaries...)
          And that our snapshotted backups are up to date

          @critical
          Scenario Outline: Ensure MongoD is running
            When I SSH to <host> and run `pgrep mongod`
            Then it should exit successfully

            Examples:
              | host                |
              | mongo-primary.int   |
              | mongo-secondary.int |
              | mongo-secondary.int |

  106. redis.feature

  107.

        Feature: Redis

          @critical
          Scenario Outline: Ensure a Redis host is up and running the expected version
            When I open a socket to <host>:<port> and send "INFO\r\n"
            Then the output should include "redis_version:<version>"

            Examples:
              | host              | port | version |
              | redis.int         | 6379 | 2.4.2   |
              | redis-staging.int | 6379 | 2.4.2   |

  108. continuous integration

  109. we use Jenkins

  110. builds all internal projects

  111. builds all internal gems

  112. internal Gem In A Box-secure

  113. Gemfile

        source 'http://rubygems.org'
        source 'https://username:[email protected]'
        ...

  114. :git => into your organization

  115. shields you from infosuicide

  116. build failure?

  117. notification via e-mail

  118. notification via HipChat

  119. notify (allthethings)

  120. continuous deployment

  121. GitHub flow

  122. master is deployable

  123. features in branches

  124. post-build script trigger

  125. PostBuildScript

        #!/bin/bash
        if [ "$GIT_BRANCH" == "origin/HEAD" ] || [ "$GIT_BRANCH" == 'master' ]; then
          curl "http://continuous-integration.com/job/project-deploy/build"
        else
          echo "$GIT_BRANCH, not deploying"
        fi

  126. execute only if build succeeds: check

  127. Jenkins project-deploy

  128. Build

        #!/bin/bash
        source "$HOME/.rvm/scripts/rvm"
        [[ -s ".rvmrc" ]] && source .rvmrc
        bundle install && bundle exec cap production deploy:migrations -S user=deploy -S branch=$GIT_BRANCH

  129. recap (allthethings)

  130. simplify

  131. upgrade (small to big)

  132. alias

  133. DRY

  134. monitor

  135. validate

  136. integrate

  137. stack smashing http://speakerdeck.com/u/czarneckid/ david czarnecki twitter @czarneckid github/czarneckid github/agoragames
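
Note on slide 43 (unicorn.rb): a `case` branch that should match more than one environment needs a comma-separated `when` list. Writing `when 'production' || 'staging'` evaluates the `||` first, so the branch only ever compares against 'production'. The sketch below illustrates the difference; `socket_path` and its `app_root` default are hypothetical names used for illustration, not part of the deck.

```ruby
# Hypothetical helper (not from the deck): picks the Unicorn socket path
# for a given environment name. app_root is an illustrative default.
def socket_path(rails_env, app_root = '/var/rails/application')
  case rails_env
  when 'production', 'staging' # comma list: either value matches
    "#{app_root}/tmp/sockets/#{rails_env}.sock"
  else
    "tmp/sockets/#{rails_env}.sock"
  end
end

# The `||` form collapses to its first operand before `case` compares anything.
collapsed = ('production' || 'staging')

puts socket_path('staging')
puts socket_path('development')
puts collapsed
```

With the comma form, a staging host gets the same shared socket location as production instead of falling through to the development branch.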
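
Note on slides 65 and 78 (Chef recipe and resque.rb): the recipe creates one runit service per worker, and the Capistrano task restarts services by rebuilding the same names, so both sides have to agree on the naming pattern and on the per-queue worker counts. A minimal sketch of that expansion, assuming the site key in Chef equals the Capistrano `:application` value; `resque_service_names` is a hypothetical helper, not part of either file.

```ruby
# Hypothetical helper: expands a queue => worker-count hash into the runit
# service names used on slides 65 and 78 (resque-<site>-<queue>-worker-<n>).
def resque_service_names(site, queues)
  queues.flat_map do |queue, workers|
    (1..workers).map { |index| "resque-#{site}-#{queue}-worker-#{index}" }
  end
end

# Queue counts borrowed from the slide 64 example hash.
queues = { 'application_queue' => 4, 'application_mailer' => 1 }
names = resque_service_names('application', queues)
puts names
```

Generating both the Chef services and the restart list from one shared hash would be one way to keep the two from drifting apart.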
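
Note on slide 95 (retry_times): the helper re-runs its block until it succeeds or the attempt budget is used up, then re-raises the last error. The standalone sketch below shows the same idea outside Cucumber. It rescues `StandardError` explicitly and uses a hypothetical flaky block in place of the deck's `http(uri)` call.

```ruby
# Runs the block up to `attempts` times; re-raises the last error once
# the budget is exhausted.
def retry_times(attempts)
  yield
rescue StandardError
  attempts -= 1
  raise if attempts <= 0
  retry
end

# Illustrative flaky operation: fails twice, then succeeds.
calls = 0
result = retry_times(3) do
  calls += 1
  raise IOError, 'flaky' if calls < 3
  :ok
end

puts "calls=#{calls} result=#{result}"
```

In a monitoring suite this keeps a single dropped connection from paging someone, while a host that is really down still fails the scenario.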
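
Note on slide 99 (ping_steps.rb): the pattern captures each round-trip figure with `(\d\.\d{3}+)`, which accepts only one digit before the decimal point, so a summary line reporting 10 ms or more would not match and `$2.to_f` would yield 0.0. A more tolerant parser might look like the sketch below; `RTT_LINE`, `average_rtt_ms` and the sample output are illustrative assumptions, not taken from the deck.

```ruby
# Matches the ping summary line and captures min/avg/max/mdev,
# allowing any number of digits on either side of the decimal point.
RTT_LINE = %r{^rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms$}

# Returns the average round-trip time in milliseconds, or nil when the
# summary line is missing.
def average_rtt_ms(ping_output)
  match = RTT_LINE.match(ping_output)
  match && match[2].to_f
end

# Made-up sample output for illustration only.
sample = "1 packets transmitted, 1 received, 0% packet loss, time 0ms\n" \
         "rtt min/avg/max/mdev = 10.123/12.345/15.000/1.234 ms\n"

puts average_rtt_ms(sample).inspect
puts average_rtt_ms('no summary here').inspect
```

Returning nil for a missing summary lets the step fail explicitly instead of treating an unparsed response as 0.0 ms.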