Legion intro

Nathan Hopkins
August 25, 2013

Ruby concurrency made easy

Transcript

  1. THE PROJECT
    • Rails project
    • Data import
    • Lots of data
    • Data manipulation
  2. REQUIREMENTS
    • Chicago crime data - 1 GB CSV
    • Save CSV records to database
    • Convert longitude/latitude to static image of map
    • Various sizes: thumbnail, medium, large
    • Rails + Carrierwave
  3. class Importer
       def import_csv_row(row)
         record = ChicagoCrime.new(
           row.headers.reduce({}) do |memo, name|
             column = name.downcase.gsub(/\s/, "_").to_sym
             column = :external_id if column == :id
             memo[column] = row[name]
             memo
           end
         )
         #sleep 1
         record.static_map = File.open("/Users/nathan/wo
         record.save!
       end
     end
  4. desc "Import data"
     task :import => :environment do
       require "csv"
       require_relative "../importer"
       importer = Importer.new
       count = 0
       path = File.expand_path("../../../data/Crimes_-_2
       CSV.foreach(path, :headers => true) do |row|
         importer.import_csv_row row
         print "#{count += 1},"
       end
     end
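
The header handling in the importer above can be tried without Rails. The following is a minimal sketch, assuming only Ruby's standard `csv` library; `normalize_headers` is a hypothetical helper name and the sample data is made up, neither comes from the slides.

```ruby
require "csv"

# Hypothetical helper (not from the slides): maps CSV headers to attribute
# names the same way the importer's reduce block does.
def normalize_headers(row)
  row.headers.each_with_object({}) do |name, attrs|
    # "Case Number" becomes :case_number
    column = name.downcase.gsub(/\s/, "_").to_sym
    # The slides rename the CSV's id column to external_id.
    column = :external_id if column == :id
    attrs[column] = row[name]
  end
end

# Made-up sample data standing in for the real 1 GB file.
sample = <<~CSV
  ID,Case Number,Primary Type
  1,HX100,THEFT
CSV

rows = CSV.parse(sample, headers: true).map { |row| normalize_headers(row) }
p rows
```

Each resulting hash could then be passed to a model constructor, as the slide does with `ChicagoCrime.new`.
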
  5. CONCURRENCY
    • Threads with JRuby or Rubinius?
    • Celluloid with JRuby or Rubinius?
    • with MRI?
    • Background Jobs?
    • fork?
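
The last option in the list above, fork, can be sketched in plain Ruby. This is an illustration of the general technique only, not Legion's implementation; `parallel_map` is a hypothetical helper, and the sketch assumes a platform that supports `fork` (so not Windows) and results small enough to marshal through a pipe.

```ruby
# Hypothetical helper: runs the block for each item in its own forked
# child process and collects the results in the parent, in order.
def parallel_map(items)
  jobs = items.map do |item|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      # Serialize the child's result and send it back through the pipe.
      writer.write(Marshal.dump(yield(item)))
      writer.close
      exit!(0) # leave the child without running the parent's at_exit hooks
    end
    writer.close # the parent only reads from this pipe
    [pid, reader]
  end

  jobs.map do |pid, reader|
    result = Marshal.load(reader.read) # read blocks until the child closes its end
    reader.close
    Process.wait(pid) # reap the child
    result
  end
end

squares = parallel_map([1, 2, 3]) { |n| n * n }
p squares
```

All children are started before any result is read, so the work runs concurrently on MRI. Forking one process per item would be wasteful for a large import; a fixed pool of workers, like the `:processes => 8` setting later in the deck, is the more practical shape.
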
  6. class LegionImporter < Legion::Object
       before :import_csv_row do
         ActiveRecord::Base.establish_connection ActiveRe
       end

       def import_csv_row(row)
         record = ChicagoCrime.new(
           row.headers.reduce({}) do |memo, name|
             column = name.downcase.gsub(/\s/, "_").to_sym
             column = :external_id if column == :id
             memo[column] = row[name]
             memo
           end
         )
         # sleep 1
         record.static_map = File.open("/Users/nathan/wor
         record.save!
       end
     end
  7. (Same code as slide 6.)
  8. desc "Import data with legion"
     task :import_with_legion => :environment do
       require "csv"
       require_relative "../legion_importer"
       supervisor = Legion::Supervisor.new(
         LegionImporter,
         :processes => 8,
         :port => 42042
       )
       supervisor.start
       count = 0
       path = File.expand_path("../../../data/Crimes_-_20
       CSV.foreach(path, :headers => true) do |row|
         supervisor.import_csv_row row
         print "#{count += 1},"
       end
       supervisor.stop
     end
  9. (Same code as slide 8.)
  10. GOTCHAS
    • Legion::Object subclasses must define async methods directly
    • Remember you’re dealing with a forked process
    • Rails must reconnect to the database with a :before callback
    • Relative file paths don’t work
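
Two of the gotchas above can be reproduced with plain `fork`, without Legion or Rails. This is a minimal sketch under stated assumptions: the file name is hypothetical, and the `Dir.chdir("/")` call only simulates a worker whose working directory differs from the parent's, which is one plausible reason relative paths fail. The slides do not say why they fail in Legion.

```ruby
# Gotcha: a forked child works on its own copy of the parent's memory.
counter = 0
pid = fork do
  counter += 1 # changes only the child's copy
  exit!(0)
end
Process.wait(pid)
parent_counter = counter # unchanged in the parent

# Gotcha: a relative path is resolved against the current working
# directory, which a worker process may not share (assumption).
relative_path = "data/crimes.csv" # hypothetical file name
absolute_path = File.expand_path(relative_path) # resolved once, in the parent

reader, writer = IO.pipe
pid = fork do
  reader.close
  Dir.chdir("/") # simulate a worker running from another directory
  writer.write(File.expand_path(relative_path))
  writer.close
  exit!(0)
end
writer.close
path_seen_by_child = reader.read
reader.close
Process.wait(pid)

puts "parent counter:     #{parent_counter}"
puts "resolved in parent: #{absolute_path}"
puts "resolved in child:  #{path_seen_by_child}"
```

Resolving paths in the parent and passing the absolute form to workers avoids depending on each worker's working directory. The same copy-on-fork behaviour is why each worker needs to open its own database connection, which the deck handles with the `before` callback.
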