High Performance APIs in Ruby using ActiveRecord and Goliath+Synchrony+EventMachine

We had a real-time API in Rails that needed much lower latency and massive throughput. We wanted to preserve our investment in business logic inside ActiveRecord models while scaling up to 1000X throughput and cutting latency in half. Conventional wisdom would say this was impossible in Ruby, but we succeeded and actually surpassed our expectations. We'll discuss how we did it, using EventMachine for the reactor pattern, Synchrony to avoid callback hell and to make testing easy, Goliath as the non-blocking web server, and sharding across many cooperative processes.
Presented by Dan Kozlowski @rubious_dan and Colin Kelley @colindkelley
Demo repo: https://github.com/Invoca/railsconf2015

Colin Kelley

April 22, 2015
Transcript

  1. The RingPool API

      Request:
      {
        "Search" : "toyoto santa barbara",
        "VIN" : "SJKHSDJFHKJHSDFK",
        "Model" : "Toyota Camry"
      }

      Response:
      { "phone_number" : "(800) 555-1234" }
  2. The RingPool API

      • RingPools contain 2-500 phone numbers.
      • Phone numbers are allocated (locked in) for a period of time.
      • No phone numbers available? Every RingPool has an overflow number.
      • Overflow numbers provide no attribution data.
  3. Existing API vs. client requirements (PN = phone numbers)

                           Requests per second   PN per request   Response Time (90%)
      Existing API         5                     1                ~350ms
      Client Requirements  >=1000                >=40             <200ms
  4. The Requirements

      • Predictable low latency
      • Elastically scalable
      • Can't swamp FEs (asynchronous, limited, queued)
      • Extra credit if the solution is in Ruby
  5. The GIL

      Global Interpreter Lock: only one thread per process can run Ruby code at a time, no matter how many CPUs are available.
      Implemented in MRI Ruby because:
      ✅ Simplifies the interpreter's design.
      ✅ Prevents concurrent mutation of shared data.
      ❌ Prevents true parallelism.
  6. [Chart: utilization (0% to 100%) of CPU 1 through CPU 4] No GIL: Sharing cores is caring!
  7. [Chart: utilization (0% to 100%) of CPU 1 through CPU 4] GIL processes are greedy.
  8. Reactor Pattern

      TL;DR: Don't block the reactor.
      Actions that would block are performed later by asynchronous callback. Examples: HTTP GET, MySQL INSERT, etc.
      Not a new idea: libraries are available in JavaScript, Python, Ruby…
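To make "don't block the reactor" concrete, here is a minimal sketch of a single-threaded reactor in plain Ruby. `TinyReactor` is a hypothetical name used only for illustration; it is not EventMachine and only models timers, but it shows the shape: callers register a callback and return at once, and the loop fires each callback when its work is due.

```ruby
# A toy single-threaded reactor (hypothetical; not EventMachine).
class TinyReactor
  def initialize
    @timers = []   # each entry: [due_time, sequence, callback]
    @sequence = 0
  end

  # Register work to be done later instead of blocking the caller.
  def add_timer(seconds, &callback)
    @timers << [now + seconds, @sequence += 1, callback]
  end

  # Run until no timers remain, always firing the one that is due first.
  def run
    until @timers.empty?
      entry = @timers.min_by { |due, seq, _callback| [due, seq] }
      @timers.delete(entry)
      due, _seq, callback = entry
      wait = due - now
      sleep(wait) if wait > 0
      callback.call
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

events = []
reactor = TinyReactor.new
reactor.add_timer(0.02) { events << :slow_io_done }
reactor.add_timer(0.01) { events << :fast_io_done }
events << :scheduled   # both calls returned immediately
reactor.run
```

After `reactor.run` returns, `events` holds `:scheduled` first, then the two completions in the order they became due rather than the order they were registered.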
  9. EM::Synchrony

      No callbacks. When code would block the reactor:
      1. State is stored in the Fiber's stack
      2. The reactor proceeds to other work
      3. When unblocked, the Fiber is resumed
      Linear code!
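The three steps above can be sketched with Ruby's built-in `Fiber`, without EventMachine. Everything here is an assumption made for illustration: `PENDING` stands in for the reactor's queue, `async_fetch` is a hypothetical callback-style API, and `sync_fetch` is the kind of wrapper a Synchrony-style library would provide.

```ruby
require "fiber" # needed for Fiber.current on older Rubies

PENDING = [] # callbacks the pretend reactor will fire later

# Hypothetical callback-style API: stores the block and calls it later.
def async_fetch(key, &callback)
  PENDING << -> { callback.call("value-for-#{key}") }
end

# Synchrony-style wrapper: pause the current Fiber until the callback fires.
def sync_fetch(key)
  fiber = Fiber.current
  async_fetch(key) { |value| fiber.resume(value) }
  Fiber.yield # hand control back; execution resumes here with the value
end

log = []
worker = Fiber.new do
  a = sync_fetch(:a) # reads like blocking code
  b = sync_fetch(:b)
  log << [a, b]
end

worker.resume                           # runs until the first Fiber.yield
log << :reactor_free                    # the "reactor" can do other work here
PENDING.shift.call until PENDING.empty? # fire callbacks, resuming the fiber
```

The worker's body has no callbacks, yet the main flow regained control between the two fetches, which is why `:reactor_free` is logged before the fetched values.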
  10. Dependency Hell

      class User < ActiveRecord::Base
        belongs_to :account # this is OK
        validates :email_address, length: { maximum: EmailAddress::NAME_LEN }
        ...
      end

      class EmailAddress
        def self.create_new(name_addr, default_domain = Network::DEFAULT_DOMAIN)
          ...
        end
      end

      class Network
        ...
  11. ActiveRecord CPU

      If you want maximum density in your API server, beware of ActiveRecord CPU hogs:
      • ActiveRecord object loads
      • Extensive validations and other callbacks
  12. ExceptionalSynchrony::CallbackExceptions

      EventMachine callback + errback is confusing and not DRY.

      four_square.callback do
        data = JSON.parse(four_square.response)['response']['groups'].first
        locations =
          data['items'].map do |item|
            {type: 'foursquare',
             lat: item['location']['lat'],
             lng: item['location']['lng']}
          end
        connection.close
      end

      four_square.errback do
        puts "Foursquare Query Failed"
        connection.close
      end

  14. ExceptionalSynchrony::CallbackExceptions

      Synchrony unifies the return path:

      def foursquare_query(lat, long)
        request = EM::HttpRequest.new('https://api.foursquare.com/v2/venues/search')
        http_response = request.get(query: four_square_query)

        if http_response.status == 0
          # handle timeout exception
        else
          data = JSON.parse(http_response.response)['response']['groups'].first
          result = data['items'].map do |item|
            {
              type: 'foursquare',
              lat: item['location']['lat'],
              lng: item['location']['lng']
            }
          end
        end
        connection.close
        result
      end
  15. ExceptionalSynchrony::CallbackExceptions

      module ExceptionalSynchrony
        module CallbackExceptions
          class << self
            # Complete the deferrable with the block's value, or with the
            # exception the block raised.
            def ensure_callback(deferrable, &block)
              result = return_exception(&block)
              deferrable.succeed(*Array(result))
            end

            private

            def return_exception
              begin
                yield
              rescue Exception => ex
                ex
              end
            end
          end
        end
      end

  16. ExceptionalSynchrony::CallbackExceptions

      CallbackExceptions.map_deferred_result raises the exceptions:

      def foursquare_query(lat, long)
        request = EM::HttpRequest.new('https://api.foursquare.com/v2/venues/search')
        http_response = ExceptionalSynchrony::CallbackExceptions.map_deferred_result(
          request.get(:query => four_sq_query))
        data = JSON.parse(http_response.response)['response']['groups'].first
        data['items'].map do |item|
          {
            type: 'foursquare',
            lat: item['location']['lat'],
            lng: item['location']['lng']
          }
        end
      ensure
        connection.close
      end
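The idea behind slides 15 and 16 is that a failure travels the same path as a success (as a value) and is turned back into a raise on the receiving side. The sketch below shows only that idea in plain Ruby. `value_or_exception` and `raise_if_exception` are hypothetical helper names, not the exceptional_synchrony API, and the sketch rescues `StandardError` where the slide rescues `Exception`.

```ruby
# Run the block; hand back either its value or the exception it raised.
def value_or_exception
  yield
rescue StandardError => ex
  ex
end

# Hypothetical counterpart: turn a returned exception back into a raise,
# so the caller can use ordinary rescue/ensure.
def raise_if_exception(result)
  raise result if result.is_a?(Exception)
  result
end

ok  = value_or_exception { Integer("42") }
bad = value_or_exception { Integer("forty-two") }

outcome =
  begin
    raise_if_exception(bad)
  rescue ArgumentError => ex
    "rescued #{ex.class}"
  end
```

With a single return path, cleanup such as `connection.close` can sit in one `ensure` clause instead of being repeated in both a callback and an errback.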
  17. ExceptionalSynchrony::EventMachineProxy

      If an exception gets loose when EventMachine or a Synchrony fiber calls into your code, your process will … exit!
  18. ExceptionalSynchrony::EventMachineProxy

      module ExceptionalSynchrony
        class EventMachineProxy
          def add_timer(seconds)
            EM.add_timer(seconds) do
              begin
                yield
              rescue Exception => ex
                # Log instead of letting the exception escape into the reactor.
                @logger.log_error("add_timer rescued #{ex.class}: #{ex}")
              end
            end
          end
          ...
        end
      end

  19. ExceptionalSynchrony::ParallelSync

      def process_bulk_request(bulk_request)
        local_requests, remote_requests = parse_bulk_request_by_shard(bulk_request)

        responses =
          ExceptionalSynchrony::ParallelSync.parallel do |parallel|
            parallel.add { @local_manager.bulk_request(local_requests) }
            remote_requests.map do |domain, remote_request|
              parallel.add { remote_manager_for(domain).bulk_request(remote_request) }
            end
          end # <--- parallel results rendezvous here

        BulkResponse.new(responses)
      end
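The `parallel` / `add` / rendezvous shape on slide 19 can be imitated in plain Ruby to see what the caller gets back. This is a sketch under a stated assumption: it uses threads as a stand-in, whereas the talk's library runs on EventMachine and Synchrony fibers. `ParallelSketch` is a hypothetical name, not the library's class.

```ruby
# Hypothetical stand-in for the parallel/add/rendezvous shape, using threads.
class ParallelSketch
  # Yield a collector, let the caller add jobs, then wait for all of them.
  def self.parallel
    collector = new
    yield collector
    collector.wait_all
  end

  def initialize
    @threads = []
  end

  # Start a job right away; its result is collected later.
  def add(&job)
    @threads << Thread.new(&job)
  end

  # Rendezvous: block until every job finishes. Results keep the order in
  # which the jobs were added, not the order in which they completed.
  def wait_all
    @threads.map(&:value)
  end
end

responses = ParallelSketch.parallel do |parallel|
  parallel.add { sleep 0.05; :local }
  parallel.add { sleep 0.01; :remote_a }
  parallel.add { sleep 0.03; :remote_b }
end
```

The jobs overlap, so the total wait is close to the slowest job rather than the sum of all three, and the caller still receives one ordered list of responses.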

  20. Immutable Value Classes

      class BulkResponse
        attr_reader :responses

        def initialize(responses, format = :json)
          [:json, :xml].include?(format) or raise "Unknown format #{format.inspect}"
          @responses = responses
          @format = format
        end

        def to_json
          {responses: responses}.to_json
        end

        class << self
          def from_json(json, requests)
            hash = Util.parse_json(json)
            responses = hash[:responses] or raise "JSON for (#{json}) contains no responses!"
            new(responses.map { |response| Response.from_hash(response) } )
          end
        end
      end
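The slide's `BulkResponse` gets its immutability from exposing readers only. A stricter variant, sketched below, also freezes the object and compares by value. `PhoneNumberResponse`, its fields, and the `with` helper are hypothetical and not taken from the talk's repository.

```ruby
# Hypothetical value class: frozen after construction, compared by value.
class PhoneNumberResponse
  attr_reader :phone_number, :overflow

  def initialize(phone_number, overflow: false)
    @phone_number = phone_number.dup.freeze
    @overflow = overflow
    freeze # instance variables cannot be reassigned after this point
  end

  def to_h
    { phone_number: phone_number, overflow: overflow }
  end

  # Two responses with the same contents are interchangeable.
  def ==(other)
    other.class == self.class && other.to_h == to_h
  end
  alias eql? ==

  def hash
    to_h.hash
  end

  # "Changing" a value means building a new object.
  def with(changes)
    merged = to_h.merge(changes)
    self.class.new(merged[:phone_number], overflow: merged[:overflow])
  end
end

a = PhoneNumberResponse.new("(800) 555-1234")
b = PhoneNumberResponse.new("(800) 555-1234")
c = a.with(overflow: true)
```

Because such objects never change after construction, they can be shared between fibers or cached without defensive copying.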
  24. Immutable Value Classes

      class BulkResponse
        attr_reader :responses

        def initialize(responses, format = :json)
          [:json, :xml].include?(format) or raise "Unknown format #{format.inspect}"
          @responses = responses
          @format = format
        end

        def to_json
          {responses: responses}.to_json
        end

        class << self
          def from_json(json)
            hash = Util.parse_json(json)
            responses = hash[:responses] or raise "JSON for (#{json}) contains no responses!"
            new(responses.map { |response| Response.from_hash(response) } )
          end
        end
      end
  26. Sandi Metz's "Rules"

      1. Classes should be no longer than 100 lines of code.
      2. Methods should be no longer than 5 lines of code.
      3. Methods should have no more than 4 parameters. Options hashes count.
      4. …
  27. Singletons

      require "yaml"

      class Secrets
        def initialize
          @hash = YAML.load_file("./config/secrets.yml")
        end

        def [](key)
          @hash[key]
        end

        ...
        class << self
          def instance
            @instance ||= new
          end
        end
      end

  28. Singletons

      TL;DR: they are evil!
      • Can't control their lifetime
      • Can't pass constructor parameters
      • Can't reuse them
      • Can't easily test them
      • Singletons beget more singletons!
  29. Modified Singleton

      class Secrets
        def initialize(secrets_path)
          @hash = YAML.load_file(secrets_path)
        end

        def [](key)
          @hash[key]
        end

        def set_instance
          self.class.instance = self
        end

        ...
        class << self
          attr_accessor :instance
        end
      end
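A short usage sketch shows how the modified singleton answers the objections raised against plain singletons. The class body repeats the slide with the elided `...` parts left out. The temporary file and the `api_key` entry are made up for the example.

```ruby
require "yaml"
require "tempfile"

# Same shape as the slide's Secrets class; the "..." parts are omitted.
class Secrets
  def initialize(secrets_path)
    @hash = YAML.load_file(secrets_path)
  end

  def [](key)
    @hash[key]
  end

  def set_instance
    self.class.instance = self
  end

  class << self
    attr_accessor :instance
  end
end

# In a test: build an instance from a throwaway file (hypothetical key name).
file = Tempfile.new(["secrets", ".yml"])
file.write({ "api_key" => "test-key" }.to_yaml)
file.close

test_secrets = Secrets.new(file.path) # constructor parameter: any path we like
direct = test_secrets["api_key"]      # usable without touching global state

test_secrets.set_instance             # opt in to being the shared instance
shared = Secrets.instance["api_key"]

Secrets.instance = nil                # the lifetime is under our control
file.unlink
```

Code that still wants a global can call `Secrets.instance`, while tests and new code can build and pass around their own `Secrets` objects.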
  32. Success?

                          Median   90%     Requests / Sec
      3 JMeter instances  93ms     124ms   1411
      4 JMeter instances  102ms    144ms   1719
      5 JMeter instances  111ms    160ms   2021
  33. Tools

      • Minitest, FactoryGirl, SimpleCov
      • Entire test suite runs in < 10s, 99.96% coverage
      • Apache JMeter for load testing
  34. Shout-outs

      Special thanks to:
      • Maintainers of EM and EM::Synchrony
      • IRC: #ruby, #rubylang, #rubyonrails (Freenode)
      • GoogleLabs, Ilya Grigorik
      • Sandi Metz
  35. Fin.

      @rubious_dan @ColinDKelley
      Invoca, Inc. Santa Barbara, CA and now Boulder, CO
      GitHub repos: invoca/railsconf2015, invoca/exceptional_synchrony