High Performance APIs in Ruby using ActiveRecord and Goliath+Synchrony+EventMachine

We had a real-time API in Rails that needed much lower latency and massive throughput. We wanted to preserve our investment in business logic inside ActiveRecord models while scaling up to 1000X throughput and cutting latency in half. Conventional wisdom would say this was impossible in Ruby, but we succeeded and actually surpassed our expectations. We'll discuss how we did it, using EventMachine for the reactor pattern, Synchrony to avoid callback hell and to make testing easy, Goliath as the non-blocking web server, and sharding across many cooperative processes.
Presented by Dan Kozlowski @rubious_dan and Colin Kelley @colindkelley
Demo repo: https://github.com/Invoca/railsconf2015

Colin Kelley

April 22, 2015

Transcript

  1. High performance APIs in Ruby using ActiveRecord and Goliath
      Dan Kozlowski (@rubious_dan) and Colin Kelley (@ColinDKelley), Invoca
  2. Preface: The RingPool API

  3. The RingPool API
      Request:  { "Search": "toyoto santa barbara", "VIN": "SJKHSDJFHKJHSDFK", "Model": "Toyota Camry" }
      Response: { "phone_number": "(800) 555-1234" }
  4. The RingPool API
      • RingPools contain 2-500 phone numbers.
      • Phone numbers are allocated (locked in) for a period of time.
      • No phone numbers available? Every RingPool has an overflow number.
      • Overflow numbers provide no attribution data.
  5. Chapter 1: Growing

  6.                       Requests per second   PN per request   Response time (90%)
      Existing API         5                     1                ~350ms
      Client requirements  >=1000                >=40             <200ms
  7. 8000x: the needed increase over current throughput (1000 requests/s x 40 PN per request, versus 5 requests/s x 1 PN per request)

  8. The Requirements
      • Predictable low latency
      • Elastically scalable
      • Can’t swamp FEs (asynchronous, limited, queued)
      • Extra credit if the solution is in Ruby
  9. Chapter 2: The Stack

  10. The GIL
      Global Interpreter Lock: only one thread can run Ruby code per process at a time.
      Implemented in MRI Ruby because:
      ✅ Simplifies the interpreter’s design.
      ✅ Prevents concurrent mutation of shared data.
      Downside: prevents true parallelism.
  11. [Chart: utilization from 0% to 100% for CPU 1 through CPU 4] No GIL: Sharing cores is caring!
  12. [Chart: utilization from 0% to 100% for CPU 1 through CPU 4] GIL processes are greedy.
  13. [Chart: utilization from 0% to 100% across CPUs] Multiple Ruby processes sharing the load
  14. Reactor Pattern
      TL;DR: Don’t block the reactor.
      Actions that would block are performed later by asynchronous callback.
      Examples: HTTP GET, MySQL INSERT, etc.
      Not a new idea: libraries are available in JavaScript, Python, Ruby…
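A minimal sketch of the reactor idea, in plain Ruby with no EventMachine dependency. `CallbackReactor` and `async_get` are hypothetical names invented for illustration; a real reactor also watches sockets and timers, which this toy does not. It only shows that a "request" returns at once and its result arrives later through a callback, so two requests interleave on one thread.

```ruby
# Toy reactor: a queue of callbacks drained by a single loop (illustrative only).
class CallbackReactor
  def initialize
    @queue = []
  end

  # Schedule a callback to run on a later pass of the loop.
  def next_tick(&callback)
    @queue << callback
  end

  # Run queued callbacks until none remain; callbacks may enqueue more.
  def run
    @queue.shift.call until @queue.empty?
  end
end

# Hypothetical non-blocking GET: returns immediately, delivers the body later.
def async_get(reactor, url, &on_done)
  reactor.next_tick { on_done.call("body of #{url}") }
end

reactor = CallbackReactor.new
events = []

["/a", "/b"].each_with_index do |url, index|
  reactor.next_tick do
    events << "request #{index + 1} started"
    async_get(reactor, url) { |body| events << "request #{index + 1} got #{body}" }
  end
end

reactor.run
```

After `reactor.run`, `events` shows both requests starting before either one receives its body, which is the interleaving the slide describes.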
  17. And the callback goes to…

  18. EM::Synchrony
      No callbacks. When code would block the reactor:
      1. State is stored in the Fiber’s stack
      2. The reactor proceeds to other work
      3. When unblocked, the Fiber is resumed
      Linear code!
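The three steps above can be sketched with Ruby's built-in Fiber class. This is not EM::Synchrony's implementation; `FiberReactor`, `Request`, and `fake_get` are hypothetical names, and the "response" is fabricated by the toy loop rather than read from a socket. The point is only that handler code reads top to bottom while the fiber is suspended and resumed underneath it.

```ruby
# A pending "I/O" request that a fiber hands to the reactor when it suspends.
Request = Struct.new(:url)

class FiberReactor
  def initialize
    @waiting = [] # pairs of [fiber, request] awaiting a simulated response
  end

  # Start a handler in its own Fiber and run it until it first suspends.
  def start(&handler)
    step(Fiber.new(&handler), nil)
  end

  # Complete pending requests one at a time, resuming the fiber that asked.
  def run
    until @waiting.empty?
      fiber, request = @waiting.shift
      step(fiber, "response for #{request.url}")
    end
  end

  private

  # Resume the fiber; if it suspends on another Request, park it again.
  def step(fiber, value)
    yielded = fiber.resume(value)
    @waiting << [fiber, yielded] if yielded.is_a?(Request)
  end
end

# Looks like a blocking call, but suspends the current fiber instead.
def fake_get(url)
  Fiber.yield(Request.new(url))
end

log = []
reactor = FiberReactor.new
reactor.start do
  log << "A: #{fake_get('/a1')}"
  log << "A: #{fake_get('/a2')}"
end
reactor.start { log << "B: #{fake_get('/b1')}" }
reactor.run
```

Handler A contains no callbacks, yet handler B's response is processed between A's two requests, because each `fake_get` hands control back to the loop.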
  21. Enter Goliath!

  23. Chapter 3: Demo!

  24. Chapter 4: Software Design

  25. ActiveRecord and Synchrony
      ActiveRecord and mysql2 are supported in EM::Synchrony.
      So are Postgres, Mongo, Redis, Memcache, AMQP, …
  26. Dependency Hell
      class User < ActiveRecord::Base
        belongs_to :account # this is OK
        validates :email_address, length: { maximum: EmailAddress::NAME_LEN }
        ...
      end

      class EmailAddress
        def self.create_new(name_addr, default_domain = Network::DEFAULT_DOMAIN)
          ...
        end
      end

      class Network
        ...
  27. ActiveRecord CPU
      If you want maximum density in your API server, beware of ActiveRecord CPU hogs:
      • ActiveRecord object loads
      • Extensive validations and other callbacks
  28. ExceptionalSynchrony
      Gem that provides a few shims for EM::Synchrony:
      • CallbackExceptions
      • EventMachineProxy
      • ParallelSync
  29. ExceptionalSynchrony::CallbackExceptions
      EventMachine callback + errback is confusing and not DRY.

      four_square.callback do
        data = JSON.parse(four_square.response)['response']['groups'].first
        locations =
          data['items'].map do |item|
            {type: 'foursquare',
             lat: item['location']['lat'],
             lng: item['location']['lng']}
          end
        connection.close
      end

      four_square.errback do
        puts "Foursquare Query Failed"
        connection.close
      end

  31. ExceptionalSynchrony::CallbackExceptions
      Synchrony unifies the return path:

      def foursquare_query(lat, long)
        request = EM::HttpRequest.new('https://api.foursquare.com/v2/venues/search')
        http_response = request.get(query: four_square_query)

        if http_response.status == 0
          # handle timeout exception
        else
          data = JSON.parse(http_response.response)['response']['groups'].first
          result = data['items'].map do |item|
            {
              type: 'foursquare',
              lat: item['location']['lat'],
              lng: item['location']['lng']
            }
          end
        end
        connection.close
        result
      end
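  32. ExceptionalSynchrony::CallbackExceptions
      module ExceptionalSynchrony
        module CallbackExceptions
          class << self
            # Run the block and hand its result (or the exception it raised)
            # to the deferrable as the success value.
            def ensure_callback(deferrable, &block)
              result = return_exception(&block)
              deferrable.succeed(*Array(result))
            end

            private

            # Return the block's value, or the exception object if it raised.
            def return_exception
              begin
                yield
              rescue Exception => ex
                ex
              end
            end
          end
        end
      end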

  33. ExceptionalSynchrony::CallbackExceptions
      CallbackExceptions.map_deferred_result raises the exceptions:

      def foursquare_query(lat, long)
        request = EM::HttpRequest.new('https://api.foursquare.com/v2/venues/search')
        http_response = ExceptionalSynchrony::CallbackExceptions.map_deferred_result(
          request.get(:query => four_sq_query))
        data = JSON.parse(http_response.response)['response']['groups'].first
        data['items'].map do |item|
          {
            type: 'foursquare',
            lat: item['location']['lat'],
            lng: item['location']['lng']
          }
        end
      ensure
        connection.close
      end
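The pattern on the last few slides (capture an exception in callback context, pass it along as an ordinary result, then raise it back in the caller) can be sketched without EventMachine. `TinyDeferrable` and `map_result` are hypothetical stand-ins, not the gem's API, and this sketch rescues StandardError rather than Exception as the slide does.

```ruby
# Hypothetical, minimal stand-in for an EventMachine deferrable.
class TinyDeferrable
  def callback(&block)
    @callback = block
    self
  end

  def succeed(*args)
    @callback.call(*args) if @callback
  end
end

# Run the block in "callback context"; a raised error becomes the result.
# (This sketch rescues StandardError, narrower than the slide's Exception.)
def ensure_callback(deferrable)
  result =
    begin
      yield
    rescue StandardError => ex
      ex
    end
  deferrable.succeed(result)
end

# Back in the caller: re-raise a captured exception, otherwise pass the value on.
def map_result(result)
  raise result if result.is_a?(Exception)
  result
end

received = []
deferrable = TinyDeferrable.new.callback { |result| received << result }

ensure_callback(deferrable) { Integer("42") }
ensure_callback(deferrable) { Integer("not a number") }

outcomes = received.map do |result|
  begin
    map_result(result)
  rescue ArgumentError => ex
    "raised #{ex.class}"
  end
end
```

Both the success and the failure travel down the same path, so the caller can use a single begin/rescue/ensure instead of separate callback and errback blocks.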
  34. ExceptionalSynchrony::EventMachineProxy
      If an exception gets loose when EventMachine or a Synchrony fiber calls into your code, your process will … exit!
  35. ExceptionalSynchrony::EventMachineProxy
      module ExceptionalSynchrony
        class EventMachineProxy
          def add_timer(seconds)
            EM.add_timer(seconds) do
              begin
                yield
              rescue Exception => ex
                @logger.log_error("add_timer rescued #{ex.class}: #{ex}")
              end
            end
          end
          ...
        end
      end

  36. ExceptionalSynchrony::EventMachineProxy
      Exception-safe wrapped methods: add_timer, add_periodic_timer, sleep, connect, next_tick, run, stop
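A small sketch of the exception-safe wrapping idea, runnable without EventMachine. `SafeCallbacks` and its `wrap` method are hypothetical names, the standard library Logger stands in for the slide's `@logger.log_error`, and the sketch rescues StandardError only. It shows a callback body that logs a failure instead of letting it escape into the event loop.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: wraps callback bodies so errors are logged, not raised.
class SafeCallbacks
  def initialize(logger)
    @logger = logger
  end

  # Return a proc that runs the block and logs any StandardError it raises.
  def wrap(name, &block)
    proc do |*args|
      begin
        block.call(*args)
      rescue StandardError => ex
        @logger.error("#{name} rescued #{ex.class}: #{ex.message}")
        nil
      end
    end
  end
end

log_output = StringIO.new
safe = SafeCallbacks.new(Logger.new(log_output))

failing_timer = safe.wrap("add_timer") { raise ArgumentError, "boom" }
working_timer = safe.wrap("add_timer") { :fired }

failing_result = failing_timer.call # logged, does not raise
working_result = working_timer.call
```

The wrapped procs could then be handed to whatever schedules them; the failure is recorded in `log_output` and the process keeps running.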
  37. ExceptionalSynchrony::ParallelSync
      def process_bulk_request(bulk_request)
        local_requests, remote_requests = parse_bulk_request_by_shard(bulk_request)

        responses =
          ExceptionalSynchrony::ParallelSync.parallel do |parallel|
            parallel.add { @local_manager.bulk_request(local_requests) }
            remote_requests.map do |domain, remote_request|
              parallel.add { remote_manager_for(domain).bulk_request(remote_request) }
            end
          end # <-- parallel results rendezvous here

        BulkResponse.new(responses)
      end
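To make the rendezvous concrete, here is a toy version built on plain Fibers. `ToyParallel` and `Done` are hypothetical and do not reflect how ParallelSync is implemented; a bare `Fiber.yield` stands in for a job waiting on I/O. Jobs are added inside the block, run interleaved, and their results come back together in the order they were added.

```ruby
# Marks a job's final value so the runner can tell "finished" from "suspended".
Done = Struct.new(:value)

class ToyParallel
  # Collect jobs inside the block, then run them all to completion.
  def self.parallel
    runner = new
    yield runner
    runner.run_all
  end

  def initialize
    @jobs = []
  end

  # Each job runs in its own Fiber and wraps its return value in Done.
  def add(&job)
    @jobs << Fiber.new { Done.new(job.call) }
  end

  # Round-robin over unfinished jobs; results keep the order of add calls.
  def run_all
    results = Array.new(@jobs.size)
    pending = @jobs.each_with_index.to_a
    until pending.empty?
      pending = pending.reject do |fiber, index|
        outcome = fiber.resume
        results[index] = outcome.value if outcome.is_a?(Done)
        outcome.is_a?(Done)
      end
    end
    results
  end
end

trace = []
responses = ToyParallel.parallel do |parallel|
  parallel.add { trace << "local start";  Fiber.yield; trace << "local done";  :local }
  parallel.add { trace << "remote start"; Fiber.yield; trace << "remote done"; :remote }
end
```

`trace` records that the remote job starts before the local one finishes, while `responses` is only available once both are done.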

  38. Immutable Value Classes
      class BulkResponse
        attr_reader :responses

        def initialize(responses, format = :json)
          [:json, :xml].include?(format) or raise "Unknown format #{format.inspect}"
          @responses = responses
          @format = format
        end

        def to_json
          {responses: responses}.to_json
        end

        class << self
          def from_json(json, requests)
            hash = Util.parse_json(json)
            responses = hash[:responses] or raise "JSON for (#{json}) contains no responses!"
            new(responses.map { |response| Response.from_hash(response) } )
          end
        end
      end
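One way to enforce the "immutable value class" idea is to freeze the object in its constructor. The sketch below is an assumption about how that could look, not code from the talk: `PhoneResponse` and its single `phone_number` field are hypothetical, chosen to echo the RingPool response shown earlier.

```ruby
require "json"

# Hypothetical value class; the single phone_number field is assumed.
class PhoneResponse
  attr_reader :phone_number

  def initialize(phone_number)
    @phone_number = phone_number.dup.freeze
    freeze # no instance variable can be reassigned after construction
  end

  # Build an instance from a parsed JSON hash with string keys.
  def self.from_hash(hash)
    new(hash.fetch("phone_number"))
  end

  def to_h
    { "phone_number" => phone_number }
  end

  def to_json(*args)
    to_h.to_json(*args)
  end

  # Value semantics: two responses are equal when their data is equal.
  def ==(other)
    other.is_a?(self.class) && to_h == other.to_h
  end
end

original = PhoneResponse.new("(800) 555-1234")
round_trip = PhoneResponse.from_hash(JSON.parse(original.to_json))
```

Because instances never change after construction, they can be shared freely between fibers and compared by value, which also keeps tests simple.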
  42. Immutable Value Classes
      class BulkResponse
        attr_reader :responses

        def initialize(responses, format = :json)
          [:json, :xml].include?(format) or raise "Unknown format #{format.inspect}"
          @responses = responses
          @format = format
        end

        def to_json
          {responses: responses}.to_json
        end

        class << self
          def from_json(json)
            hash = Util.parse_json(json)
            responses = hash[:responses] or raise "JSON for (#{json}) contains no responses!"
            new(responses.map { |response| Response.from_hash(response) } )
          end
        end
      end
  44. Sandi Metz’s “Rules”
      1. Classes should be no longer than 100 lines of code.
      2. Methods should be no longer than 5 lines of code.
      3. Methods should have no more than 4 parameters. Options hashes count.
      4. …
  45. Singletons
      class Secrets
        def initialize
          @hash = YAML.load_file("./config/secrets.yml")
        end

        def [](key)
          @hash[key]
        end

        ...
        class << self
          def instance
            @instance ||= new
          end
        end
      end

  46. Singletons
      TL;DR: they are evil!
      • Can’t control their lifetime
      • Can’t pass constructor parameters
      • Can’t reuse them
      • Can’t easily test them
      Singletons beget more singletons!
  48. Modified Singleton
      class Secrets
        def initialize(secrets_path)
          @hash = YAML.load_file(secrets_path)
        end

        def [](key)
          @hash[key]
        end

        def set_instance
          self.class.instance = self
        end

        ...
        class << self
          attr_accessor :instance
        end
      end
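A runnable sketch of how the modified singleton might be used, for example in a test. The class body follows the slide; the temporary YAML file and the `api_key` entry are assumptions made up for the example. Because the path is a constructor parameter and the instance is assigned explicitly, a test can build its own Secrets and install or replace it at will.

```ruby
require "yaml"
require "tempfile"

class Secrets
  class << self
    # The shared instance is assigned explicitly instead of being built lazily.
    attr_accessor :instance
  end

  def initialize(secrets_path)
    @hash = YAML.load_file(secrets_path)
  end

  def [](key)
    @hash[key]
  end

  # Install this object as the shared instance and return it.
  def set_instance
    self.class.instance = self
  end
end

# Hypothetical test setup: write a throwaway secrets file and install it.
Tempfile.create(["secrets", ".yml"]) do |file|
  file.write({ "api_key" => "test-key" }.to_yaml)
  file.flush
  Secrets.new(file.path).set_instance
end

api_key = Secrets.instance["api_key"]
```

Callers still reach the object through `Secrets.instance`, but its lifetime and constructor arguments are now under the control of whoever boots the process or the test.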
  51. Chapter 5: Architecture

  53. HTTP 1.1P HTTP 1.0

  58. Success?
                           Median   90%     Requests / Sec
      3 jMeter instances   93ms     124ms   1411
      4 jMeter instances   102ms    144ms   1719
      5 jMeter instances   111ms    160ms   2021
  59. Tools
      • Minitest, FactoryGirl, Simplecov
      • Entire test suite runs in < 10s, 99.96% coverage
      • Apache jMeter for load testing
  60. Shout-outs
      Special thanks to:
      • Maintainers of EM and EM::Synchrony
      • IRC: #ruby, #rubylang, #rubyonrails (Freenode)
      • GoogleLabs, Ilya Grigorik
      • Sandi Metz
  61. Call to Action
      Try the Goliath stack for your next API server!
  62. Fin.
      @rubious_dan @ColinDKelley
      Invoca, Inc. Santa Barbara, CA and now Boulder, CO
      GitHub repos: invoca/railsconf2015 invoca/exceptional_synchrony