Off The Rails

Sam
June 05, 2015

Red Dot Ruby Conference Slides June 2015

Transcript

  1. Off The Rails
    Sam Saffron

  2. About me
    • Discourse co-founder
    • Full-time open source developer
    • Remote worker
    • Gem author
    • Performance enthusiast
    • Stack Overflow employee #8
    • samsaffron.com
    • @samsaffron

  9. Omakase v2
    Free Performance: fast_xor, fast_xs, fast_blank, hiredis
    Caching: message_bus, Anonymous cache, NGINX
    Profiling and Diagnostics: rack_mini_profiler, flamegraph, rbtrace
    Development: logster, pry, better_errors + binding_of_caller, rack-mini-profiler, fast assets, rake autospec, Live reload CSS
    Messaging: message_bus
    Logging: logster
    Serialization: active_model_serializers, oj

  10. Omakase Demo

  11. gem install message_bus

  12. MessageBus.subscribe("/chat") do |msg|
      pp msg
    end

    MessageBus.publish "/chat", ["hello", "there"]
    # global_id=1
    # message_id=1
    # channel="/chat"
    # data=["hello", "there"]

  13. Ruby:
    MessageBus.user_id_lookup do |env|
      lookup_user_id(env)
    end

    MessageBus.publish "/users/1", ["hello", "there"], user_id: 1

    JavaScript:
    MessageBus.subscribe("/users/1", function(msg){
      console.log(msg);
    });

  14. message_bus features
    • Long polling / polling
    • Security (group / user)
    • Reliable playback (catch up)
    • Efficient transport (multiplexer)
    • Multisite support
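
The "reliable playback (catch up)" bullet can be pictured with a minimal in-memory sketch: every message gets a monotonically increasing id, and a reconnecting client asks for everything after the last id it saw. This is only an illustration under that assumption, not how message_bus is implemented (the talk says it depends on Redis); `TinyBacklog` and its methods are hypothetical names.

```ruby
# Hypothetical in-memory backlog, for illustration only.
class TinyBacklog
  Message = Struct.new(:id, :channel, :data)

  def initialize
    @messages = []
    @last_id = 0
    @lock = Mutex.new
  end

  # Append a message and return its monotonically increasing id.
  def publish(channel, data)
    @lock.synchronize do
      @last_id += 1
      @messages << Message.new(@last_id, channel, data)
      @last_id
    end
  end

  # Everything published on the channel after last_seen_id, in order.
  def catch_up(channel, last_seen_id)
    @lock.synchronize do
      @messages.select { |m| m.channel == channel && m.id > last_seen_id }
    end
  end
end

backlog = TinyBacklog.new
backlog.publish("/chat", "one")
backlog.publish("/other", "ignored")
backlog.publish("/chat", "two")

# A client that last saw id 1 reconnects and asks for what it missed.
missed = backlog.catch_up("/chat", 1)
puts missed.map(&:data).inspect
```

Because ids are global and ordered, a client only has to remember one number per channel to resume without gaps.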

  15. Rack Hijack works on unicorn/puma/passenger
    run lambda { |env|
      io = env['rack.hijack'].call
      Thread.new do
        sleep 1
        io.write "HTTP/1.1 200\r\n"
        io.write "Connection: close\r\n"
        io.write "Content-Length: 2\r\n"
        io.write "\r\n"
        io.write "OK"
        io.close
      end
      [418, {}, "NOT DEFINED IN SPEC"]
    }

    % ab -c 100 -n 100 http://localhost:8080/
    …
    Percentage of the requests served within a certain time
       50%   1076
      100%   1084 (longest request)

    100 concurrent requests
    All sleep for 1 second
    Served within 1.08 seconds
    Unicorn/Puma or Passenger

  16. message_bus, ready for production
    • Uses rack.hijack (and thin.async for thin)
    • In production in Discourse for 2+ years
    • Minimal dependencies (redis and rack only)
    • Can be ported to pg or memory
    • Runs inside your Rails app as middleware, no need for extra ports / apps

  17. What about Action Cable?
    • No code available to review, but … many open questions
    • Websockets only ?!
    • Reliable message ordering ?! Ability to catch up ?!
    • Event Machine?! Celluloid?!

  18. What about web sockets?

  19. What about web sockets?
    • Initial version supported it
    • Reliable pub/sub still required
    • Fallback logic still required (6 connections per browser, fewer on phones)
    • HTTPS required
    • HAProxy hacks may be required
    • Hard to debug
    • Not significantly better than long polling
    • PR welcome

  20. Questions?

  21. http://chat.samsaffron.com

  22. gem install lru_redux

  23. cache = LruRedux::Cache.new(2)
    cache[:a] = "1"
    cache[:b] = "2"
    cache[:c] = "3"
    p cache.to_a

    # => [[:c, "3"], [:b, "2"]]

  24. lru_redux
    • LruRedux::TTL::Cache – Cache with time-to-live
    • Thread safe versions: LruRedux::ThreadSafeCache etc.
    • Fastest existing TTL and LRU cache for Ruby
    • Ruby 1.9 and up due to ordered semantics in Hash
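
The remark about ordered Hash semantics is the key to the design: since Ruby 1.9 a Hash remembers insertion order, so deleting and re-inserting a key moves it to the "newest" end and the first key is always the least recently used. A minimal sketch of that idea follows; `TinyLru` is a hypothetical name and this is not the gem's code.

```ruby
# Hypothetical TinyLru: an LRU cache built only on Hash insertion order.
class TinyLru
  def initialize(max_size)
    @max_size = max_size
    @data = {}
  end

  def []=(key, value)
    @data.delete(key)          # re-inserting moves the key to the newest position
    @data[key] = value
    # The first entry is the least recently used one; evict it when over capacity.
    @data.delete(@data.first[0]) if @data.size > @max_size
    value
  end

  def [](key)
    return nil unless @data.key?(key)
    value = @data.delete(key)  # a read also refreshes the key
    @data[key] = value
  end

  # Newest first, mirroring the output order shown on slide 23.
  def to_a
    @data.to_a.reverse
  end
end

cache = TinyLru.new(2)
cache[:a] = "1"
cache[:b] = "2"
cache[:c] = "3"
p cache.to_a
```

Both reads and writes are a constant number of Hash operations, which is why no separate linked list is needed.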

  25. Now let’s splash in some message_bus

  26. # on server 1
    @cache = DistributedCache.new("site_customization")

    # on server 2
    @cache = DistributedCache.new("site_customization")
    @cache["foo"] = "bar"

    # on server 1
    puts @cache["foo"]
    # "bar"
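
One way to picture how such a cache stays in sync: each process keeps a plain local Hash and broadcasts every write over a pub/sub channel that all instances subscribe to. The sketch below assumes exactly that and uses a hypothetical in-process `FakeBus` in place of message_bus; `TinyDistributedCache` is likewise a made-up name, not Discourse's class.

```ruby
# Hypothetical stand-in for a pub/sub bus so the sketch runs in one process.
class FakeBus
  def initialize
    @subscribers = Hash.new { |h, k| h[k] = [] }
  end

  def subscribe(channel, &block)
    @subscribers[channel] << block
  end

  def publish(channel, payload)
    @subscribers[channel].each { |subscriber| subscriber.call(payload) }
  end
end

# Each instance keeps a local Hash and applies every write it hears on the bus.
class TinyDistributedCache
  def initialize(name, bus)
    @data = {}
    @bus = bus
    @channel = "/distributed_cache/#{name}"
    @bus.subscribe(@channel) { |msg| @data[msg[:key]] = msg[:value] }
  end

  # Writes go through the bus, so the writer and its peers see the same update.
  def []=(key, value)
    @bus.publish(@channel, { key: key, value: value })
  end

  # Reads are purely local and never touch the network.
  def [](key)
    @data[key]
  end
end

bus = FakeBus.new
server1 = TinyDistributedCache.new("site_customization", bus)
server2 = TinyDistributedCache.new("site_customization", bus)

server2["foo"] = "bar"
puts server1["foo"]
```

Reads stay as cheap as a Hash lookup; only writes pay for the broadcast.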

  27. Performance is a feature
    • Discourse optimizes for adoption
    • Needs to be fast on a $10 a month VPS
    • Constant profiling and tuning is also a feature

  28. gem install memory_profiler

  29. memory_profiler crash course
    Measure memory usage for boot:

    require 'memory_profiler'

    ENV['RAILS_ENV'] = "production"
    MemoryProfiler.report do
      require '/Users/sam/Source/discourse/config/environment'
      I18n.t(:posts)
      Rails.application.routes.recognize_path('abc') rescue nil
      User.first
    end.pretty_print

  30. Allocation reports
    • Directly relates to overall performance
    • May have minimal impact on overall process memory

  31. Discourse boot (allocated)
    allocated memory by gem
    -----------------------------------
     110297190  activesupport-4.1.10
      30253712  actionpack-4.1.10
      30242485  rubygems
      17451588  2.2.2/lib
      16160100  activerecord-4.1.10
      11930648  mime-types-1.25.1
       9958591  bundler-1.10.2
       6640523  pg-0.18.1

    allocated memory by file
    -----------------------------------
      89200261  active_support/dependencies.rb
      12401276  rubygems/specification.rb
      11930648  mime-types-1.25.1/lib/mime/types.rb
       9905902  kernel_require.rb
       6544073  pg-0.18.1/lib/pg/basic_type_mapping.rb
       5681212  action_dispatch/routing/mapper.rb
       4665127  journey/gtg/builder.rb
       4417058  action_dispatch/routing/route_set.rb
       3309612  rubygems/stub_specification.rb
       3301511  active_support/core_ext/class/attribute.rb

  32. Discourse boot strings (allocated)
    40833  "\n"
        22950  active_support/dependencies.rb:247
         3418  rubygems/core_ext/kernel_require.rb:54
         …
    16423  ""
         3421  action_dispatch/journey/nodes/node.rb:33
         1643  mime/types.rb:303
         1643  mime-types-1.25.1/lib/mime/types.rb:304
         …
    13896  "rake"
        13588  bundler/spec_set.rb:111
         …
    10731  "application"
         3558  mime-types-1.25.1/lib/mime/types.rb:303
         1186  mime-types-1.25.1/lib/mime/types.rb:427
         1186  mime-types-1.25.1/lib/mime/types.rb:304
         1186  mime-types-1.25.1/lib/mime/types.rb:426

     3157  "ruby"
         1476  bundler/spec_set.rb:134
         1151  rubygems/specification.rb:1850
          276  rubygems/stub_specification.rb:21
          184  bundler-1.10.2/lib/bundler/index.rb:71

    The String "\n" is allocated 40833 times on boot!
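
The cost of repeatedly building the same small string can be seen with the standard library alone. The sketch below is an illustration, not how memory_profiler works internally: it compares a fresh string per call against one frozen constant, using the VM's running allocation total from `GC.stat`. The helper name `allocations` and the constant `NEWLINE` are assumptions made for this example.

```ruby
NEWLINE = "\n".freeze

# Count objects allocated while the block runs, using the VM's running total.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# String.new builds a fresh object on every call, much like an unfrozen
# string literal evaluated in a hot code path.
fresh = allocations { 10_000.times { String.new("\n") } }

# The frozen constant was allocated once, before measurement started.
shared = allocations { 10_000.times { NEWLINE } }

puts "fresh strings:   #{fresh} allocations"
puts "frozen constant: #{shared} allocations"
```

Exact counts vary by Ruby version, but the first figure grows with the loop count while the second stays near zero.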

  33. Retention reports
    • How many objects are left after block executes?
    • Directly relates to GC performance
    • Directly relates to memory usage

  34. Discourse boot (retained)
    retained memory by gem
    -----------------------------------
      32882335  activesupport-4.1.10
       6337743  actionpack-4.1.10
       5744771  activerecord-4.1.10
       5223057  rubygems
       3797283  2.2.2/lib
       2431896  mime-types-1.25.1
       2428612  bundler-1.10.2
       2225017  message_bus-1.0.12
       2175834  discourse/lib

    Careful: threads are expensive

    retained memory by location
    -----------------------------------
      27361858  active_support/dependencies.rb:247
       3778810  rubygems/core_ext/kernel_require.rb:54
       1050232  lib/discourse.rb:335
       1050232  message_bus-1.0.12/lib/message_bus.rb:374
       1050232  message_bus-1.0.12/lib/message_bus/timer_thread.rb:21
        715906  bundler-1.10.2/lib/bundler/runtime.rb:76
        641448  active_support/core_ext/class/attribute.rb:8
        607917  active_support/core_ext/module/delegation.rb
        434136  mime-types-1.25.1/lib/mime/types.rb:820

  35. Discourse boot (retained)
    retained objects by gem
    -----------------------------------
         69380  activesupport-4.1.10
         37701  actionpack-4.1.10
         25410  mime-types-1.25.1
         21817  rubygems
         13889  activerecord-4.1.10
         13489  2.2.2/lib
          9111  bundler-1.10.2
          6441  2.2.0
          5396  discourse/lib
          4763  tzinfo-1.2.2
          4044  active_model_serializers-0.8.3
          3487  railties-4.1.10
          3079  pg-0.18.1

    retained objects by class
    -----------------------------------
        124694  String
         33655  RubyVM::InstructionSequence
         26591  Array
          9565  Hash
          9279  Proc
          9202  RubyVM::Env
          3739  ActionDispatch::Journey::Nodes::Cat
          3575  Class
          3291  Regexp
          3188  Symbol
          1643  MIME::Type
          1592  Gem::Requirement
          1589  ActionDispatch::Journey::Nodes::Slash
          1265  Module
          1195  ActionDispatch::Journey::Nodes::Literal
          1044  Gem::Dependency

  36. Using memory profiler to optimise pluck
    # CREATE TABLE products(id int, in_stock boolean, price int, tax int)
    # ActiveRecord version 4.2.1
    ActiveRecord::Base.establish_connection({
      adapter: 'postgresql',
      database: 'test'
    })

    MemoryProfiler.report do
      Product.limit(10).pluck(:price, :tax)
    end.pretty_print

    286 objects allocated
    25K bytes allocated

  37. Interesting data points
    • The string "Product" allocated 28 times
    • Numbers are allocated as strings
    • String "price" allocated 3 times
    • ActiveRecord::Result allocates 4K

  38. What if we had no ActiveRecord?
    MemoryProfiler.report do
      raw_connection.exec("SELECT price, tax FROM products LIMIT 10").values
                    .map { |row| row.map { |col| col.to_i } }
    end.pretty_print

    44 objects allocated (was 286)
    3.7K bytes allocated (was 25K)
    21 Strings allocated, but we are only selecting numbers, WHY?

  39. PG type mapping, new in pg 0.18
    type_map = PG::BasicTypeMapForResults.new(raw_connection)

    MemoryProfiler.report do
      result = raw_connection.exec("SELECT price, tax FROM products LIMIT 10")
      result.type_map = type_map
      result.values
    end.pretty_print

    13 objects allocated (was 286)
    1.1K bytes allocated (was 25K)
    Shout out to Lars Kanis for building this

  40. Backporting Fast Pluck into Rails 4
    • 87 lines of code at: discourse/lib/freedom_patches/fast_pluck.rb
    • 100% backwards compat (works on Rails 4.1 / 4.2)
    • Uses new PG type map
    • Reduces allocations from 286 to 198
    • Reduces memory allocated from 25K to 18K
    • Will not be backported into Rails 4.2

  41. Fast pluck vs pluck
    Pluck vs Fast Pluck, ops/sec (higher is better)

    Rows     Fast Pluck   Pluck
    10            19200   17649
    100           16900   12584
    1000           7850    3468
    10000          1350     440

    2 times faster at 1000 rows

  42. The compelling sell of ActiveRecord
    cars = Car.all
    cars = cars.where(color: color) if color
    cars = cars.where('max_speed > ?', max_speed) if max_speed
    cars = cars.select('make, max_speed')
    cars.each do |car|
      puts "make: #{car.make} max_speed: #{car.max_speed}"
    end

  43. ActiveRecord
    Simple
    Elegant
    Flexible
    Inefficient
    Not SQL

  44. Can we do the same by hand?
    sql = "select * from cars "
    and_or_where = "where"
    if color
      sql << "where color = '#{PG::Connection.escape(color)}' "
      and_or_where = "and"
    end
    sql << "#{and_or_where} max_speed = '#{max_speed}'" if max_speed
    connection.exec(sql).each do |row|
      puts "make: #{row["make"]} max_speed: #{row["max_speed"]}"
    end

  45. Doing it Wrong™
    Incomprehensible
    FAST
    Risky
    Yes SQL

  46. SqlBuilder, a sane alternative
    builder = SqlBuilder.new("select * from cars /*where*/")
    builder.where("color = :color", color: color) if color
    builder.where("max_speed = :max_speed",
                  max_speed: max_speed) if max_speed
    builder.map_exec(Car).each do |row|
      puts "make: #{row.make} max_speed: #{row.max_speed}"
    end
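
The `/*where*/` marker is the heart of this approach: the SQL stays readable as SQL, and conditions collected at runtime are spliced into the marker while values travel separately as named parameters. The following is a rough sketch of that idea only; `TinySqlBuilder` and `to_sql` are hypothetical names, and the real class in discourse/lib/sql_builder.rb differs in detail (it also executes the query and maps rows).

```ruby
# Hypothetical TinySqlBuilder illustrating the /*where*/ placeholder idea.
class TinySqlBuilder
  def initialize(template)
    @template = template
    @conditions = []
    @params = {}
  end

  # Record a condition with named placeholders; values are kept separately
  # so they are never interpolated into the SQL text.
  def where(condition, params = {})
    @conditions << condition
    @params.merge!(params)
    self
  end

  # Returns [sql, params]; executing them is left to the database driver.
  def to_sql
    clause =
      if @conditions.empty?
        ""
      else
        "where " + @conditions.map { |c| "(#{c})" }.join(" and ")
      end
    [@template.sub("/*where*/", clause), @params]
  end
end

color = "red"
max_speed = nil

builder = TinySqlBuilder.new("select * from cars /*where*/")
builder.where("color = :color", color: color) if color
builder.where("max_speed = :max_speed", max_speed: max_speed) if max_speed

sql, params = builder.to_sql
puts sql
p params
```

The "where"/"and" bookkeeping from the hand-written version disappears because the builder joins the conditions itself.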

  47. SqlBuilder
    Yes SQL
    Fast
    Sane
    Tiny – approx 120 lines of code
    discourse/lib/sql_builder.rb

  48. How fast is this?
    [Bar chart: K operations/sec (2 columns per row) for Raw, SqlBuilder and Active Record at 1 row, 100 rows and 1000 rows; y-axis from 0 to 30]
    https://gist.github.com/SamSaffron/9077b632475a4fe0d57b

  49. A story about Stack Overflow

  50. “As soon I started measuring I noticed that even though the SQL for this takes 12 milliseconds, the total time it takes to execute the above code is much higher, profiling shows a 90 ms execution time.”
    … so we created Dapper …

  52. What is Dapper?
    public class Dog
    {
        public int? Age { get; set; }
        public Guid Id { get; set; }
        public string Name { get; set; }
        public float? Weight { get; set; }
        public int IgnoredProperty { get { return 1; } }
    }

    var guid = Guid.NewGuid();
    var dog = connection.Query<Dog>(
        "select Age = @Age, Id = @Id",
        new { Age = (int?)null, Id = guid });

  53. Active Record needs Dapper
    • A simple “ultra efficient” standalone SQL -> object mapper
    • Small code base
    • Standalone gem
    • Handle parameters
    • Interoperable with Rails
    • All queries run through object mapper
    • Provide a “blessed” efficient SQL story

  54. How could this look?
    # ActiveSQL
    sql = "select * from cars limit :limit"
    cars = Mapper.new(Car)
                 .query(sql, limit: 10)
    # ActiveRecord
    cars[0].make = "Ferrari"
    cars[0].save!
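
The mapping half of such a "SQL -> object mapper" can be very small. The sketch below is hypothetical and leaves the database out entirely: it takes rows shaped like what a driver might return (an array of hashes) and turns them into lightweight Struct objects. `TinyMapper` is an invented name, and saving objects back, as the slide's last two lines imagine, is not covered.

```ruby
# Hypothetical TinyMapper: raw rows in, plain Struct objects out, no ORM involved.
class TinyMapper
  def initialize(columns)
    # One Struct class per mapper, with one member per selected column.
    @row_class = Struct.new(*columns.map(&:to_sym))
  end

  def map(rows)
    rows.map do |row|
      @row_class.new(*@row_class.members.map { |member| row[member.to_s] })
    end
  end
end

# Stand-in for the result of a query such as "select make, max_speed from cars".
rows = [
  { "make" => "Ferrari", "max_speed" => 340 },
  { "make" => "Lada",    "max_speed" => 140 }
]

cars = TinyMapper.new(%w[make max_speed]).map(rows)
cars.each { |car| puts "make: #{car.make} max_speed: #{car.max_speed}" }
```

Each mapped row is a single small object with plain accessors, which keeps the allocation count per row low.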

  55. Job scheduling
    • Background jobs: Sidekiq (without fibers due to v8)
    • Regular jobs: Discourse Scheduler
    • Lightweight jobs: Discourse Defer, runs between requests

  56. Lightweight jobs
    ObjectSpace.each_object(Unicorn::HttpServer) do |s|
      s.extend(Scheduler::Defer::Unicorn)
    end

    module Unicorn
      def process_client(client)
        Defer.pause
        super(client)
        Defer.do_all_work
        Defer.resume
      end
    end

  57. Lightweight jobs
    Scheduler::Defer.later "track view" do
      track_page_view(topic, user)
    end
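
The pattern on these two slides is "record the work during the request, run it once the response is out". A minimal single-threaded sketch of that pattern follows, assuming nothing about Discourse's actual Scheduler::Defer beyond the `later` and `do_all_work` names shown above; `TinyDefer` itself is hypothetical and omits the pause/resume handling.

```ruby
# Hypothetical TinyDefer: queue cheap jobs during a request, run them afterwards.
class TinyDefer
  def initialize
    @queue = Queue.new
  end

  # Enqueue a block with a description; nothing runs yet.
  def later(description, &block)
    @queue << [description, block]
  end

  # Drain the queue; one failing job must not stop the rest.
  def do_all_work
    until @queue.empty?
      description, job = @queue.pop
      begin
        job.call
      rescue StandardError => e
        warn "deferred job #{description.inspect} failed: #{e.message}"
      end
    end
  end
end

defer = TinyDefer.new
views = 0

# During the request: record the intent only.
defer.later("track view") { views += 1 }
defer.later("broken job") { raise "boom" }
defer.later("track view") { views += 1 }

# After the response has been written: do the work.
defer.do_all_work
puts views
```

The user-facing response never waits on bookkeeping such as view tracking, and no external job queue is involved.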

  58. Source maps in production
    • Using uglifyjs directly is significantly faster, 6x faster on some files
    • GSOC project to improve this
    • uglifyjs command line makes it simple to add source maps
    • Discourse -> lib/tasks/assets.rake

    def compress_node(from, to)
      to_path = "#{assets_path}/#{to}"
      source_map_root = (d = File.dirname(from)) == "." ? "/assets" : "/assets/#{d}"
      cmd = "uglifyjs '#{assets_path}/#{from}' -p relative -c -m -o '#{to_path}' --source-map-root '#{source_map_root}' --source-map '#{assets_path}/#{to}.map' --source-map-url '/assets/#{to}.map'"
      STDERR.puts cmd
      `#{cmd} 2>&1`
    end

  59. Anonymous Cache
    • Drastically improves performance
    • Before: 52 ms per request to home page
    • After: 1.6 ms per request to home page
    • Redis backend (redis.setex)
    • lib/middleware/anonymous_cache.rb

  60. Anonymous Cache
    • Vast majority of traffic is anonymous
    • Caching logic is tricky:
      • Is it mobile?
      • Is it a web crawler?
      • Is user logged on?
    • Implemented as Rack middleware, early in chain
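
A rough sketch of such a middleware, written against the plain Rack calling convention (an object with `call(env)` returning status, headers and body) so it runs without any gems. Everything here is an assumption for illustration: the class name `TinyAnonymousCache`, the `_t` cookie used to detect a logged-in user, the user-agent patterns, and the in-memory Hash standing in for the Redis backend the talk mentions. It is not the code in lib/middleware/anonymous_cache.rb.

```ruby
# Hypothetical TinyAnonymousCache: caches GET responses for anonymous requests.
class TinyAnonymousCache
  def initialize(app, ttl: 60, clock: -> { Time.now.to_f })
    @app = app
    @ttl = ttl
    @clock = clock
    @store = {} # the talk uses Redis with setex; a Hash keeps the sketch standalone
  end

  def call(env)
    return @app.call(env) unless cacheable?(env)

    key = cache_key(env)
    entry = @store[key]
    return entry[:response] if entry && entry[:expires_at] > @clock.call

    response = @app.call(env)
    @store[key] = { response: response, expires_at: @clock.call + @ttl }
    response
  end

  private

  # Assumption: a "_t" cookie marks a logged-in user, who must never get cached pages.
  def cacheable?(env)
    env["REQUEST_METHOD"] == "GET" && !env["HTTP_COOKIE"].to_s.include?("_t=")
  end

  # Mobile and crawler requests get their own entries, following the slide's checklist.
  def cache_key(env)
    agent = env["HTTP_USER_AGENT"].to_s
    mobile = agent.match?(/Mobile/i)
    crawler = agent.match?(/bot|crawler|spider/i)
    "#{env['PATH_INFO']}|m=#{mobile}|c=#{crawler}"
  end
end

hits = 0
app = ->(env) { hits += 1; [200, { "content-type" => "text/html" }, ["home page"]] }
cached = TinyAnonymousCache.new(app)

anonymous = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/" }
logged_in = anonymous.merge("HTTP_COOKIE" => "_t=abc")

3.times { cached.call(anonymous) }
2.times { cached.call(logged_in) }
puts hits
```

Sitting early in the chain matters: a cache hit returns before sessions, routing or the database are touched. Storing the response triple as-is only works for simple array bodies; a real implementation would have to serialize the body.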

  61. Optimizing for Development
    • rake autospec
    • better_errors
    • rack-mini-profiler
    • logster
    • Fast browser reload times
    • Live CSS refresh thanks to message_bus

  62. rake autospec
    • Like guard but…
    • Interrupt existing test runs
    • Focus on failure
    • Demo

  63. Faster browser reloads in Development
    915 files

  65. Yay, it loads fast!

    :( One big giant hard-to-debug file

  66. eval("var hello=1;\ndebugger;\n//# sourceURL=testing_sourceURL.js")

  67. //# sourceURL
    • Must be used from JavaScript eval
    • Supported in IE11 / Firefox and Chrome
    • Easily backported into Asset Pipeline
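
On the server side the trick amounts to wrapping each file's source in an `eval` call with a trailing `//# sourceURL=` comment before concatenating. The sketch below shows one way that wrapping could be done; `wrap_with_source_url` and the file names are hypothetical, and this is not the code Discourse or the Asset Pipeline uses.

```ruby
require "json"

# Hypothetical helper: wrap one JavaScript file in eval() with a sourceURL
# comment so a debugger lists it under its own name inside a concatenated bundle.
def wrap_with_source_url(source, logical_path)
  annotated = "#{source}\n//# sourceURL=#{logical_path}"
  # String#to_json produces a quoted string with newlines and quotes escaped.
  "eval(#{annotated.to_json});\n"
end

bundle = [
  ["app/hello.js", "var hello = 1;"],
  ["app/world.js", "var world = \"w\";"]
].map { |path, js| wrap_with_source_url(js, path) }.join

puts bundle
```

The browser still downloads one file, but each eval'd chunk shows up in the debugger under its own path.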

  68. Awesome fast debugging for Rails
    Shout out to Robin Ward for building this

  69. sourceURL performance
    • 900 js assets
    • First Paint 1.4 seconds (was 30 seconds)
    • DOM loaded 1.7 seconds (was 60 seconds)
    • High latency friendly
    • Can be easily added to Asset Pipeline

  70. Choose your own adventure
    • Try out other frameworks and gems
    • Find your bottlenecks
    • Have fun
    • Experiment

  71. We can make our applications faster

  72. We can make our frameworks consume less memory

  73. We have the tooling