Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Speed Science

Speed Science

Run your app faster, with less RAM and a quicker boot time today. How? With science! In this talk we'll focus on the process of turning performance problems into reproducible bugs we can understand and squash. We'll look at real world use cases where small changes resulted in huge real world performance gains. You'll walk away with concrete and actionable advice to improve the speed of your app, and with the tools to equip your app for a lifetime of speed. Live life in the fast lane, with science!

Richard Schneeman

April 21, 2015
Tweet

More Decks by Richard Schneeman

Other Decks in Programming

Transcript


  1. 2015
    yo
    Speed
    Science
    @schneems

    View Slide

  2. What is our
    goal?

    View Slide

  3. What is our
    goal?

    View Slide

  4. View Slide

  5. Slow
    is
    a
    bug

    View Slide

  6. Bugs can be
    reproduced

    View Slide

  7. Once
    reproduced
    we can
    squash

    View Slide

  8. What is this
    shape?

    View Slide

  9. View Slide

  10. Fast
    Angles

    View Slide

  11. View Slide

  12. Square
    Edges

    View Slide

  13. View Slide

  14. Square!

    View Slide

  15. Programming
    has a speed
    square

    View Slide

  16. IO
    RAM
    CPU

    View Slide

  17. Trade
    resources
    for speed

    View Slide

  18. Likewise

    View Slide

  19. Trade one
    resource for
    another

    View Slide

  20. How do we
    make things
    fast?

    View Slide

  21. Find the
    bottleneck

    View Slide

  22. View Slide

  23. We can use
    the scientific
    method!

    View Slide

  24. View Slide

  25. BTW

    View Slide

  26. They
    Call me
    @Schneems

    View Slide

  27. I
    Love
    Ruby

    View Slide

  28. View Slide

  29. Ruby
    Schneems

    View Slide

  30. Ruby
    Python

    View Slide

  31. Human.new

    View Slide

  32. Names?

    View Slide

  33. Girl?
    Boy?

    View Slide

  34. Sudo
    clean up your
    room

    View Slide

  35. View Slide

  36. Booth
    Koichi

    Terence

    View Slide

  37. Rails
    Commit
    Bit

    View Slide

  38. View Slide

  39. Once upon a
    time

    View Slide

  40. There was an
    app

    View Slide

  41. Downloads a
    large .tgz file

    View Slide

  42. Decompres
    File

    View Slide

  43. Fast?

    View Slide

  44. Pull
    Faster!

    View Slide

  45. download
    unzip
    process
    IO

    View Slide

  46. download
    unzip
    process
    IO
    IO
    CPU

    View Slide

  47. download
    unzip
    process
    IO
    IO
    CPU
    RAM
    CPU

    View Slide

  48. Hypothesis:
    Not Enough
    CPU

    View Slide

  49. Bring in the
    big guns

    View Slide

  50. PX DYNO

    View Slide

  51. 6 GB of RAM

    View Slide

  52. 1 PX Dyno
    ==
    1 AWS
    instance

    View Slide

  53. No Sharing

    View Slide

  54. More threads
    ==
    More CPU
    load

    View Slide

  55. Sidekiq
    worker

    View Slide

  56. config:set
    SIDEKIQ_CONCURRENCY
    =
    30

    View Slide

  57. download
    unzip
    process
    30x

    View Slide

  58. Result?

    View Slide

  59. Faster
    but…

    View Slide

  60. View Slide

  61. Theory:
    Not using all
    available
    CPU

    View Slide

  62. config:set
    SIDEKIQ_CONCURRENCY
    =
    30

    View Slide

  63. config:set
    SIDEKIQ_CONCURRENCY
    =
    30
    60

    View Slide

  64. Result?

    View Slide

  65. ER

    View Slide

  66. View Slide

  67. Why?

    View Slide

  68. Not

    CPU bound

    View Slide

  69. IO
    RAM
    CPU

    View Slide

  70. download
    unzip
    process
    IO
    IO
    CPU
    RAM
    CPU

    View Slide

  71. download
    unzip
    process
    IO
    IO
    CPU
    RAM
    CPU

    View Slide

  72. download
    unzip
    process
    IO
    IO
    Network & Disk

    View Slide

  73. download
    unzip
    process
    IO
    IO
    Network & Disk
    Disk

    View Slide

  74. Theory:
    Network is
    Slow

    View Slide

  75. Tried:
    Curl,
    Curb,
    Excon
    Typhoeus

    View Slide

  76. What is the
    solution to all
    slow things?

    View Slide

  77. GO

    View Slide

  78. View Slide

  79. Sorry GO is
    not the
    answer

    View Slide

  80. Network IO
    wasn’t the
    problem

    View Slide

  81. Maybe, it’s
    disk IO?

    View Slide

  82. config:set
    SIDEKIQ_CONCURRENCY
    =
    30

    View Slide

  83. config:set
    SIDEKIQ_CONCURRENCY
    =
    30
    4

    View Slide

  84. View Slide

  85. The Disk was
    at max read/
    write
    capacity

    View Slide

  86. Threads were
    sitting idle
    due to disk
    contention

    View Slide

  87. Ended up
    using 2x
    dynos instead

    View Slide

  88. Moral of the story:
    Blindly trying
    things works, but
    it’s not very
    scientific

    View Slide

  89. View Slide

  90. View Slide

  91. View Slide

  92. Good Science
    is repeatable

    View Slide

  93. View Slide

  94. Not
    Cold Fusion

    View Slide

  95. Not
    Repeatable

    View Slide

  96. Not
    Good Science

    View Slide

  97. Measure
    &
    Benchmark

    View Slide

  98. Heroku
    Added
    Metrics

    View Slide

  99. Speed
    Throughput
    CPU
    RAM

    View Slide

  100. High Level
    metrics =>
    Low level
    metrics

    View Slide

  101. Reproduce
    the slow

    View Slide

  102. RAM

    View Slide

  103. Ruby is
    typically
    RAM bound

    View Slide

  104. Ruby is a
    “managed”
    Language

    View Slide

  105. Don’t manually
    allocate, use
    Garbage
    Collector
    instead

    View Slide

  106. 2 causes for
    increased
    memory use

    View Slide

  107. Retained
    Objects
    1)

    View Slide

  108. RETAINED = []
    1.upto(100_000_000).each do |i|
    RETAINED << “#{i}”
    end

    View Slide

  109. 7.32 GB

    View Slide

  110. WHY?

    View Slide

  111. Memory

    View Slide

  112. Memory
    Object slots

    View Slide

  113. Memory
    “1” “2” “3”“4” “5” “6” “7”
    RETAINED = []
    1.upto(100_000_000).each do |i|
    RETAINED << “#{i}”
    end

    View Slide

  114. Out of slots!
    Allocate more
    memory

    View Slide

  115. Memory
    “1” “2” “3”“4” “5” “6” “7”
    “10”
    “8” “9” “11” “12” “13” “14”

    View Slide

  116. Allocated
    Objects
    1)

    View Slide

  117. 1.upto(100_000_000).each do |i|
    puts "#{i}"
    end

    View Slide

  118. 21.8 MB

    View Slide

  119. Memory
    “1” “2” “3”“4” “5” “6” “7”

    View Slide

  120. Out of slots!
    Run GC!

    View Slide

  121. Memory
    “2” “3”“4” “5” “6” “7”
    1.upto(100_000_000).each do |i|
    puts "#{i}"
    end
    “1”

    View Slide

  122. Memory
    “2” “3”“4” “5” “6” “7”
    1.upto(100_000_000).each do |i|
    puts "#{i}"
    end

    View Slide

  123. Memory
    “8” “3”“4” “5” “6” “7”
    1.upto(100_000_000).each do |i|
    puts "#{i}"
    end
    “2”

    View Slide

  124. GC will save
    us!

    View Slide

  125. def make_objects
    array = []
    1.upto(100_000_000).each do |i|
    array << "#{i}"
    end
    return nil
    end
    make_objects
    GC.start(full_mark: true,
    immediate_sweep: true)

    View Slide

  126. def make_objects
    array = []
    1.upto(100_000_000).each do |i|
    array << "#{i}"
    end
    return nil
    end
    make_objects
    GC.start(full_mark: true,
    immediate_sweep: true)

    View Slide

  127. def make_objects
    array = []
    1.upto(100_000_000).each do |i|
    array << "#{i}"
    end
    return nil
    end
    make_objects
    GC.start(full_mark: true,
    immediate_sweep: true)

    View Slide

  128. Result?

    View Slide

  129. 7.32 GB

    View Slide

  130. WAT

    View Slide

  131. GC.start(full_mark: true, immediate_sweep: true)
    GC.disable
    # Code goes here
    before = GC.stat[:total_freed_objects]
    GC.enable
    GC.start(full_mark: true, immediate_sweep: true)
    after = GC.stat[:total_freed_objects]
    # => “Objects cleared by GC: 100_010_687”

    View Slide

  132. Cannot open
    slots while
    reference
    exists

    View Slide

  133. Memory
    “1” “2” “3”“4” “5” “6” “7”
    “10”
    “8” “9” “11” “12” “13” “14”

    View Slide

  134. Run
    GC

    View Slide

  135. Memory
    “1” “2” “3”“4” “5” “6” “7”
    “10”
    “8” “9” “11” “12” “13” “14”

    View Slide

  136. Ruby
    NEVER
    releases
    memory

    View Slide

  137. Memory

    View Slide

  138. Test
    Everything
    (even these
    slides)

    View Slide

  139. View Slide

  140. How does this
    affect my
    Rails app?

    View Slide

  141. Retained
    App

    View Slide

  142. Retained
    Allocated
    App

    View Slide

  143. Mem
    Usage
    App

    View Slide

  144. Request
    Starts

    View Slide

  145. Request Full
    .

    View Slide

  146. Request Full
    Full
    ..

    View Slide

  147. Request Full
    Full
    Full
    ...

    View Slide

  148. Request Full
    Full
    Full
    Full
    ....

    View Slide

  149. Request Full
    Full
    Full
    Full
    Full
    ......

    View Slide

  150. Out of
    memory
    Full
    Full
    Full
    Full
    Full

    View Slide

  151. Run GC,
    All objs
    needed
    0_o
    Full
    Full
    Full
    Full
    Full

    View Slide

  152. Allocate
    mem
    Full
    Full
    Full
    Full
    Full

    View Slide

  153. Request
    Done
    Full
    Full
    Full
    Full
    Full
    ........

    View Slide

  154. Eventually

    View Slide

  155. GC
    Objs
    not used
    Full
    Full
    Full
    Full
    Full

    View Slide

  156. GC
    Objs
    not used

    View Slide

  157. Memory
    increases even
    without
    object retention

    View Slide

  158. Object creation/
    deletion
    takes
    RAM

    View Slide

  159. Object creation/
    deletion
    takes
    Time

    View Slide

  160. Further
    Reading:

    View Slide

  161. Generational
    GC

    View Slide

  162. I want to
    share a
    customer
    story

    View Slide

  163. One day
    a ticket came
    in

    View Slide

  164. My app is
    slow

    View Slide

  165. I checked
    metrics

    View Slide

  166. View Slide

  167. They were
    using too much
    RAM
    and hitting
    swap

    View Slide

  168. swap

    View Slide

  169. Why?

    View Slide

  170. “didn’t used
    to happen”

    View Slide

  171. “we didn’t
    change
    anything”

    View Slide

  172. Hypothesis
    #1

    View Slide

  173. RAM increase
    due to
    misbehaving
    code in a Gem

    View Slide

  174. How do we
    test?

    View Slide

  175. Boot app, hit
    with requests,
    profile
    memory

    View Slide

  176. Derailed
    Benchmarks

    View Slide

  177. Boot the app

    View Slide

  178. Process a
    request

    View Slide

  179. Benchmark
    with
    Memory
    Profiler

    View Slide

  180. $ rake perf:mem

    View Slide

  181. I did this in
    codetriage.com
    sent
    instructions to
    customer

    View Slide

  182. allocated memory by gem
    -----------------------------------
    rack-1.5.2 x 6491
    actionpack-4.1.2 x 4888
    hashie-3.3.1 x 4735 <==============
    activesupport-4.1.2 x 3292
    omniauth-1.2.2 x 1387
    actionview-4.1.2 x 1107
    ruby-2.1.2/lib x 1097
    railties-4.1.2 x 925
    activerecord-4.1.2 x 440
    warden-1.2.3 x 200
    codetriage-test-app/app x 40
    other x 40

    View Slide

  183. Hashie is
    incredibly
    expensive

    View Slide

  184. Lots of un-
    needed
    objects

    View Slide

  185. Omniauth
    Uses
    Hashie

    View Slide

  186. Mitigated in
    Omniauth
    #774

    View Slide

  187. App still over
    RAM limit

    View Slide

  188. Hypothesis
    #2

    View Slide

  189. Bad gems
    using too
    much RAM at
    require time

    View Slide

  190. How do we
    test?

    View Slide

  191. Customer
    sends me their
    Gemfile &
    Gemfile.lock

    View Slide

  192. There’s no
    tooling for
    benchmarking
    require memory
    use

    View Slide

  193. Let’s write
    one!

    View Slide

  194. Concept:
    measure RAM
    before
    and after
    require

    View Slide

  195. Use
    GetProcessMem.new
    to measure RSS

    View Slide

  196. Resident
    Set
    Size

    View Slide

  197. A close-ish
    approximation
    for memory use

    View Slide

  198. Doesn’t take
    into account
    shared
    memory

    View Slide

  199. Monkey
    Patch
    Kernel.require

    View Slide

  200. View Slide

  201. active_record/connection_adapters/postgresql_adapter: 2.2734375
    active_record/connection_adapters/postgresql_adapter: 2.2734 mb
    active_record/connection_adapters/postgresql/oid: 0.8555 mb
    will_paginate/active_record: 1.4921875
    will_paginate/active_record: 1.4922 mb
    active_support/all: 1.01171875
    active_support/all: 1.0117 mb
    active_support/core_ext: 0.9727 mb
    rack/test: 0.875
    rack/test: 0.875 mb
    /Users/richardschneeman/Documents/projects/codetriage/app/models/user: 0.734375
    /Users/richardschneeman/Documents/projects/codetriage/app/models/user: 0.7344 mb
    /Users/richardschneeman/Documents/projects/codetriage/app/models/repo: 0.50390625
    /Users/richardschneeman/Documents/projects/codetriage/app/models/repo: 0.5039 mb
    new_relic/control/frameworks/rails4: 0.42578125
    new_relic/control/frameworks/rails4: 0.4258 mb
    new_relic/control/frameworks/rails3: 0.3867 mb
    benchmark/ips: 0.4140625
    benchmark/ips: 0.4141 mb
    will_paginate/view_helpers/action_view: 0.32421875
    will_paginate/view_helpers/action_view: 0.3242 mb
    /Users/richardschneeman/Documents/projects/codetriage/app/models/issue: 0.31640625
    /Users/richardschneeman/Documents/projects/codetriage/app/models/issue: 0.3164 mb
    /Users/richardschneeman/Documents/projects/codetriage/app/models/repo_subscription: 0.2734375
    /Users/richardschneeman/.gem/ruby/2.2.1/gems/devise-3.4.1/app/mailers/devise/mailer: 0.265625
    sprockets/base: 0.23046875
    active_support/cache/dalli_store: 0.22265625
    /Users/richardschneeman/Documents/projects/codetriage/app/models/issue_assignment: 0.18359375
    /Users/richardschneeman/Documents/projects/codetriage/app/controllers/users/after_signup_controller:
    0.18359375
    /Users/richardschneeman/.gem/ruby/2.2.1/gems/wicked-1.0.0/app/controllers/wicked/wizard_controller:
    0.1796875
    /Users/richardschneeman/Documents/projects/codetriage/app/mailers/user_mailer: 0.1796875
    /Users/richardschneeman/.gem/ruby/2.2.1/gems/devise-3.4.1/app/controllers/devise/
    confirmations_controller: 0.17578125
    /Users/richardschneeman/Documents/projects/codetriage/app/controllers/repos_controller: 0.17578125

    View Slide

  202. active_record/connection_adapters/postgresql_adapter: 2.2734375
    active_record/connection_adapters/postgresql_adapter: 2.2734 mb
    active_record/connection_adapters/postgresql/oid: 0.8555 mb
    will_paginate/active_record: 1.4921875
    will_paginate/active_record: 1.4922 mb
    active_support/all: 1.01171875
    active_support/all: 1.0117 mb
    active_support/core_ext: 0.9727 mb
    rack/test: 0.875
    rack/test: 0.875 mb
    /Users/richardschneeman/Documents/projects/codetriage/app/models/user: 0.734375
    /Users/richardschneeman/Documents/projects/codetriage/app/models/user: 0.7344 mb
    /Users/richardschneeman/Documents/projects/codetriage/app/models/repo: 0.50390625
    /Users/richardschneeman/Documents/projects/codetriage/app/models/repo: 0.5039 mb
    new_relic/control/frameworks/rails4: 0.42578125
    new_relic/control/frameworks/rails4: 0.4258 mb
    new_relic/control/frameworks/rails3: 0.3867 mb
    benchmark/ips: 0.4140625
    benchmark/ips: 0.4141 mb
    will_paginate/view_helpers/action_view: 0.32421875
    will_paginate/view_helpers/action_view: 0.3242 mb
    /Users/richardschneeman/Documents/projects/codetriage/app/models/issue: 0.31640625
    /Users/richardschneeman/Documents/projects/codetriage/app/models/issue: 0.3164 mb
    /Users/richardschneeman/Documents/projects/codetriage/app/models/repo_subscription: 0.2734375
    /Users/richardschneeman/.gem/ruby/2.2.1/gems/devise-3.4.1/app/mailers/devise/mailer: 0.265625
    sprockets/base: 0.23046875
    active_support/cache/dalli_store: 0.22265625
    /Users/richardschneeman/Documents/projects/codetriage/app/models/issue_assignment: 0.18359375
    /Users/richardschneeman/Documents/projects/codetriage/app/controllers/users/after_signup_controller:
    0.18359375
    /Users/richardschneeman/.gem/ruby/2.2.1/gems/wicked-1.0.0/app/controllers/wicked/wizard_controller:
    0.1796875
    /Users/richardschneeman/Documents/projects/codetriage/app/mailers/user_mailer: 0.1796875
    /Users/richardschneeman/.gem/ruby/2.2.1/gems/devise-3.4.1/app/controllers/devise/
    confirmations_controller: 0.17578125
    /Users/richardschneeman/Documents/projects/codetriage/app/controllers/repos_controller: 0.17578125

    View Slide

  203. Now we
    know what is
    using
    memory!

    View Slide

  204. mail: 39.2734 mb

    View Slide

  205. 1 Gem
    accounts for
    65% of total
    RAM at boot

    View Slide

  206. Remember:
    We haven’t
    even done
    any work

    View Slide

  207. Dig deeper

    View Slide

  208. Requiring a
    gem requires
    other files.

    View Slide

  209. We need more
    visibility

    View Slide

  210. We need a
    tree

    View Slide

  211. Each node
    can have
    many
    children

    View Slide

  212. Each child
    has a cost

    View Slide

  213. Mail
    mime/types
    mime/type
    mime json
    mail/message
    YAML
    net/smtp
    digest/
    md5
    timeout net/
    protocol

    View Slide

  214. View Slide

  215. Sort the
    children by
    cost & print
    recursively

    View Slide

  216. View Slide

  217. Final Code

    View Slide

  218. View Slide

  219. View Slide

  220. View Slide

  221. View Slide

  222. View Slide

  223. View Slide

  224. View Slide

  225. View Slide

  226. View Slide

  227. Final Result

    View Slide

  228. application: 60.8242 mb
    mail: 39.2734 mb
    mail/parsers: 19.2461 mb
    mail/parsers/ragel: 18.7031 mb
    mail/parsers/ragel/ruby: 18.6797 mb
    mail/parsers/ragel/ruby/machines/address_lists_machine: 7.2734 mb
    mail/parsers/ragel/ruby/machines/received_machine: 4.7578 mb
    mail/parsers/ragel/ruby/machines/envelope_from_machine: 2.2305 mb
    mail/parsers/ragel/ruby/machines/message_ids_machine: 1.5625 mb
    mail/parsers/ragel/ruby/machines/date_time_machine: 0.5977 mb
    mail/parsers/ragel/ruby/machines/content_disposition_machine: 0.4961 mb
    mail/parsers/ragel/ruby/machines/content_type_machine: 0.4648 mb
    mail/parsers/ragel/ruby/machines/content_location_machine: 0.3359 mb
    mail/parsers/ragel/ruby/machines/content_transfer_encoding_machine: 0.3281 mb
    mail/parsers/ragel/ruby/machines/phrase_lists_machine: 0.3086 mb
    mime/types: 18.4922 mb
    mail/field: 0.3125 mb

    View Slide

  229. application: 60.8242 mb
    mail: 39.2734 mb
    mail/parsers: 19.2461 mb
    mail/parsers/ragel: 18.7031 mb
    mail/parsers/ragel/ruby: 18.6797 mb
    mail/parsers/ragel/ruby/machines/address_lists_machine: 7.2734 mb
    mail/parsers/ragel/ruby/machines/received_machine: 4.7578 mb
    mail/parsers/ragel/ruby/machines/envelope_from_machine: 2.2305 mb
    mail/parsers/ragel/ruby/machines/message_ids_machine: 1.5625 mb
    mail/parsers/ragel/ruby/machines/date_time_machine: 0.5977 mb
    mail/parsers/ragel/ruby/machines/content_disposition_machine: 0.4961 mb
    mail/parsers/ragel/ruby/machines/content_type_machine: 0.4648 mb
    mail/parsers/ragel/ruby/machines/content_location_machine: 0.3359 mb
    mail/parsers/ragel/ruby/machines/content_transfer_encoding_machine: 0.3281 mb
    mail/parsers/ragel/ruby/machines/phrase_lists_machine: 0.3086 mb
    mime/types: 18.4922 mb
    mail/field: 0.3125 mb

    View Slide

  230. Opened
    an
    Issue

    View Slide

  231. Fixed in
    mikel/mail
    #817

    View Slide

  232. Customer’s
    App RAM use
    drops
    dramatically

    View Slide

  233. No longer
    Swapping

    View Slide

  234. Went from

    View Slide

  235. View Slide

  236. To

    View Slide

  237. View Slide

  238. Reproduce
    the slowness
    bug

    View Slide

  239. Get visibility
    with
    benchmarks

    View Slide

  240. Use
    Science!

    View Slide

  241. Have you
    ever made a
    bad deploy on
    a Friday?

    View Slide

  242. You try to fix
    it but…

    View Slide

  243. View Slide

  244. Your
    weekend

    View Slide

  245. Co-workers
    weekend

    View Slide

  246. Support’s
    weekend

    View Slide

  247. Isn’t there a
    better way?

    View Slide

  248. My proposal:

    View Slide

  249. Don’t
    Deploy
    on

    Friday

    View Slide

  250. Instead

    View Slide

  251. Funday
    Friday

    View Slide

  252. Is your app
    slow?

    View Slide

  253. Funday
    Friday

    View Slide

  254. Is your app
    disorganized
    ???

    View Slide

  255. Funday
    Friday

    View Slide

  256. Do you wish
    you had more
    time to
    refactor?

    View Slide

  257. Funday
    Friday

    View Slide

  258. Wish

    sponsored
    Open Source?

    View Slide

  259. Guess what?

    View Slide

  260. Your
    company just
    sponsored
    you!

    View Slide

  261. Funday
    Friday

    View Slide

  262. Not sold?

    View Slide

  263. More
    accurate
    deadlines

    View Slide

  264. Less burnout

    View Slide

  265. Does your boss
    want to hire
    more senior
    developers?

    View Slide

  266. Like the kind that
    refactors, makes
    speed
    improvements, and
    knows library
    internals?

    View Slide

  267. Train up
    senior
    developers
    instead!

    View Slide

  268. Funday
    Friday
    Funday
    Friday

    View Slide

  269. Report on
    your progress
    to your team

    View Slide

  270. Report to your
    boss

    View Slide

  271. Report to
    twitter
    #fundayfriday

    View Slide

  272. Get
    ❤️ ❤️ ❤️

    View Slide

  273. Give
    ❤️ ❤️ ❤️

    View Slide

  274. Did science
    build the
    pyramids?

    View Slide

  275. Coincidence?

    View Slide

  276. View Slide