Speed Science

Speed Science

Run your app faster, with less RAM and a quicker boot time today. How? With science! In this talk we'll focus on the process of turning performance problems into reproducible bugs we can understand and squash. We'll look at real world use cases where small changes resulted in huge real world performance gains. You'll walk away with concrete and actionable advice to improve the speed of your app, and with the tools to equip your app for a lifetime of speed. Live life in the fast lane, with science!

Db953d125f5cc49756edb6149f1b813e?s=128

Richard Schneeman

April 21, 2015
Tweet

Transcript

  1. 2015 yo Speed Science @schneems

  2. What is our goal?

  3. What is our goal?

  4. None
  5. Slow is a bug

  6. Bugs can be reproduced

  7. Once reproduced we can squash

  8. What is this shape?

  9. None
  10. Fast Angles

  11. None
  12. Square Edges

  13. None
  14. Square!

  15. Programming has a speed square

  16. IO RAM CPU

  17. Trade resources for speed

  18. Likewise

  19. Trade one resource for another

  20. How do we make things fast?

  21. Find the bottleneck

  22. None
  23. We can use the scientific method!

  24. None
  25. BTW

  26. They Call me @Schneems

  27. I Love Ruby

  28. None
  29. Ruby Schneems

  30. Ruby Python

  31. Human.new

  32. Names?

  33. Girl? Boy?

  34. Sudo clean up your room

  35. None
  36. Booth Koichi
 Terence

  37. Rails Commit Bit

  38. None
  39. Once upon a time

  40. There was an app

  41. Downloads a large .tgz file

  42. Decompres File

  43. Fast?

  44. Pull Faster!

  45. download unzip process IO

  46. download unzip process IO IO CPU

  47. download unzip process IO IO CPU RAM CPU

  48. Hypothesis: Not Enough CPU

  49. Bring in the big guns

  50. PX DYNO

  51. 6 GB of RAM

  52. 1 PX Dyno == 1 AWS instance

  53. No Sharing

  54. More threads == More CPU load

  55. Sidekiq worker

  56. config:set SIDEKIQ_CONCURRENCY = 30

  57. download unzip process 30x

  58. Result?

  59. Faster but…

  60. None
  61. Theory: Not using all available CPU

  62. config:set SIDEKIQ_CONCURRENCY = 30

  63. config:set SIDEKIQ_CONCURRENCY = 30 60

  64. Result?

  65. ER

  66. None
  67. Why?

  68. Not
 CPU bound

  69. IO RAM CPU

  70. download unzip process IO IO CPU RAM CPU

  71. download unzip process IO IO CPU RAM CPU

  72. download unzip process IO IO Network & Disk

  73. download unzip process IO IO Network & Disk Disk

  74. Theory: Network is Slow

  75. Tried: Curl, Curb, Excon Typhoeus

  76. What is the solution to all slow things?

  77. GO

  78. None
  79. Sorry GO is not the answer

  80. Network IO wasn’t the problem

  81. Maybe, it’s disk IO?

  82. config:set SIDEKIQ_CONCURRENCY = 30

  83. config:set SIDEKIQ_CONCURRENCY = 30 4

  84. None
  85. The Disk was at max read/ write capacity

  86. Threads were sitting idle due to disk contention

  87. Ended up using 2x dynos instead

  88. Moral of the story: Blindly trying things works, but it’s

    not very scientific
  89. None
  90. None
  91. None
  92. Good Science is repeatable

  93. None
  94. Not Cold Fusion

  95. Not Repeatable

  96. Not Good Science

  97. Measure & Benchmark

  98. Heroku Added Metrics

  99. Speed Throughput CPU RAM

  100. High Level metrics => Low level metrics

  101. Reproduce the slow

  102. RAM

  103. Ruby is typically RAM bound

  104. Ruby is a “managed” Language

  105. Don’t manually allocate, use Garbage Collector instead

  106. 2 causes for increased memory use

  107. Retained Objects 1)

  108. RETAINED = [] 1.upto(100_000_000).each do |i| RETAINED << “#{i}” end

  109. 7.32 GB

  110. WHY?

  111. Memory

  112. Memory Object slots

  113. Memory “1” “2” “3”“4” “5” “6” “7” RETAINED = []

    1.upto(100_000_000).each do |i| RETAINED << “#{i}” end
  114. Out of slots! Allocate more memory

  115. Memory “1” “2” “3”“4” “5” “6” “7” “10” “8” “9”

    “11” “12” “13” “14”
  116. Allocated Objects 1)

  117. 1.upto(100_000_000).each do |i| puts "#{i}" end

  118. 21.8 MB

  119. Memory “1” “2” “3”“4” “5” “6” “7”

  120. Out of slots! Run GC!

  121. Memory “2” “3”“4” “5” “6” “7” 1.upto(100_000_000).each do |i| puts

    "#{i}" end “1”
  122. Memory “2” “3”“4” “5” “6” “7” 1.upto(100_000_000).each do |i| puts

    "#{i}" end
  123. Memory “8” “3”“4” “5” “6” “7” 1.upto(100_000_000).each do |i| puts

    "#{i}" end “2”
  124. GC will save us!

  125. def make_objects array = [] 1.upto(100_000_000).each do |i| array <<

    "#{i}" end return nil end make_objects GC.start(full_mark: true, immediate_sweep: true)
  126. def make_objects array = [] 1.upto(100_000_000).each do |i| array <<

    "#{i}" end return nil end make_objects GC.start(full_mark: true, immediate_sweep: true)
  127. def make_objects array = [] 1.upto(100_000_000).each do |i| array <<

    "#{i}" end return nil end make_objects GC.start(full_mark: true, immediate_sweep: true)
  128. Result?

  129. 7.32 GB

  130. WAT

  131. GC.start(full_mark: true, immediate_sweep: true) GC.disable # Code goes here before

    = GC.stat[:total_freed_objects] GC.enable GC.start(full_mark: true, immediate_sweep: true) after = GC.stat[:total_freed_objects] # => “Objects cleared by GC: 100_010_687”
  132. Cannot open slots while reference exists

  133. Memory “1” “2” “3”“4” “5” “6” “7” “10” “8” “9”

    “11” “12” “13” “14”
  134. Run GC

  135. Memory “1” “2” “3”“4” “5” “6” “7” “10” “8” “9”

    “11” “12” “13” “14”
  136. Ruby NEVER releases memory

  137. Memory

  138. Test Everything (even these slides)

  139. None
  140. How does this affect my Rails app?

  141. Retained App

  142. Retained Allocated App

  143. Mem Usage App

  144. Request Starts

  145. Request Full .

  146. Request Full Full ..

  147. Request Full Full Full ...

  148. Request Full Full Full Full ....

  149. Request Full Full Full Full Full ......

  150. Out of memory Full Full Full Full Full

  151. Run GC, All objs needed 0_o Full Full Full Full

    Full
  152. Allocate mem Full Full Full Full Full

  153. Request Done Full Full Full Full Full ........

  154. Eventually

  155. GC Objs not used Full Full Full Full Full

  156. GC Objs not used

  157. Memory increases even without object retention

  158. Object creation/ deletion takes RAM

  159. Object creation/ deletion takes Time

  160. Further Reading:

  161. Generational GC

  162. I want to share a customer story

  163. One day a ticket came in

  164. My app is slow

  165. I checked metrics

  166. None
  167. They were using too much RAM and hitting swap

  168. swap

  169. Why?

  170. “didn’t used to happen”

  171. “we didn’t change anything”

  172. Hypothesis #1

  173. RAM increase due to misbehaving code in a Gem

  174. How do we test?

  175. Boot app, hit with requests, profile memory

  176. Derailed Benchmarks

  177. Boot the app

  178. Process a request

  179. Benchmark with Memory Profiler

  180. $ rake perf:mem

  181. I did this in codetriage.com sent instructions to customer

  182. allocated memory by gem ----------------------------------- rack-1.5.2 x 6491 actionpack-4.1.2 x

    4888 hashie-3.3.1 x 4735 <============== activesupport-4.1.2 x 3292 omniauth-1.2.2 x 1387 actionview-4.1.2 x 1107 ruby-2.1.2/lib x 1097 railties-4.1.2 x 925 activerecord-4.1.2 x 440 warden-1.2.3 x 200 codetriage-test-app/app x 40 other x 40
  183. Hashie is incredibly expensive

  184. Lots of un- needed objects

  185. Omniauth Uses Hashie

  186. Mitigated in Omniauth #774

  187. App still over RAM limit

  188. Hypothesis #2

  189. Bad gems using too much RAM at require time

  190. How do we test?

  191. Customer sends me their Gemfile & Gemfile.lock

  192. There’s no tooling for benchmarking require memory use

  193. Let’s write one!

  194. Concept: measure RAM before and after require

  195. Use GetProcessMem.new to measure RSS

  196. Resident Set Size

  197. A close-ish approximation for memory use

  198. Doesn’t take into account shared memory

  199. Monkey Patch Kernel.require

  200. None
  201. active_record/connection_adapters/postgresql_adapter: 2.2734375 active_record/connection_adapters/postgresql_adapter: 2.2734 mb active_record/connection_adapters/postgresql/oid: 0.8555 mb will_paginate/active_record: 1.4921875

    will_paginate/active_record: 1.4922 mb active_support/all: 1.01171875 active_support/all: 1.0117 mb active_support/core_ext: 0.9727 mb rack/test: 0.875 rack/test: 0.875 mb /Users/richardschneeman/Documents/projects/codetriage/app/models/user: 0.734375 /Users/richardschneeman/Documents/projects/codetriage/app/models/user: 0.7344 mb /Users/richardschneeman/Documents/projects/codetriage/app/models/repo: 0.50390625 /Users/richardschneeman/Documents/projects/codetriage/app/models/repo: 0.5039 mb new_relic/control/frameworks/rails4: 0.42578125 new_relic/control/frameworks/rails4: 0.4258 mb new_relic/control/frameworks/rails3: 0.3867 mb benchmark/ips: 0.4140625 benchmark/ips: 0.4141 mb will_paginate/view_helpers/action_view: 0.32421875 will_paginate/view_helpers/action_view: 0.3242 mb /Users/richardschneeman/Documents/projects/codetriage/app/models/issue: 0.31640625 /Users/richardschneeman/Documents/projects/codetriage/app/models/issue: 0.3164 mb /Users/richardschneeman/Documents/projects/codetriage/app/models/repo_subscription: 0.2734375 /Users/richardschneeman/.gem/ruby/2.2.1/gems/devise-3.4.1/app/mailers/devise/mailer: 0.265625 sprockets/base: 0.23046875 active_support/cache/dalli_store: 0.22265625 /Users/richardschneeman/Documents/projects/codetriage/app/models/issue_assignment: 0.18359375 /Users/richardschneeman/Documents/projects/codetriage/app/controllers/users/after_signup_controller: 0.18359375 /Users/richardschneeman/.gem/ruby/2.2.1/gems/wicked-1.0.0/app/controllers/wicked/wizard_controller: 0.1796875 /Users/richardschneeman/Documents/projects/codetriage/app/mailers/user_mailer: 0.1796875 /Users/richardschneeman/.gem/ruby/2.2.1/gems/devise-3.4.1/app/controllers/devise/ confirmations_controller: 0.17578125 /Users/richardschneeman/Documents/projects/codetriage/app/controllers/repos_controller: 0.17578125
  202. active_record/connection_adapters/postgresql_adapter: 2.2734375 active_record/connection_adapters/postgresql_adapter: 2.2734 mb active_record/connection_adapters/postgresql/oid: 0.8555 mb will_paginate/active_record: 1.4921875

    will_paginate/active_record: 1.4922 mb active_support/all: 1.01171875 active_support/all: 1.0117 mb active_support/core_ext: 0.9727 mb rack/test: 0.875 rack/test: 0.875 mb /Users/richardschneeman/Documents/projects/codetriage/app/models/user: 0.734375 /Users/richardschneeman/Documents/projects/codetriage/app/models/user: 0.7344 mb /Users/richardschneeman/Documents/projects/codetriage/app/models/repo: 0.50390625 /Users/richardschneeman/Documents/projects/codetriage/app/models/repo: 0.5039 mb new_relic/control/frameworks/rails4: 0.42578125 new_relic/control/frameworks/rails4: 0.4258 mb new_relic/control/frameworks/rails3: 0.3867 mb benchmark/ips: 0.4140625 benchmark/ips: 0.4141 mb will_paginate/view_helpers/action_view: 0.32421875 will_paginate/view_helpers/action_view: 0.3242 mb /Users/richardschneeman/Documents/projects/codetriage/app/models/issue: 0.31640625 /Users/richardschneeman/Documents/projects/codetriage/app/models/issue: 0.3164 mb /Users/richardschneeman/Documents/projects/codetriage/app/models/repo_subscription: 0.2734375 /Users/richardschneeman/.gem/ruby/2.2.1/gems/devise-3.4.1/app/mailers/devise/mailer: 0.265625 sprockets/base: 0.23046875 active_support/cache/dalli_store: 0.22265625 /Users/richardschneeman/Documents/projects/codetriage/app/models/issue_assignment: 0.18359375 /Users/richardschneeman/Documents/projects/codetriage/app/controllers/users/after_signup_controller: 0.18359375 /Users/richardschneeman/.gem/ruby/2.2.1/gems/wicked-1.0.0/app/controllers/wicked/wizard_controller: 0.1796875 /Users/richardschneeman/Documents/projects/codetriage/app/mailers/user_mailer: 0.1796875 /Users/richardschneeman/.gem/ruby/2.2.1/gems/devise-3.4.1/app/controllers/devise/ confirmations_controller: 0.17578125 /Users/richardschneeman/Documents/projects/codetriage/app/controllers/repos_controller: 0.17578125
  203. Now we know what is using memory!

  204. mail: 39.2734 mb

  205. 1 Gem accounts for 65% of total RAM at boot

  206. Remember: We haven’t even done any work

  207. Dig deeper

  208. Requiring a gem requires other files.

  209. We need more visibility

  210. We need a tree

  211. Each node can have many children

  212. Each child has a cost

  213. Mail mime/types mime/type mime json mail/message YAML net/smtp digest/ md5

    timeout net/ protocol
  214. None
  215. Sort the children by cost & print recursively

  216. None
  217. Final Code

  218. None
  219. None
  220. None
  221. None
  222. None
  223. None
  224. None
  225. None
  226. None
  227. Final Result

  228. application: 60.8242 mb mail: 39.2734 mb mail/parsers: 19.2461 mb mail/parsers/ragel:

    18.7031 mb mail/parsers/ragel/ruby: 18.6797 mb mail/parsers/ragel/ruby/machines/address_lists_machine: 7.2734 mb mail/parsers/ragel/ruby/machines/received_machine: 4.7578 mb mail/parsers/ragel/ruby/machines/envelope_from_machine: 2.2305 mb mail/parsers/ragel/ruby/machines/message_ids_machine: 1.5625 mb mail/parsers/ragel/ruby/machines/date_time_machine: 0.5977 mb mail/parsers/ragel/ruby/machines/content_disposition_machine: 0.4961 mb mail/parsers/ragel/ruby/machines/content_type_machine: 0.4648 mb mail/parsers/ragel/ruby/machines/content_location_machine: 0.3359 mb mail/parsers/ragel/ruby/machines/content_transfer_encoding_machine: 0.3281 mb mail/parsers/ragel/ruby/machines/phrase_lists_machine: 0.3086 mb mime/types: 18.4922 mb mail/field: 0.3125 mb
  229. application: 60.8242 mb mail: 39.2734 mb mail/parsers: 19.2461 mb mail/parsers/ragel:

    18.7031 mb mail/parsers/ragel/ruby: 18.6797 mb mail/parsers/ragel/ruby/machines/address_lists_machine: 7.2734 mb mail/parsers/ragel/ruby/machines/received_machine: 4.7578 mb mail/parsers/ragel/ruby/machines/envelope_from_machine: 2.2305 mb mail/parsers/ragel/ruby/machines/message_ids_machine: 1.5625 mb mail/parsers/ragel/ruby/machines/date_time_machine: 0.5977 mb mail/parsers/ragel/ruby/machines/content_disposition_machine: 0.4961 mb mail/parsers/ragel/ruby/machines/content_type_machine: 0.4648 mb mail/parsers/ragel/ruby/machines/content_location_machine: 0.3359 mb mail/parsers/ragel/ruby/machines/content_transfer_encoding_machine: 0.3281 mb mail/parsers/ragel/ruby/machines/phrase_lists_machine: 0.3086 mb mime/types: 18.4922 mb mail/field: 0.3125 mb
  230. Opened an Issue

  231. Fixed in mikel/mail #817

  232. Customer’s App RAM use drops dramatically

  233. No longer Swapping

  234. Went from

  235. None
  236. To

  237. None
  238. Reproduce the slowness bug

  239. Get visibility with benchmarks

  240. Use Science!

  241. Have you ever made a bad deploy on a Friday?

  242. You try to fix it but…

  243. None
  244. Your weekend

  245. Co-workers weekend

  246. Support’s weekend

  247. Isn’t there a better way?

  248. My proposal:

  249. Don’t Deploy on
 Friday

  250. Instead

  251. Funday Friday

  252. Is your app slow?

  253. Funday Friday

  254. Is your app disorganized ???

  255. Funday Friday

  256. Do you wish you had more time to refactor?

  257. Funday Friday

  258. Wish <company> sponsored Open Source?

  259. Guess what?

  260. Your company just sponsored you!

  261. Funday Friday

  262. Not sold?

  263. More accurate deadlines

  264. Less burnout

  265. Does your boss want to hire more senior developers?

  266. Like the kind that refactors, makes speed improvements, and knows

    library internals?
  267. Train up senior developers instead!

  268. Funday Friday Funday Friday

  269. Report on your progress to your team

  270. Report to your boss

  271. Report to twitter #fundayfriday

  272. Get ❤️ ❤️ ❤️

  273. Give ❤️ ❤️ ❤️

  274. Did science build the pyramids?

  275. Coincidence?

  276. None