How to Build a Smart Profiler for Rails

Customers love fast apps. We've got tools to help us figure out how our Rails apps are performing on our development machines, but that's not what matters. It's critical that we also be able to measure how our apps are actually performing in the production environment.

That's the reason we built Skylight, and in this talk, we'll show you how we did it. We'll discuss the underlying principles Skylight uses to profile your apps across multiple production servers, and take a look at our tech stack, including Storm, Kafka, Cassandra, Rails, and Ember.js.

Yehuda Katz

April 23, 2014

Transcript

  1. 1.
  2. 2.
  3. 4.

    $$$

  4. 6.

    What if we could take advantage of our deep knowledge of Rails and pair it with our expertise in building computation systems at scale? You can build something that no one else has done before. We didn't just want to build something a little better than our competitors. We wanted to build something that was light years ahead.
  5. 7.

    3 BREAKTHROUGHS As we were building Skylight, there were three

    critical breakthroughs that allowed us to deliver an application that was a quantum leap past existing solutions.
  6. 9.
  7. 10.

    “Our average response time for Basecamp right now is 87ms…

    That sounds fantastic, doesn’t it? And it easily leads you to believe that all is well and that we wouldn’t need to spend any more time optimizing performance.”
  8. 11.

    “Wrong. That average number is completely skewed by tons of

    super fast responses to feed requests and other cached replies. If you have 1000 requests that return in 5ms, then you can have 200 requests taking 2,000 ms and still get a respectable 170ms average. Useless.”
  9. 13.
  10. 16.

    95TH PERCENTILE This is the number that matters, because it

    doesn't capture the average response time. It's more like the average worst response time.
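To make the gap between the average and the 95th percentile concrete, here is a minimal Ruby sketch. The sample data and the helper names (`mean`, `percentile`) are hypothetical, and the nearest-rank method is only one common way to define a percentile; the slides do not say which definition Skylight uses.

```ruby
# Hypothetical sample: 90 fast responses and 10 slow ones (milliseconds).
response_times = Array.new(90, 5) + Array.new(10, 2_000)

# Arithmetic mean of the samples.
def mean(samples)
  samples.sum.fdiv(samples.size)
end

# Nearest-rank percentile: the smallest sample such that at least
# `pct` percent of all samples are less than or equal to it.
def percentile(samples, pct)
  sorted = samples.sort
  rank = (pct * sorted.size).fdiv(100).ceil
  sorted[[rank, 1].max - 1]
end

puts "mean:   #{mean(response_times)} ms"
puts "median: #{percentile(response_times, 50)} ms"
puts "p95:    #{percentile(response_times, 95)} ms"
```

In this made-up data set the mean lands between the two groups and describes neither of them, while the 95th percentile reports the slow responses that one in ten requests actually sees.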
  11. 17.

    1 USER ≠ 1 REQUEST Users generate many requests. Having

    one out of every 20 take multiple seconds is an opportunity for them to leave.
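A rough way to see why one slow request in 20 matters: if each request is slow with probability 5%, the chance that a user hits at least one slow response grows quickly with the number of requests they make. The sketch below assumes requests are independent, which is a simplification, and `chance_of_slow_request` is a hypothetical helper name.

```ruby
# Chance that a user who makes `requests` requests sees at least one slow
# response, assuming each request is independently slow with probability
# `slow_rate` (a simplifying assumption, not a claim from the talk).
def chance_of_slow_request(requests, slow_rate = 0.05)
  1.0 - (1.0 - slow_rate)**requests
end

[1, 5, 10, 20].each do |n|
  pct = chance_of_slow_request(n) * 100
  puts format("%2d requests: %.0f%% chance of at least one slow response", n, pct)
end
```

Under that assumption, a user who makes 20 requests has roughly a 64% chance of waiting on at least one slow response.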
  12. 24.

    ENDPOINT: UsersController#show, REQUESTS: 2934, RESP: 203. EACH REQUEST: UPDATE response_times SET resp=((resp*requests)+new_resp)/(requests+1), requests=requests+1. Calculating the average is super easy. That's why everyone does it.
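The running-average update on this slide can be sketched in plain Ruby. `EndpointStats` is a hypothetical in-memory stand-in for one row of the `response_times` table, not code from Skylight.

```ruby
# Hypothetical in-memory stand-in for one row of the response_times table.
EndpointStats = Struct.new(:endpoint, :requests, :resp) do
  # Fold one new response time (ms) into the running mean using only the
  # previous mean and the previous request count.
  def record(new_resp)
    self.resp = ((resp * requests) + new_resp).fdiv(requests + 1)
    self.requests += 1
    self
  end
end

stats = EndpointStats.new("UsersController#show", 0, 0.0)
[100, 200, 300].each { |ms| stats.record(ms) }
puts "#{stats.endpoint}: #{stats.requests} requests, #{stats.resp} ms average"
```

The update needs only two stored numbers per endpoint, which is what makes the average so cheap. A percentile cannot be maintained from a single previous value and a count in the same way, because it depends on the distribution of the samples.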
  13. 33.

    CUTTING EDGE ALGORITHMS & DATA STRUCTURES Carl spent nearly a

    year reading the latest academic literature.
  14. 36.
  15. 38.
  16. 39.

    LIGHTNING FAST UI 3 We stream as much data as

    possible into your browser, so sorting and sifting through data is incredibly fast.
  17. 41.
  18. 42.
  19. 43.
  20. 49.
  21. 50.
  22. 52.

    CPU PROFILING Uses Ruby 2.1's new static sampling API so it's really fast. We can sample the stack without doing any allocations, keeping it super lightweight and low-impact on memory.