"Ruby's Instrumentation Crisis" - NYC.rb 10/09/2012

Ruby’s Instrumentation Crisis Michael R. Bernstein NYC.rb 10/9/2012 @mrb_bk github.com/mrb
Wednesday, October 10, 12 - My name is Mike Bernstein - This is “Ruby’s Instrumentation Crisis” - Happy to be here at NYC.rb, learned from a lot of people in this room - I’m kind of “workshopping” this talk - some sections might seem a bit under-represented - I’m giving this talk at Rubyconf and would love to hear your thoughts

About Me Wednesday, October 10, 12 - My first exposure
to Ruby was around 2005 - I was a Comp-Sci teacher looking for a blogging platform and stumbled upon DHH's Rails blog demo - Since then I’ve been using Ruby professionally for around 6 years - I love Ruby! I talk a lot of shit about it, but I love it

Wednesday, October 10, 12 - I work at Paperless Post
- I hack on Ruby there amongst other things, get to work on some hard problems - We’re awesome, and hiring

Raise Your Hands Wednesday, October 10, 12 - I just
want to get a feel for your experience a little bit - How many of you have instrumented VMs or profiled code in any language? - How many of you have profiled Ruby code? - How many of you have tried to instrument the Ruby VM? DTrace, SystemTap, etc. - How many of you have used a non-MRI Ruby? - How many of you deploy Ruby apps to Linux? Which? - Okay, on with it then!

This Talk Is Inspired By Wednesday, October 10, 12 -
Large Programs

LARGE PROGRAMS Wednesday, October 10, 12 - What’s happening
under the hood? - For me, the first large programs I needed to profile were production Rails apps - I had experience optimizing code from my time doing graphics and sound programming in grad school - I’ve had a taste of that kind of programming attitude - But that’s not what this talk is about

This talk is (mostly) about Ruby Wednesday, October 10, 12
- This talk is mostly about Ruby - A few different Rubies actually - I’m going to talk about a lot of different stuff, but basically I want people to think of Ruby more like

Wednesday, October 10, 12 - Instead of like - Back
and forth a bit

Wednesday, October 10, 12 - And with that

What is Instrumentation? Wednesday, October 10, 12 - For the
purposes of this talk, let’s define our terms - I’ll use instrumentation to refer to the tools used to collect data about running programs - “Instrumentation” technically refers to one of a few different means by which software developers have gathered information about their programs over the years; I’ll talk about the history of instrumentation in a minute.

Where did it come from? Wednesday, October 10, 12 -
What are the origins of profiling and instrumentation?

Around the same time computers got fast, programs got large.
Wednesday, October 10, 12

Around the same time programs got large, we started to
profile them. Wednesday, October 10, 12 - From this you can infer that...

Around the same time computers got fast, programs got slow.
Wednesday, October 10, 12

A brief history, from profiling to instrumentation. Wednesday, October 10,
12 - It started out with counting executions of certain low-level instructions on the earliest computers - profiling - In the late 70s the Unix tool prof attempted to make profiling more convenient - profiling - In 1982 the gprof paper introduced full call graph analysis, still profiling - 90s, ATOM, instrumentation - 2004, DTrace introduced

Why is it important? Why did you use the word
‘crisis?’ Wednesday, October 10, 12 - So why is it important to instrument code? - Why am I trying to scare people with the word “crisis?” - I’ve been thinking about this for a long time, and after I submitted this talk to RubyConf and it was accepted, I went to Strange Loop and saw this

Title Wednesday, October 10, 12 - I know you can’t
read this or really see it, I’m a terrible photographer - This is Lars Bak, who currently works at Google on V8 and Dart - Historically worked on many VMs, including Smalltalk VMs and HotSpot - A veteran of optimizing OO VMs - Here he’s talking about how measuring is how you “Go a lot faster” - My point is

Title You can trust Lars Bak Wednesday, October 10, 12

Measuring Things: It’s Important Wednesday, October 10, 12 - How
I like to state it is: MEASURING THINGS IS IMPORTANT

Wednesday, October 10, 12 - That’s what this guy would
say

So what are we measuring and how do we measure
it? Wednesday, October 10, 12 - On a lower level we’re measuring how our program interacts with the underlying system - How much memory is it allocating? - How dependent on I/O is it? - On a higher level we’re measuring what Ruby code is being called, and where
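The higher-level question (what Ruby code is being called, and where) can be sketched with MRI's built-in TracePoint API. This is only a rough illustration, assuming Ruby 2.0 or later; `count_calls` and `Greeter` are made-up names for the example, not part of the talk or of any library.

```ruby
# Count Ruby-level method calls made while a block runs.
# Assumes Ruby 2.0+ (TracePoint). count_calls and Greeter are illustrative names.
def count_calls
  counts = Hash.new(0)
  trace = TracePoint.new(:call) do |tp|
    # :call fires for methods defined in Ruby, not for C functions
    counts["#{tp.defined_class}##{tp.method_id}"] += 1
  end
  trace.enable
  begin
    yield
  ensure
    trace.disable # always stop tracing, even if the block raises
  end
  counts
end

class Greeter
  def greet(name)
    format_name(name)
  end

  def format_name(name)
    "Hello, #{name}"
  end
end

calls = count_calls do
  greeter = Greeter.new
  greeter.greet("NYC.rb")
  greeter.greet("RubyConf")
end

calls.each { |method_name, n| puts "#{method_name}: #{n}" }
```

Tracing every call like this slows the program down noticeably, which is one reason the ability to switch it on and off matters.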

Ruby VM OS Disk RAM Wednesday, October 10, 12 -
So what are we measuring in Ruby land? - Here’s a vast oversimplification of what we’re dealing with - You have your Ruby VM, running on top of your OS, which handles access to your disks and to your memory

Ruby VM OS Disk RAM Heap Wednesday, October 10, 12
- Because that’s just a little too simple, remember that there’s also a chunk of memory that is managed by the Ruby VM directly - Ruby grabs memory from the OS in large chunks, because asking the OS for memory is an expensive operation - Acquiring memory in this way also has its downsides, but that is a little deeper than I want to go right now - These chunks form a segment of memory called the “Heap,” and it is where your Garbage Collectors do their work

Ruby VM OS Disk RAM Heap GC Wednesday, October 10,
12 - Let’s extend it just a little more to show the Garbage Collector - It’s not actually a separate process - Different Rubies have different implementations - We want to know what’s inside that Heap, because that’s what makes our programs slow - What’s inside that heap and when our GC runs boils down to when certain internal VM calls are being made - If we had access to these events, we’d have better insight into our VM
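MRI already exposes a few coarse counters about the heap and the GC. The sketch below reads them before and after a burst of allocations. It is a minimal example under assumptions: `heap_snapshot` is a hypothetical helper, and the `:total_allocated_objects` key name assumes Ruby 2.2 or later (older versions used different key names).

```ruby
# Peek at MRI's heap and GC counters before and after allocating objects.
# heap_snapshot is an illustrative helper; the GC.stat key assumes Ruby 2.2+.
def heap_snapshot
  {
    gc_runs: GC.count,                             # collections so far
    allocated: GC.stat[:total_allocated_objects],  # objects allocated since boot
    slots: ObjectSpace.count_objects[:TOTAL]       # heap slots managed by the VM
  }
end

before = heap_snapshot
strings = Array.new(50_000) { |i| "object #{i}" }  # force a burst of allocations
GC.start                                           # request a full collection
after = heap_snapshot

# Subtract the earlier reading from the later one, key by key.
delta = after.merge(before) { |_key, now, earlier| now - earlier }
puts "kept #{strings.size} strings alive"
puts delta.inspect
```

These numbers say how much happened, not where or why, which is the gap that access to internal VM events would fill.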

Some Guidelines for measuring tools Wednesday, October 10, 12 -
Now that we’ve thought about what aspects of Ruby we’d like to measure, let’s think about some guidelines for tools for instrumentation - Next slide is DTrace

Title Wednesday, October 10, 12 - A slide from the
first “real” Sun internal presentation on DTrace - Lays out the attributes a modern tracing framework must have - DTrace accomplishes these by being a system that is deeply integrated with its host kernel - Works on Solaris and BSD, but not on the Linux systems that most of us deploy to

State of the art: DTrace Wednesday, October 10, 12 -
The previous list of attributes was paraphrased from the research that went into creating DTrace - Let’s look at a slide from the first official internal DTrace presentation at Sun

Proﬁling in other languages Wednesday, October 10, 12 - So
how do other languages do it? - Let’s take a look at Smalltalk, Java, Erlang, and a few more

For the Spirit: Smalltalk Wednesday, October 10, 12 - Playful
- Execute everywhere - “Change the GC algorithm!”

Wednesday, October 10, 12 - A Smalltalk GUI that you
alter while using it - The kind of spirit I want people to bring to Ruby’s VM

For the Strength: Java Wednesday, October 10, 12 - Industrial
strength tools - Available and accessible - JVM Hackers - Javaists, Clojurians, JRubyists

Wednesday, October 10, 12 - YourKit - Recommended by David
Nolen, who works on core.logic, an insanely amazing library for Clojure, a JVM language - Will work on any JVM process out of the box - Started up elasticsearch, was rolling in seconds, can see Memory, Threads, GC, etc. - Inspiring, and only one of many tools.

For the tools: Objective-C Wednesday, October 10, 12 - DTrace
is integrated into OS X - Instruments - A culture of measurement - Hardware hacking attitudes

Wednesday, October 10, 12 - My friend was like “oh
let me instrument Rdio”

Wednesday, October 10, 12 - “Damn it’s context switching like
crazy”

For the hell of it: Erlang, JS, Go Wednesday, October
10, 12 - Erlang - semantics for process control built into language, OTP framework is awesome, more performance aware as a community - JS - large JS applications running on V8 can be profiled pretty well - JSPerf and microbenches are not all bad, JavaScript programmers are thinking about performance! Node contributes to this as well. - Go - a modern runtime that supports pretty sophisticated insight into process internals, goroutines can report deadlocks, etc.

And if you’re a Rubyist, you might not do it
at all. Wednesday, October 10, 12 - This is a central point I want to address in this talk. - It’s time to engender a community of measuring - Let’s be measurers and shippers at the same time

Wednesday, October 10, 12 - Presented without comment

Profiling Ruby Wednesday, October 10, 12 - Let’s check into
the state of profiling in Ruby land - Many different Ruby implementations - JRuby, Rubinius, MagLev, MRI

Leverage the JVM: JRuby Wednesday, October 10, 12 - You
can tell from the Java section that I’m jealous of their skills - Leverage the JVM’s tools - Access to JVM profiling points from Ruby code - If you’re willing to accept JVM operations, it’s a great option

On the shoulders: Rubinius Wednesday, October 10, 12 - Built
from scratch - I’m a huge fan of this project and its authors - The Agent fulfills most if not all of the ideals that this talk has for Ruby VMs - You can read information about the generational GC, JIT, threads, and more

Read the manual: MagLev Wednesday, October 10, 12 - Built
on a Smalltalk VM - A lot of unexplored territory - A lot of lessons we haven’t learned - MagLev gives you access to GemStone/S instrumentation points in Ruby

Stranded? MRI Wednesday, October 10, 12 - 1.8.7 had memprof
- Not many well-maintained tools for 1.9.2 - A struggle to even get DTrace probes into Ruby 2.0 - Are we stranded?

Stranded? I think not. Wednesday, October 10, 12 - This
guy and I both agree that we’re not stranded. - We just need to regroup, and orient our community toward ideals for performance and measurement - We can contribute to Rubinius, fork MRI if we need to, liberally apply JRuby and MagLev where necessary, and ship it. - One thing we should be doing is considering what our ideal for profiling would be

An ideal for profiling Wednesday, October 10, 12 - What
would the perfect Ruby instrumentation and profiling tools look like? - DTrace is great but doesn’t cover a majority of Ruby deploys - Let’s get inside the VM

Must work in development AND in production. Wednesday, October 10,
12 - And staging, and CI, etc. - A tool during implementation and a tool for real diagnostics

What information do we want to see? Wednesday, October 10,
12 - Quick glance stats - # of Objects by Class - Ability to turn potentially impactful profiling “on” and “off” - Object Allocation, by line - How much memory are classes using - GC Information - timing, # of runs - Thread information - “See” is the key word here - Like a Boundary for the Ruby VM
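A few items on this wish list can be approximated today with MRI built-ins. The following is a rough sketch, not a real tool: `vm_glance` is a hypothetical name, and it only covers object counts by class, GC runs and timing, and thread status, using `ObjectSpace`, `GC::Profiler`, and `Thread.list`.

```ruby
# A rough "quick glance" report using only MRI built-ins.
# vm_glance is an illustrative name, not an existing API.
def vm_glance(top: 5)
  GC::Profiler.enable  # turn potentially costly profiling "on"
  yield
  GC.start             # make sure at least one GC run is recorded

  # Walk every live object and tally it under its class.
  by_class = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| by_class[obj.class] += 1 }

  {
    objects_by_class: by_class.sort_by { |_klass, n| -n }.first(top),
    gc_runs: GC.count,
    gc_time_seconds: GC::Profiler.total_time,
    threads: Thread.list.map(&:status)
  }
ensure
  GC::Profiler.disable # ...and "off" again
  GC::Profiler.clear
end

report = vm_glance { Array.new(10_000) { |i| "row #{i}" } }
report[:objects_by_class].each { |klass, n| puts "#{klass}: #{n}" }
puts "GC runs: #{report[:gc_runs]}, GC time: #{report[:gc_time_seconds]}s"
puts "Threads: #{report[:threads].inspect}"
```

Walking the whole heap like this is expensive and pauses the program, so it illustrates the kind of data wanted rather than something to leave running in production.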

Ruby’s instrumentation future Wednesday, October 10, 12 - What can
we learn from DTrace? - Emphasize performance as much as clarity and testability - Very important - Hack MRI! Hack Rubinius! Hack JRuby! Hack MagLev! - Read papers - Let’s not be afraid of the code, let’s embrace it. - If we can’t navigate the politics, we can cement our own tools

Let’s steal everyone’s good ideas! Wednesday, October 10, 12 -
from Objective-C and OS X - great looking tooling - from Java - Standardized profiling points - from Erlang - community enforced standards for performance - and on and on

...and? Wednesday, October 10, 12 - Ship it!

:shipit: Wednesday, October 10, 12 - What really matters -
What you’ll be remembered by - How you can show that you care about Ruby

Credits: Ruby Drawings by Maya Miller Wednesday, October 10, 12

Credits: Awesome help from Aman Gupta, Patrick Thomson, Jesse Cooke,
Evan Phoenix, Brian Ford, Elise Huard, Bryan Cantrill, Sean Cribbs, Brian Mitchell, Charles Nutter, Tony Arcieri Wednesday, October 10, 12

References: https://gist.github.com/3837455 Wednesday, October 10, 12

Thanks Wednesday, October 10, 12
