Slide 1

Slide 1 text

Ruby’s Instrumentation Crisis Michael R. Bernstein NYC.rb 10/9/2012 @mrb_bk github.com/mrb Wednesday, October 10, 12 - My name is Mike Bernstein - This is “Ruby’s Instrumentation Crisis” - Happy to be here at NYC.rb, learned from a lot of people in this room - I’m kind of “workshopping” this talk, so some sections might seem a bit under-represented - I’m giving this talk at RubyConf and would love to hear your thoughts

Slide 2

Slide 2 text

About Me - My first exposure to Ruby was around 2005 - I was a computer science teacher looking for a blogging platform, and stumbled upon DHH’s Rails blog demo - Since then I’ve been using Ruby professionally for around 6 years - I love Ruby! I talk a lot of shit about it, but I love it

Slide 3

Slide 3 text

I work at Paperless Post - I hack on Ruby there amongst other things, get to work on some hard problems - We’re awesome, and hiring

Slide 4

Slide 4 text

Raise Your Hands - I just want to get a feel for your experience a little bit - How many of you have instrumented VMs or profiled code in any language? - How many of you have profiled Ruby code? - How many of you have tried to instrument the Ruby VM? DTrace, SystemTap, etc. - How many of you have used a non-MRI Ruby? - How many of you deploy Ruby apps to Linux? Which? - Okay, on with it then!

Slide 5

Slide 5 text

This Talk Is Inspired By - Large Programs

Slide 6

Slide 6 text

LARGE PROGRAMS - What’s happening under the hood? - For me, the first large programs I needed to profile were production Rails apps - I had experience optimizing code from my time doing graphics and sound programming in grad school - I’ve had a taste of that kind of programming attitude - But that’s not what this talk is about

Slide 7

Slide 7 text

This talk is (mostly) about Ruby - This talk is mostly about Ruby - A few different Rubies actually - I’m going to talk about a lot of different stuff, but basically I want people to think of Ruby more like

Slide 8

Slide 8 text

Instead of like - Back and forth a bit

Slide 9

Slide 9 text

And with that

Slide 10

Slide 10 text

What is Instrumentation? - For the purposes of this talk, let’s define our terms - I’ll use “instrumentation” to refer to the tools used to collect data about running programs - “Instrumentation” technically refers to one of a few different means by which software developers have gathered information about their programs over the years; I’ll talk about the history of instrumentation in a minute.

Slide 11

Slide 11 text

Where did it come from? - What are the origins of profiling and instrumentation?

Slide 12

Slide 12 text

Around the same time computers got fast, programs got large.

Slide 13

Slide 13 text

Around the same time programs got large, we started to profile them. - From this you can infer that...

Slide 14

Slide 14 text

Around the same time computers got fast, programs got slow.

Slide 15

Slide 15 text

A brief history, from profiling to instrumentation. - It started out with counting executions of certain low-level instructions on the earliest computers (profiling) - In the late 70s, the Unix tool prof attempted to make profiling more convenient (profiling) - In 1982, the gprof paper introduced full call graph analysis (still profiling) - In the 90s, ATOM (instrumentation) - In 2004, DTrace was introduced

Slide 16

Slide 16 text

Why is it important? Why did you use the word ‘crisis?’ - So why is it important to instrument code? - Why am I trying to scare people with the word “crisis?” - I’ve been thinking about this for a long time, and after I submitted this talk to RubyConf and it was accepted, I went to Strange Loop and saw this

Slide 17

Slide 17 text

I know you can’t read this or really see it, I’m a terrible photographer - This is Lars Bak, who currently works at Google on V8 and Dart - Historically he has worked on many VMs, including Smalltalk VMs and HotSpot - A veteran of optimizing OO VMs - Here he’s talking about how measuring is how you “Go a lot faster” - My point is

Slide 18

Slide 18 text

You can trust Lars Bak

Slide 19

Slide 19 text

Measuring Things: It’s Important - How I like to state it is: MEASURING THINGS IS IMPORTANT

Slide 20

Slide 20 text

That’s what this guy would say

Slide 21

Slide 21 text

So what are we measuring and how do we measure it? - On a lower level we’re measuring how our program interacts with the underlying system - How much memory is it allocating? - How dependent on I/O is it? - On a higher level we’re measuring what Ruby code is being called, and where
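
A rough sketch of the higher-level measurement ("what Ruby code is being called, and where"). It assumes Ruby 2.0 or later, since it relies on TracePoint, which shipped after this talk was given. The names count_calls and fib are hypothetical and exist only for illustration.

```ruby
# Count Ruby-level method calls made while the given block runs.
# Assumption: Ruby 2.0+ (TracePoint is not available on 1.9).
def count_calls
  counts = Hash.new(0)
  trace = TracePoint.new(:call) do |tp|
    # Key each call by class, method, and definition site.
    counts["#{tp.defined_class}##{tp.method_id} (#{tp.path}:#{tp.lineno})"] += 1
  end
  trace.enable { yield } # tracing is active only inside the block
  counts
end

# Hypothetical workload: a deliberately naive recursive method.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

calls = count_calls { fib(10) }
calls.each { |where, n| puts "#{n} calls: #{where}" }
```

Tracing every call adds real overhead, so a tool built this way would want to be something you switch on deliberately rather than leave running.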

Slide 22

Slide 22 text

Ruby VM OS Disk RAM - So what are we measuring in Ruby land? - Here’s a vast oversimplification of what we’re dealing with - You have your Ruby VM, running on top of your OS, which handles access to your disks and to your memory

Slide 23

Slide 23 text

Ruby VM OS Disk RAM Heap - Because that’s just a little too simple, remember that there’s also a chunk of memory that is managed by the Ruby VM directly - Ruby grabs memory from the OS in large chunks, because asking the OS for memory is an expensive operation - Acquiring memory in this way also has its downsides, but that is a little deeper than I want to go right now - These chunks form a segment of memory called the “Heap,” and it is where your Garbage Collectors do their work

Slide 24

Slide 24 text

Ruby VM OS Disk RAM Heap GC - Let’s extend it just a little more to show the Garbage Collector - It’s not actually a separate process - Different Rubies have different implementations - We want to know what’s inside that Heap, because that’s what makes our programs slow - What’s inside that heap and when our GC runs boils down to when certain internal VM calls are being made - If we had access to these events, we’d have better insight into our VM
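
A small sketch of how much of this MRI already exposes from Ruby code, assuming MRI 1.9 or later, where GC.count and ObjectSpace.count_objects are available (other Rubies may not implement them). The heap_snapshot helper is a hypothetical name, and this only shows before and after totals, not the individual VM events the notes above wish for.

```ruby
# Take a coarse snapshot of the heap and GC activity (MRI-specific).
def heap_snapshot
  {
    gc_runs: GC.count,                 # GC runs since the process started
    slots:   ObjectSpace.count_objects # heap slots by internal type, e.g. :T_STRING
  }
end

before = heap_snapshot
junk = Array.new(100_000) { |i| "string #{i}" } # allocate on purpose
after = heap_snapshot

string_growth  = after[:slots][:T_STRING] - before[:slots][:T_STRING]
gc_runs_during = after[:gc_runs] - before[:gc_runs]

puts "Allocated #{junk.size} strings"
puts "T_STRING slots changed by #{string_growth}"
puts "GC ran #{gc_runs_during} time(s) while allocating"
```

The slot counts include objects that are dead but not yet swept, so the numbers are a rough guide rather than an exact count of live objects.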

Slide 25

Slide 25 text

Some Guidelines for measuring tools - Now that we’ve thought about what aspects of Ruby we’d like to measure, let’s think about some guidelines for tools for instrumentation - Next slide is DTrace

Slide 26

Slide 26 text

A slide from the first “real” Sun internal presentation on DTrace - Lays out the attributes a modern tracing framework must have - DTrace accomplishes these by being a system that is deeply integrated with its host kernel - Works on Solaris and BSD, but not on the Linux systems that most of us deploy to

Slide 27

Slide 27 text

State of the art: DTrace - The previous list of attributes was paraphrased from the research that went into creating DTrace - Let’s look at a slide from the first official internal DTrace presentation at Sun

Slide 28

Slide 28 text

Profiling in other languages - So how do other languages do it? - Let’s take a look at Smalltalk, Java, Erlang, and a few more

Slide 29

Slide 29 text

For the Spirit: Smalltalk - Playful - Execute everywhere - “Change the GC algorithm!”

Slide 30

Slide 30 text

A Smalltalk GUI that you alter while using it - The kind of spirit I want people to bring to Ruby’s VM

Slide 31

Slide 31 text

For the Strength: Java - Industrial strength tools - Available and accessible - JVM Hackers - Javaists, Clojurians, JRubyists

Slide 32

Slide 32 text

YourKit - Recommended by David Nolen, who works on core.logic, an insanely amazing library for Clojure, a JVM language - Will work on any JVM process out of the box - Started up Elasticsearch, was rolling in seconds, can see Memory, Threads, GC, etc. - Inspiring, and only one of many tools.

Slide 33

Slide 33 text

For the tools: Objective-C - DTrace is integrated into OS X - Instruments - A culture of measurement - Hardware hacking attitudes

Slide 34

Slide 34 text

My friend was like “oh let me instrument Rdio”

Slide 35

Slide 35 text

“Damn it’s context switching like crazy”

Slide 36

Slide 36 text

For the hell of it: Erlang, JS, Go - Erlang: semantics for process control are built into the language, the OTP framework is awesome, and the community is more performance-aware - JS: large JS applications running on V8 can be profiled pretty well; jsPerf and microbenchmarks are not all bad, JavaScript programmers are thinking about performance! Node contributes to this as well. - Go: a modern runtime that supports pretty sophisticated insight into process internals, goroutines can report deadlocks, etc.

Slide 37

Slide 37 text

And if you’re a Rubyist, you might not do it at all. - This is a central point I want to address in this talk. - It’s time to engender a community of measuring - Let’s be measurers and shippers at the same time

Slide 38

Slide 38 text

Presented without comment

Slide 39

Slide 39 text

Profiling Ruby - Let’s check into the state of profiling in Ruby land - Many different Ruby implementations - JRuby, Rubinius, MagLev, MRI

Slide 40

Slide 40 text

Leverage the JVM: JRuby - You can tell from the Java section that I’m jealous of their skills - Leverage the JVM’s tools - Access to JVM profiling points from Ruby code - If you’re willing to accept JVM operations, it’s a great option

Slide 41

Slide 41 text

On the shoulders: Rubinius - Built from scratch - I’m a huge fan of this project and its authors - The Agent fulfills most if not all of the ideals that this talk has for Ruby VMs - You can read information about the generational GC, JIT, threads, and more

Slide 42

Slide 42 text

Read the manual: MagLev - Built on a Smalltalk VM - A lot of unexplored territory - A lot of lessons we haven’t learned - MagLev gives you access to GemStone/S instrumentation points in Ruby

Slide 43

Slide 43 text

Stranded? MRI - 1.8.7 had memprof - Not many well maintained tools for 1.9.2 - A struggle to even get DTrace probes into Ruby 2.0 - Are we stranded?

Slide 44

Slide 44 text

Stranded? I think not. - This guy and I both agree that we’re not stranded. - We just need to regroup, and orient our community toward ideals for performance and measurement - We can contribute to Rubinius, fork MRI if we need to, liberally apply JRuby and MagLev where necessary, and ship it. - One thing we should be doing is considering what our ideal for profiling would be

Slide 45

Slide 45 text

An ideal for profiling - What would the perfect Ruby instrumentation and profiling tools look like? - DTrace is great but doesn’t cover a majority of Ruby deploys - Let’s get inside the VM

Slide 46

Slide 46 text

Must work in development AND in production. - And staging, and CI, etc. - A tool during implementation and a tool for real diagnostics

Slide 47

Slide 47 text

What information do we want to see? - Quick glance stats - # of Objects by Class - Ability to turn potentially impactful profiling “on” and “off” - Object Allocation, by line - How much memory are classes using - GC Information - timing, # of runs - Thread information - See is the key word here - Like a Boundary for the Ruby VM
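
A few items on this wish list can be approximated with what MRI ships today. The sketch below assumes MRI 1.9 or later (ObjectSpace.each_object, GC::Profiler, Thread.list); objects_by_class and with_gc_profiling are hypothetical helper names, not an existing tool. It does not cover allocation by line or memory per class.

```ruby
# "# of Objects by Class": walk the heap and tally live objects.
# Walking the whole heap is slow, so this is a diagnostic, not a hot-path call.
def objects_by_class(limit = 5)
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts.sort_by { |_klass, n| -n }.first(limit)
end

# "GC Information" with an on/off switch: profile only around the block.
def with_gc_profiling
  GC::Profiler.clear
  GC::Profiler.enable
  runs_before = GC.count
  yield
  { gc_runs: GC.count - runs_before, gc_seconds: GC::Profiler.total_time }
ensure
  GC::Profiler.disable
  GC::Profiler.clear
end

report = with_gc_profiling do
  200_000.times.map { |i| "x#{i}" } # throwaway allocations
  GC.start                          # force at least one collection
end

top = objects_by_class(5)
top.each { |klass, n| puts "#{klass}: #{n}" }
puts "GC runs: #{report[:gc_runs]}, GC time: #{report[:gc_seconds]}s"
puts "Threads: #{Thread.list.size}"
```

This prints numbers rather than letting you see them, which is the gap the slide is pointing at: the raw data exists, the tooling around it mostly does not.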

Slide 48

Slide 48 text

Ruby’s instrumentation future - What can we learn from DTrace? - Emphasize performance as much as clarity and testability - Very important - Hack MRI! Hack Rubinius! Hack JRuby! Hack MagLev! - Read papers - Let’s not be afraid of the code, let’s embrace it. - If we can’t navigate the politics, we can cement our own tools

Slide 49

Slide 49 text

Let’s steal everyone’s good ideas! - From Objective-C and OS X: great looking tooling - From Java: standardized profiling points - From Erlang: community-enforced standards for performance - And on and on

Slide 50

Slide 50 text

...and? - Ship it!

Slide 51

Slide 51 text

:shipit: - What really matters - What you’ll be remembered by - How you can show that you care about Ruby

Slide 52

Slide 52 text

Credits: Ruby Drawings by Maya Miller

Slide 53

Slide 53 text

Credits: Awesome help from Aman Gupta, Patrick Thomson, Jesse Cooke, Evan Phoenix, Brian Ford, Elise Huard, Bryan Cantrill, Sean Cribbs, Brian Mitchell, Charles Nutter, Tony Arcieri

Slide 54

Slide 54 text

References: https://gist.github.com/3837455

Slide 55

Slide 55 text

Thanks