Why Python, Ruby, and Javascript are Slow

Thursday, February 28, 13

Why Python, Ruby, and Javascript are slow Alex Gaynor Waza
2013 Thursday, February 28, 13 Hi everyone, thanks for coming here on such short notice, I know there’s a lot of way more talented speakers you could be seeing right now.

You may know me from... Thursday, February 28, 13 So,
ﬁrst who am I. Here are some places you may have seen me.

rdio.com Thursday, February 28, 13 I work for a company
called rdio, we do streaming internet music. We’re awesome, we’re hiring, even if you don’t want a job you should totally use our product.

• CPython • Django • PyPy Thursday, February 28, 13
I also write a lot of open source. I write primarily in Python, web stuff, compilers, I like infrastructure type projects.

Twitter rants about how computers are bad Thursday, February 28,
13 If you have the misfortune to follow me on twitter, I tweet about how computers are bad, software is bad, and we should all feel bad.

Topaz topazruby.com Thursday, February 28, 13 And recently I open
source topaz. A high performance implementation of Ruby on top of RPython, which is the toolchain that powers PyPy. It’s pretty sweet, check it out and contribute!

There is no benchmark but your benchmark Thursday, February 28,
13 If you’ve read PyPy is X times faster than CPython. Or C is Y times faster than Java or any other number like that. IGNORE THEM. These are myths, lies, and inappropriate attempts to use statistics.

Lame Excuses about why they’re Slow Thursday, February 28, 13
So we’re going to talk about why a bunch of dynamic languages (and yes this basically applies to all of them, and also a bunch of statically typed languages) are slow. First let’s look at the common wisdom about why they’re slow and see where that gets us (hint: this slide is telling you).

Dynamic Typing Thursday, February 28, 13 So ﬁrst bit of
common wisdom: dynamic typing, it makes stuff slow because the compiler can’t optimize stuff based on types. This is complete and utter bollocks. Dynamic typing is not a problem. We have good ways to solve it, like tracing JITs, method JITs with runtime type feedback, and predictive type infrencers.

“You can monkey patch anything” Thursday, February 28, 13 Again,
not a problem. When you’re building a JIT, it’s basically a solved problem, it’s called “deoptimization” and basically that means, generate really optimized code, but still check your assumptions, if they’re wrong you bail out. Most advanced JITs, like RPython (as seen in PyPy and Topaz), and v8 have this.

Harder to Optimize vs. Slow Thursday, February 28, 13 It’s
harder to write a good VM for these languages because of those features. It’s hard to make reading an attribute in Python as fast as reading a ﬁeld out of a struct in C. But we’ve done that so now we’re going to talk about why stuff is REALLY slow.

The Truth Thursday, February 28, 13 In truth, there are
two things that can cause code to be slow: data structures and algorithms. Those are of course so broad, so general that they could apply to anything. But let’s start with a core point: I can run a line of Python or Ruby as fast as you can run a line of C or Java. It’s all about what that line does.

Let’s talk about C Thursday, February 28, 13 So C
is fast, it’s basically a DSL for assembly, right? So C is fast because it’s basically like x86 instructions. But comparing what we do in it with what we do in higher level languages can make stuff clear.

struct Point { double x; double y; double z; };
Thursday, February 28, 13 This is deﬁning a Point struct in C. It’s got 3 ﬁelds, they’re all doubles.

class Point(object): def __init__(self, x, y, z): self.x = x
self.y = y self.z = z Thursday, February 28, 13 This is a reasonable Python version of that. Some obvious differences: * no type declarations * you can add any other ﬁeld you want to point instances * point instances are always allocated by the GC on the heap NONE OF THAT MATTERS, I WRITE COMPILERS I’LL DEAL WITH THAT

data = { "x" x, "y": y, "z": z, }
Thursday, February 28, 13 Here’s what I see people do a lot. This is a dictionary. Dictionaries are not objects, dictionaries are hash tables.

Dictionary vs. Object Thursday, February 28, 13 A dictionary is
a mapping of arbitrary keys (usually of homogenous type) to arbirary values. An object is thing with a ﬁxed set of properties and behaviors. You should never confuse them even if the performance difference didn’t exist, these are different things.

std::hash_set<std::string, double> point; point["x"] = x; point["y"] = y; point["z"]
= z; Thursday, February 28, 13 I wanted to use a pure C hash table implementation, but googling “C hash table” didn’t bring up useful stuff. Anyways, if you put this in a code review, ﬁrst your coworkers would laugh a lot, and then they’d make mean jokes about you for the next year. It looks ridiculous, and it is ridiculous.

And it would be slow Thursday, February 28, 13 If
you mixed up hash tables and objects in C or the like, it would be idiotically slow.

Why don’t people care? Thursday, February 28, 13 On naive
implementations of these languages, that is CPython, MRI, and most JS implementations up until 5 years ago. There’s no performance difference. When you have a real JIT though, the difference is huge. They train themselves to think of a dictionary as a lightweight object as a weird dogma. And in Javascript it’s all one type so that’s a mess anyways.

Let’s talk about strings Thursday, February 28, 13 Strings are
probably the most used datastructure in all of programming. You use them EVERYWHERE. It’s pretty rare to have a single function that doesn’t have a string in it.

Given a string matching: “\w+-\d+” return the integral part of
the value Thursday, February 28, 13 This type of problem is a pretty common practical parsing task.

int(s.split("-", 1)[1]) Thursday, February 28, 13 This is probably how
like 99% of python people would solve it. Split the string on the dash, get the second part, convert to an int. Let’s talk about efficiency. split is going to allocate a list of 2 elements, allocate 2 strings, do 2 copies, and then convert one to an int, throwing the rest away.

atoi(strchr(s, '-') + 1) Thursday, February 28, 13 What does
this do? Finds the ﬁrst instance of a -, and converts the remainder of a string to an int. 0 allocations, 0 copies. Doing this with 0 copies is pretty much impossible in Python, and probably in ruby and Javascript too.

Things that take time • Hash table lookups • Allocations
• Copying Thursday, February 28, 13 The JIT will take care of the accidental ones, but the ones that are fundamental part of your algorithm, you need to deal with those.

The C way: BYOB Thursday, February 28, 13 Bring Your
Own Buffer. In, C no stdlib function allocates a string basically, you’re always responsible for bringing a buffer to allocate data into. In practice this means C programmers tends to work on a single allocated block of memory.

char *data = malloc(1024); while (true) { read(fd, data, 1024);
char *start = data; while (start < data + 1024) { if (isspace(*start)) { break; } start++; } printf("%s\n", start); } Thursday, February 28, 13 A C funciton that reads data from a ﬁle descriptor, and then prints each chunk with the leading spaces removed. It does one allocation the whole time.

while True: data = os.read(fd, 1024) print data.lstrip() Thursday, February
28, 13 The Python version is much prettier, much shorter, and much slower. It does multiple allocations per iteration of the loop. We can easily imagine if we wanted to add a .lower() to print the in lowercase, now there’s an extra allocation and an extra copy in the loop, whereas the C one could easily add that with no additional copying.

long *squares(long n) { long *sq = malloc(sizeof(long) * n);
for (long i = 0; i < n; i++) { sq[i] = i * i; } return sq; } Thursday, February 28, 13 This is a basically idiomatic C function to return an array of squares in C. One allocation, no copying.

def squares(n): sq = [] for i in xrange(n): sq.append(i
* i) return sq Thursday, February 28, 13 A basically idiomatic version of the same in Python. No list pre-allocation, so every iteration through the list we have the potential to need to resize the list and copy all the data. That’s inefficient.

Missing APIs Thursday, February 28, 13 Python, Ruby, Javascript, it’s
not that there’s any necessary reason for these inefficiency. It’s that we’re missing APIs, and we could have GREAT APIs for this. PyPy has some secret ones:

from __pypy__ import newlist_hint def squares(n): sq = newlist_hint(n) for
i in xrange(n): sq.append(i * i) return sq Thursday, February 28, 13 newlist_hint returns you a list that looks totally normal, but internally it’s been preallocated. For building large lists this is way faster, with 0 loss of power, 0 loss of ﬂexibility.

Don’t make us add heuristics Thursday, February 28, 13 As
a compile author I’m begging you: don’t make dynamic languages compilers add crazy heuristics. We need good APIs in Python, Ruby, and Javascript for preallocating data, for not copying stuff left and right. And we can build them, it’s way easier to make a kickass API for something in any of these languages than C.

Heuristics = WAG Thursday, February 28, 13 WAG = Wild
Ass Guess. And that’s what heuristics devolve to. But if we, as a community, don’t embrace a set of idioms for high performance Python, then as a compiler author this is what I’ll end up writing, and it’ll make me sad.

Growing divide between optimizing and not Thursday, February 28, 13
Code that is optimal for PyPy is not for CPython, and this divide is growing wider as we add more optimizations, and the same is true of Ruby and Javascript VMs. As a dynamic language community we need to get our act together on this front.

Recap • Line for line these languages are fast! •
Take care in data structures (data structure heuristics are the WORST) • We need better no-copy/preallocate APIs Thursday, February 28, 13

Don’t abandon beauty, simplicity, our values for performance Make performance
beautiful. Thursday, February 28, 13

Thank you! https://speakerdeck.com/alex @alex_gaynor Thursday, February 28, 13 I’m told
there’s no official Q/A so please come ﬁnd me in the halls and bug me, I’m super excited to talk about this stuff, PyPy, Topaz, web application architecture, pretty much anything.

If there’s time • Java collections vs. Array and Hash.
Need more choices. • Stop writing C extensions, use something like cfﬁ • Teach good benchmarking practices Thursday, February 28, 13

Why Python, Ruby, and Javascript are Slow

Why Python, Ruby, and Javascript are Slow

More Decks by Alex Gaynor

Other Decks in Programming

Featured

Transcript