Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Why Python, Ruby, and Javascript are Slow

Alex Gaynor
February 28, 2013

Why Python, Ruby, and Javascript are Slow

Talk as delivered at Waza, 2013 in San Francisco.

Alex Gaynor

February 28, 2013
Tweet

More Decks by Alex Gaynor

Other Decks in Programming

Transcript

  1. Thursday, February 28, 13

    View Slide

  2. Why Python, Ruby, and
    Javascript are slow
    Alex Gaynor
    Waza 2013
    Thursday, February 28, 13
    Hi everyone, thanks for coming here on such short notice, I know there’s a lot of way more
    talented speakers you could be seeing right now.

    View Slide

  3. You may know me
    from...
    Thursday, February 28, 13
    So, first who am I. Here are some places you may have seen me.

    View Slide

  4. rdio.com
    Thursday, February 28, 13
    I work for a company called rdio, we do streaming internet music. We’re awesome, we’re
    hiring, even if you don’t want a job you should totally use our product.

    View Slide

  5. • CPython
    • Django
    • PyPy
    Thursday, February 28, 13
    I also write a lot of open source. I write primarily in Python, web stuff, compilers, I like
    infrastructure type projects.

    View Slide

  6. Twitter rants about
    how computers are bad
    Thursday, February 28, 13
    If you have the misfortune to follow me on twitter, I tweet about how computers are bad,
    software is bad, and we should all feel bad.

    View Slide

  7. Topaz
    topazruby.com
    Thursday, February 28, 13
    And recently I open source topaz. A high performance implementation of Ruby on top of
    RPython, which is the toolchain that powers PyPy. It’s pretty sweet, check it out and
    contribute!

    View Slide

  8. There is no benchmark
    but your benchmark
    Thursday, February 28, 13
    If you’ve read PyPy is X times faster than CPython. Or C is Y times faster than Java or any
    other number like that. IGNORE THEM. These are myths, lies, and inappropriate attempts to
    use statistics.

    View Slide

  9. Lame Excuses about
    why they’re Slow
    Thursday, February 28, 13
    So we’re going to talk about why a bunch of dynamic languages (and yes this basically
    applies to all of them, and also a bunch of statically typed languages) are slow. First let’s look
    at the common wisdom about why they’re slow and see where that gets us (hint: this slide is
    telling you).

    View Slide

  10. Dynamic Typing
    Thursday, February 28, 13
    So first bit of common wisdom: dynamic typing, it makes stuff slow because the compiler
    can’t optimize stuff based on types. This is complete and utter bollocks. Dynamic typing is
    not a problem. We have good ways to solve it, like tracing JITs, method JITs with runtime type
    feedback, and predictive type infrencers.

    View Slide

  11. “You can monkey patch
    anything”
    Thursday, February 28, 13
    Again, not a problem. When you’re building a JIT, it’s basically a solved problem, it’s called
    “deoptimization” and basically that means, generate really optimized code, but still check
    your assumptions, if they’re wrong you bail out. Most advanced JITs, like RPython (as seen in
    PyPy and Topaz), and v8 have this.

    View Slide

  12. Harder to Optimize
    vs.
    Slow
    Thursday, February 28, 13
    It’s harder to write a good VM for these languages because of those features. It’s hard to
    make reading an attribute in Python as fast as reading a field out of a struct in C. But we’ve
    done that so now we’re going to talk about why stuff is REALLY slow.

    View Slide

  13. The Truth
    Thursday, February 28, 13
    In truth, there are two things that can cause code to be slow: data structures and algorithms.
    Those are of course so broad, so general that they could apply to anything. But let’s start
    with a core point: I can run a line of Python or Ruby as fast as you can run a line of C or Java.
    It’s all about what that line does.

    View Slide

  14. Let’s talk about C
    Thursday, February 28, 13
    So C is fast, it’s basically a DSL for assembly, right? So C is fast because it’s basically like x86
    instructions. But comparing what we do in it with what we do in higher level languages can
    make stuff clear.

    View Slide

  15. struct Point {
    double x;
    double y;
    double z;
    };
    Thursday, February 28, 13
    This is defining a Point struct in C. It’s got 3 fields, they’re all doubles.

    View Slide

  16. class Point(object):
    def __init__(self, x, y, z):
    self.x = x
    self.y = y
    self.z = z
    Thursday, February 28, 13
    This is a reasonable Python version of that. Some obvious differences:
    * no type declarations
    * you can add any other field you want to point instances
    * point instances are always allocated by the GC on the heap
    NONE OF THAT MATTERS, I WRITE COMPILERS I’LL DEAL WITH THAT

    View Slide

  17. data = {
    "x" x,
    "y": y,
    "z": z,
    }
    Thursday, February 28, 13
    Here’s what I see people do a lot. This is a dictionary. Dictionaries are not objects,
    dictionaries are hash tables.

    View Slide

  18. Dictionary
    vs.
    Object
    Thursday, February 28, 13
    A dictionary is a mapping of arbitrary keys (usually of homogenous type) to arbirary values.
    An object is thing with a fixed set of properties and behaviors. You should never confuse
    them even if the performance difference didn’t exist, these are different things.

    View Slide

  19. std::hash_set point;
    point["x"] = x;
    point["y"] = y;
    point["z"] = z;
    Thursday, February 28, 13
    I wanted to use a pure C hash table implementation, but googling “C hash table” didn’t bring
    up useful stuff. Anyways, if you put this in a code review, first your coworkers would laugh a
    lot, and then they’d make mean jokes about you for the next year. It looks ridiculous, and it
    is ridiculous.

    View Slide

  20. And it would be slow
    Thursday, February 28, 13
    If you mixed up hash tables and objects in C or the like, it would be idiotically slow.

    View Slide

  21. Why don’t people care?
    Thursday, February 28, 13
    On naive implementations of these languages, that is CPython, MRI, and most JS
    implementations up until 5 years ago. There’s no performance difference. When you have a
    real JIT though, the difference is huge. They train themselves to think of a dictionary as a
    lightweight object as a weird dogma. And in Javascript it’s all one type so that’s a mess
    anyways.

    View Slide

  22. Let’s talk about strings
    Thursday, February 28, 13
    Strings are probably the most used datastructure in all of programming. You use them
    EVERYWHERE. It’s pretty rare to have a single function that doesn’t have a string in it.

    View Slide

  23. Given a string
    matching: “\w+-\d+”
    return the integral part
    of the value
    Thursday, February 28, 13
    This type of problem is a pretty common practical parsing task.

    View Slide

  24. int(s.split("-", 1)[1])
    Thursday, February 28, 13
    This is probably how like 99% of python people would solve it. Split the string on the dash,
    get the second part, convert to an int. Let’s talk about efficiency. split is going to allocate a
    list of 2 elements, allocate 2 strings, do 2 copies, and then convert one to an int, throwing
    the rest away.

    View Slide

  25. atoi(strchr(s, '-') + 1)
    Thursday, February 28, 13
    What does this do? Finds the first instance of a -, and converts the remainder of a string to an
    int. 0 allocations, 0 copies. Doing this with 0 copies is pretty much impossible in Python, and
    probably in ruby and Javascript too.

    View Slide

  26. Things that take time
    • Hash table lookups
    • Allocations
    • Copying
    Thursday, February 28, 13
    The JIT will take care of the accidental ones, but the ones that are fundamental part of your
    algorithm, you need to deal with those.

    View Slide

  27. The C way:
    BYOB
    Thursday, February 28, 13
    Bring Your Own Buffer. In, C no stdlib function allocates a string basically, you’re always
    responsible for bringing a buffer to allocate data into. In practice this means C programmers
    tends to work on a single allocated block of memory.

    View Slide

  28. char *data = malloc(1024);
    while (true) {
    read(fd, data, 1024);
    char *start = data;
    while (start < data + 1024) {
    if (isspace(*start)) {
    break;
    }
    start++;
    }
    printf("%s\n", start);
    }
    Thursday, February 28, 13
    A C funciton that reads data from a file descriptor, and then prints each chunk with the
    leading spaces removed. It does one allocation the whole time.

    View Slide

  29. while True:
    data = os.read(fd, 1024)
    print data.lstrip()
    Thursday, February 28, 13
    The Python version is much prettier, much shorter, and much slower. It does multiple
    allocations per iteration of the loop. We can easily imagine if we wanted to add a .lower() to
    print the in lowercase, now there’s an extra allocation and an extra copy in the loop, whereas
    the C one could easily add that with no additional copying.

    View Slide

  30. long *squares(long n) {
    long *sq = malloc(sizeof(long) * n);
    for (long i = 0; i < n; i++) {
    sq[i] = i * i;
    }
    return sq;
    }
    Thursday, February 28, 13
    This is a basically idiomatic C function to return an array of squares in C. One allocation, no
    copying.

    View Slide

  31. def squares(n):
    sq = []
    for i in xrange(n):
    sq.append(i * i)
    return sq
    Thursday, February 28, 13
    A basically idiomatic version of the same in Python. No list pre-allocation, so every iteration
    through the list we have the potential to need to resize the list and copy all the data. That’s
    inefficient.

    View Slide

  32. Missing APIs
    Thursday, February 28, 13
    Python, Ruby, Javascript, it’s not that there’s any necessary reason for these inefficiency. It’s
    that we’re missing APIs, and we could have GREAT APIs for this. PyPy has some secret ones:

    View Slide

  33. from __pypy__ import newlist_hint
    def squares(n):
    sq = newlist_hint(n)
    for i in xrange(n):
    sq.append(i * i)
    return sq
    Thursday, February 28, 13
    newlist_hint returns you a list that looks totally normal, but internally it’s been preallocated.
    For building large lists this is way faster, with 0 loss of power, 0 loss of flexibility.

    View Slide

  34. Don’t make us add
    heuristics
    Thursday, February 28, 13
    As a compile author I’m begging you: don’t make dynamic languages compilers add crazy
    heuristics. We need good APIs in Python, Ruby, and Javascript for preallocating data, for not
    copying stuff left and right. And we can build them, it’s way easier to make a kickass API for
    something in any of these languages than C.

    View Slide

  35. Heuristics = WAG
    Thursday, February 28, 13
    WAG = Wild Ass Guess. And that’s what heuristics devolve to. But if we, as a community,
    don’t embrace a set of idioms for high performance Python, then as a compiler author this is
    what I’ll end up writing, and it’ll make me sad.

    View Slide

  36. Growing divide
    between optimizing and
    not
    Thursday, February 28, 13
    Code that is optimal for PyPy is not for CPython, and this divide is growing wider as we add
    more optimizations, and the same is true of Ruby and Javascript VMs. As a dynamic language
    community we need to get our act together on this front.

    View Slide

  37. Recap
    • Line for line these languages are fast!
    • Take care in data structures (data structure
    heuristics are the WORST)
    • We need better no-copy/preallocate APIs
    Thursday, February 28, 13

    View Slide

  38. Don’t abandon beauty,
    simplicity, our values for
    performance
    Make performance
    beautiful.
    Thursday, February 28, 13

    View Slide

  39. Thank you!
    https://speakerdeck.com/alex
    @alex_gaynor
    Thursday, February 28, 13
    I’m told there’s no official Q/A so please come find me in the halls and bug me, I’m super
    excited to talk about this stuff, PyPy, Topaz, web application architecture, pretty much
    anything.

    View Slide

  40. If there’s time
    • Java collections vs. Array and Hash. Need
    more choices.
    • Stop writing C extensions, use something
    like cffi
    • Teach good benchmarking practices
    Thursday, February 28, 13

    View Slide