JRuby+Truffle: Why it's important to optimise the tricky parts

At the Virtual Machines Summer School (VMSS) 2016

Chris Seaton

September 01, 2016

Transcript

  1. Copyright © 2016, Oracle and/or its affiliates. All rights reserved.

    | JRuby+Truffle Why it’s important to optimise the tricky parts Chris Seaton Research Manager Oracle Labs 2 June 2016 The Ruby Logo is Copyright (c) 2006, Yukihiro Matsumoto. It is licensed under the terms of the Creative Commons Attribution-ShareAlike 2.5 agreement.
  2.

    | Safe Harbor Statement The following is intended to provide some insight into a line of research in Oracle Labs. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. Oracle reserves the right to alter its development plans and practices at any time, and the development, release, and timing of any features or functionality described in connection with any Oracle product or service remains at the sole discretion of Oracle. Any views expressed in this presentation are my own and do not necessarily reflect the views of Oracle.
  3.

    | Ruby: imperative; ‘scripting’ (Perl); object-oriented (Smalltalk); batteries included
  4.

    | MRI: simple bytecode interpreter; implemented in C; core library implemented in C
  5.

    | JRuby: JITs by emitting JVM bytecode; VM in Java; core library mostly in Java. The JRuby logo is copyright (c) Tony Price 2011, licensed under the terms of Creative Commons Attribution-NoDerivs 3.0 Unported (CC BY-ND 3.0).
  6.

    | Rubinius: JITs by emitting LLVM IR; VM in C++; core library mostly in Ruby. The Rubinius logo is copyright 2011 Shane Becker, licensed under the terms of Creative Commons Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0).
  7.

    | [Diagram: components of the Ruby implementations compared side by side. Labels: lexer, parser, AST, intermediate representation, bytecode generator, bytecode, JIT, primitives, core library.]
  8.

    | Compatibility with the language (spec/ruby) 100%
  9.

    | Compatibility with the core library (spec/ruby) 90%
  10.

    | Why aren’t you using more of JRuby? Such as the existing Java implementation of the core library?
  11.

    | Why can’t a conventional VM optimise this? Why can’t JRuby make this as fast as we want?
  12.

    | First problem: JRuby’s core library is megamorphic
  13.

    | Second problem: JRuby’s core library is stateless
  14.

    | Third problem: JRuby’s core library is very deep
  15.

    | Fourth problem: JRuby’s core library isn’t amenable to optimisations
  16.

    | The same problems apply to Rubinius, even though the core library is mostly written in Ruby
  17.

    | Source: x + y * z. AST: + (x, * (y, z)). Bytecode: load_local x; load_local y; load_local z; call :*; call :+. Machine code: pushq %rbp; movq %rsp, %rbp; movq %rdi, -8(%rbp); movq %rsi, -16(%rbp); movq %rdx, -24(%rbp); movq -16(%rbp), %rax; movl %eax, %edx; movq -24(%rbp), %rax; imull %edx, %eax; movq -8(%rbp), %rdx; addl %edx, %eax; popq %rbp; ret
  18.

    | Source: x + y * z. AST: + (x, * (y, z)). Bytecode: load_local x; load_local y; load_local z; call :*; call :+. Machine code: pushq %rbp; movq %rsp, %rbp; movq %rdi, -8(%rbp); movq %rsi, -16(%rbp); movq %rdx, -24(%rbp); movq -16(%rbp), %rax; movl %eax, %edx; movq -24(%rbp), %rax; imull %edx, %eax; movq -8(%rbp), %rdx; addl %edx, %eax; popq %rbp; ret
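
The bytecode on this slide can be read as a program for a small stack machine. The following is a minimal sketch in Ruby, not MRI's actual instruction set or interpreter loop: the `run` helper, the `BYTECODE` constant and the hash of locals are hypothetical names introduced only for illustration.

```ruby
# Hypothetical stack-machine interpreter for the five instructions on the slide.
BYTECODE = [
  [:load_local, :x],
  [:load_local, :y],
  [:load_local, :z],
  [:call, :*],
  [:call, :+]
].freeze

def run(bytecode, locals)
  stack = []
  bytecode.each do |op, arg|
    case op
    when :load_local
      stack.push(locals.fetch(arg))
    when :call
      # Binary operator: pop the argument, then the receiver, and dispatch dynamically.
      argument = stack.pop
      receiver = stack.pop
      stack.push(receiver.public_send(arg, argument))
    else
      raise ArgumentError, "unknown instruction #{op}"
    end
  end
  stack.pop
end

puts run(BYTECODE, { x: 2, y: 3, z: 4 }) # prints 14, i.e. 2 + 3 * 4
```

Every `call` here goes through a dynamic dispatch, whereas the machine code on the slide is a direct `imull` and `addl`. Closing that gap is the subject of the rest of the talk.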
  19.

    | [Figure: node rewriting for profiling feedback. The AST interpreter starts with uninitialized nodes (U). Node transitions legend: U = Uninitialized, I = Integer, D = Double, S = String, G = Generic.] T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
  20.

    | [Figure: node rewriting for profiling feedback. The AST interpreter's uninitialized nodes (U) are rewritten to specialised nodes (I, G). Node transitions legend: U = Uninitialized, I = Integer, D = Double, S = String, G = Generic.] T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
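
The node transitions in this figure can be sketched in Ruby. This is only an illustration and not Truffle's API (Truffle nodes are written in Java): the `AddNode` class, its `state` attribute and the `rewrite` method are hypothetical names, and only a `+` node is modelled.

```ruby
# Hypothetical self-specialising AST node for `+`.
class AddNode
  attr_reader :state

  def initialize
    @state = :uninitialized
  end

  def execute(left, right)
    case @state
    when :integer
      return left + right if left.is_a?(Integer) && right.is_a?(Integer)
    when :double
      return left + right if left.is_a?(Float) && right.is_a?(Float)
    when :generic
      return left.public_send(:+, right)
    end
    # Uninitialised, or the guard of the current specialisation failed:
    # rewrite the node using what was just observed, then retry.
    rewrite(left, right)
    execute(left, right)
  end

  private

  def rewrite(left, right)
    @state =
      if @state == :uninitialized && left.is_a?(Integer) && right.is_a?(Integer)
        :integer
      elsif @state == :uninitialized && left.is_a?(Float) && right.is_a?(Float)
        :double
      else
        :generic # any later mismatch falls back to the generic case
      end
  end
end

node = AddNode.new
node.execute(1, 2)
puts node.state # prints integer
node.execute(1.5, 2.5)
puts node.state # prints generic
```

In this sketch the state only moves towards `:generic`, so rewriting always terminates. The node's state doubles as the profiling feedback that a compiler can later rely on.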
  21.

    | [Figure (cropped): the rewritten AST interpreter nodes (I, G) are compiled using partial evaluation into compiled code.] T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
  22.

    | codon.com/compilers-for-free. Presentation by Tom Stuart, licensed under Creative Commons Attribution-ShareAlike 3.0.
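
Partial evaluation, in this setting, means specialising the AST interpreter to one particular AST so that the tree inspection disappears. A real partial evaluator derives the specialised program automatically. The sketch below does the specialisation by hand, as an approximation of the idea, and the names `interpret`, `specialise` and `AST` are hypothetical.

```ruby
# AST for x + y * z as nested arrays: [:local, name] or [:call, operator, left, right].
AST = [:call, :+, [:local, :x], [:call, :*, [:local, :y], [:local, :z]]].freeze

# Plain AST interpreter: inspects the tree on every evaluation.
def interpret(node, env)
  case node[0]
  when :local then env.fetch(node[1])
  when :call  then interpret(node[2], env).public_send(node[1], interpret(node[3], env))
  else raise ArgumentError, "unknown node #{node[0]}"
  end
end

# Hand-specialised version: walks the tree once and returns closures,
# so no tree inspection remains when the result is called.
def specialise(node)
  case node[0]
  when :local
    name = node[1]
    ->(env) { env.fetch(name) }
  when :call
    operator = node[1]
    left = specialise(node[2])
    right = specialise(node[3])
    ->(env) { left.call(env).public_send(operator, right.call(env)) }
  else
    raise ArgumentError, "unknown node #{node[0]}"
  end
end

compiled = specialise(AST)
env = { x: 2, y: 3, z: 4 }
puts interpret(AST, env) == compiled.call(env) # prints true
```

The closures still dispatch `+` and `*` dynamically. In the approach described in the talk, it is the specialised nodes from the previous slides that let the compiler replace those dispatches with direct arithmetic.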
  23.

    | [Figure: an AST interpreter with uninitialized nodes (U) becomes, through node rewriting for profiling feedback, an AST interpreter with rewritten nodes (I, G), which is then compiled using partial evaluation into compiled code. Node transitions legend: U = Uninitialized, I = Integer, D = Double, S = String, G = Generic.] T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
  24.

    | [Figure: deoptimization from compiled code to the AST interpreter, followed by node rewriting to update profiling feedback.] T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
  25.

    | [Figure: node rewriting to update profiling feedback, followed by recompilation using partial evaluation.] T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
  26.

    | [Figure: deoptimization to the AST interpreter, node rewriting to update profiling feedback, and recompilation using partial evaluation.] T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
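
The cycle of compiling, deoptimising, rewriting and recompiling can be modelled roughly as follows. This is a sketch under strong assumptions rather than how Graal works internally: "compiled code" is stood in for by a lambda guarded by a type check, compilation happens on the first call instead of after a hotness threshold, and `AddSite`, `Deoptimize`, `profile` and `compilations` are hypothetical names.

```ruby
# Raised by "compiled code" when its speculation no longer holds.
class Deoptimize < StandardError; end

class AddSite
  attr_reader :profile, :compilations

  def initialize
    @profile = nil      # type feedback gathered by the interpreter
    @compiled = nil     # specialised fast path, if any
    @compilations = 0
    @rewrites = 0
  end

  def call(a, b)
    return @compiled.call(a, b) if @compiled
    interpret(a, b)
  rescue Deoptimize
    # Guard failed: discard the compiled code and continue in the interpreter.
    @compiled = nil
    interpret(a, b)
  end

  private

  def interpret(a, b)
    observed = a.class == b.class ? a.class : :generic
    @rewrites += 1 if @profile && @profile != observed
    # The first change of type re-specialises; after that, give up and go generic.
    @profile = @rewrites > 1 ? :generic : observed
    compile
    a + b
  end

  def compile
    guard = @profile
    @compilations += 1
    @compiled =
      if guard == :generic
        ->(a, b) { a + b }
      else
        lambda do |a, b|
          raise Deoptimize unless a.instance_of?(guard) && b.instance_of?(guard)
          a + b
        end
      end
  end
end

site = AddSite.new
site.call(1, 2)       # interpreted, then compiled for Integer
site.call(1.5, 2.5)   # guard fails: deoptimise, rewrite to Float, recompile
puts site.profile     # prints Float
puts site.compilations # prints 2
```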
  27.

    | Source: x + y * z. AST: + (x, * (y, z)). Bytecode: load_local x; load_local y; load_local z; call :*; call :+. Machine code: pushq %rbp; movq %rsp, %rbp; movq %rdi, -8(%rbp); movq %rsi, -16(%rbp); movq %rdx, -24(%rbp); movq -16(%rbp), %rax; movl %eax, %edx; movq -24(%rbp), %rax; imull %edx, %eax; movq -8(%rbp), %rdx; addl %edx, %eax; popq %rbp; ret
  28.

    | Source: x + y * z. AST: + (x, * (y, z)). Bytecode: load_local x; load_local y; load_local z; call :*; call :+. Machine code: pushq %rbp; movq %rsp, %rbp; movq %rdi, -8(%rbp); movq %rsi, -16(%rbp); movq %rdx, -24(%rbp); movq -16(%rbp), %rax; movl %eax, %edx; movq -24(%rbp), %rax; imull %edx, %eax; movq -8(%rbp), %rdx; addl %edx, %eax; popq %rbp; ret
  29.

    | Will I be able to use Truffle and Graal for real?
  30.

    | [Diagram: stack of HotSpot, Graal and Truffle, with the languages JS, Ruby and R on top. Labels: Java, C++, JVMCI (JVM Compiler Interface).]
  31.

    | [Diagram: stack of HotSpot, Graal and Truffle, with the languages JS, Ruby and R on top. Labels: via Maven etc., Java 9.]
  32.

    | How Truffle solves the problem of optimising Ruby
  33.

    | First problem: JRuby’s core library is megamorphic
  34.

    | [Figure (cropped): an AST interpreter with uninitialized nodes (U) becomes, through node rewriting for profiling feedback, an AST interpreter with rewritten nodes (I, G), which is then compiled using partial evaluation. Node transitions legend: U = Uninitialized, I = Integer, D = Double, S = String, G = Generic.] T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
  35.

    | T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
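
One way to picture the megamorphism problem, and the splitting technique the talk names later as a remedy, is to count how many different functions a shared core-library routine ends up calling. The sketch below is an illustration only: `ProfiledMap` and its `targets` list are hypothetical names, and the functions are passed as explicit lambdas rather than blocks to keep the bookkeeping simple.

```ruby
# Hypothetical model of a core-library method with a profiled internal call site.
class ProfiledMap
  attr_reader :targets

  def initialize
    @targets = [] # distinct functions observed at the internal call site
  end

  def call(array, fn)
    @targets << fn unless @targets.any? { |seen| seen.equal?(fn) }
    array.map { |element| fn.call(element) }
  end
end

double  = ->(n) { n * 2 }
negate  = ->(n) { -n }
to_text = ->(n) { n.to_s }
callers = [double, negate, to_text]

# One shared copy: its internal call site sees every caller's function.
shared = ProfiledMap.new
callers.each { |fn| shared.call([1, 2, 3], fn) }
puts shared.targets.size # prints 3

# Splitting: each caller gets a private copy, so each copy sees one function.
splits = callers.map do |fn|
  copy = ProfiledMap.new
  copy.call([1, 2, 3], fn)
  copy
end
puts splits.map { |copy| copy.targets.size }.inspect # prints [1, 1, 1]
```

With one target per copy, the internal call can be specialised and inlined in the same way as any other monomorphic call.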
  36.

    | Second problem: JRuby’s core library is stateless
  37.

    | [Figure: node rewriting for profiling feedback produces an AST interpreter with rewritten nodes (I, G), which is compiled using partial evaluation into compiled code. Node transitions legend: U = Uninitialized, I = Integer, D = Double, S = String, G = Generic.] T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of Onward!, 2013.
  38.

    | Third problem: JRuby’s core library is very deep
  39.

    | Fourth problem: JRuby’s core library isn’t amenable to optimisations
  40.

    | def min(a, b); [a, b].sort[0]; end; puts min(2, 8)
  41.

    | def min(a, b); [a, b].sort[0]; end; puts [2, 8].sort[0]
  42.

    | t0 = 2 <=> 8; t1 = t0 < 0 ? 2 : 8; t2 = t0 > 0 ? 2 : 8; t3 = [t1, t2]; puts t3[0]
  43.

    | t0 = 2 <=> 8; t1 = t0 < 0 ? 2 : 8; t2 = t0 > 0 ? 2 : 8; t3 = [t1, t2]; puts t1
  44.

    | t0 = a <=> b; t1 = t0 < 0 ? a : b; puts t1
  45.

    | t0 = a <=> b; t1 = (a <=> b) < 0 ? a : b; puts t1
  46.

    | t1 = (a <=> b) < 0 ? a : b; puts (a <=> b) < 0 ? a : b
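
The sequence of slides above reduces `min` from an array allocation and a sort to a single comparison. A quick way to convince yourself that the first and last forms agree is to run both on a few inputs. In this sketch the helper name `min_optimised` and the sample pairs are assumptions added for illustration.

```ruby
# Original form from the slides: allocate an array, sort it, take the first element.
def min(a, b)
  [a, b].sort[0]
end

# Final form from the slides, wrapped in a hypothetical helper for comparison.
def min_optimised(a, b)
  (a <=> b) < 0 ? a : b
end

pairs = [[2, 8], [8, 2], [5, 5], [-1, 0]]
puts pairs.all? { |a, b| min(a, b) == min_optimised(a, b) } # prints true
```

A compiler can only perform this reduction automatically if it can see through `Array#sort` and `Array#[]`, which is why the core library needs to be visible to the optimiser.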
  47.

    | C extensions are a hack to work around performance problems, but now they stop us from really fixing performance
  48.

    | A lot of this has been about removing barriers to the excellent optimisations we already have
  49.

    | cmyk_to_rgb, psd_native_util_clamp, FIX2INT
  50.

    | cmyk_to_rgb, psd_native_util_clamp, FIX2INT
  51.

    | [Chart: C Extension Performance for psd_native and oily_png. Axis: Average Speedup Relative to MRI Without C Extension (s/s), scale 0 to 35.] Matthias Grimmer, Chris Seaton, Thomas Wuerthinger, Hanspeter Moessenboeck: Dynamically Composing Languages in a Modular Way: Supporting C Extensions for Dynamic Languages. Modularity '14, Proceedings of the 14th International Conference on Modularity.
  52.

    | The blocker for performance of idiomatic Ruby code is the core library, not basic language features
  53.

    | This extends to everything that forms a barrier – including C extensions
  54.

    | Specialisation, splitting, inlining, partial evaluation, inline caching are all solutions to this problem
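
Of the techniques listed, inline caching is the simplest to show in a few lines. The sketch below caches the method found for the last receiver class at a single call site. It is a simplification under stated assumptions: `InlineCache`, `hits` and `misses` are hypothetical names, singleton classes are ignored, and the cache is never invalidated when a method is redefined, which a real implementation must handle.

```ruby
# Hypothetical monomorphic inline cache for one call site.
class InlineCache
  attr_reader :hits, :misses

  def initialize(name)
    @name = name
    @cached_class = nil
    @cached_method = nil
    @hits = 0
    @misses = 0
  end

  def call(receiver, *args)
    if receiver.class.equal?(@cached_class)
      @hits += 1
    else
      # Slow path: full method lookup, then remember the result for this class.
      @misses += 1
      @cached_class = receiver.class
      @cached_method = receiver.class.instance_method(@name)
    end
    @cached_method.bind(receiver).call(*args)
  end
end

site = InlineCache.new(:+)
site.call(1, 2)      # miss: looks up Integer#+
site.call(3, 4)      # hit: same receiver class
site.call(1.5, 2.5)  # miss: receiver class changed to Float
puts "hits=#{site.hits} misses=#{site.misses}" # prints hits=1 misses=2
```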
  55.

    | Truffle makes it easy to add these to a language implementation
  56.

    | Can result in an order of magnitude performance increase with reasonable effort
  57.

    | Acknowledgements: Benoit Daloze, Kevin Menard, Petr Chalupa. Oracle Labs: Danilo Ansaloni, Stefan Anzinger, Daniele Bonetta, Matthias Brantner, Laurent Daynès, Gilles Duboscq, Michael Haupt, Christian Humer, Mick Jordan, Peter Kessler, Hyunjin Lee, David Leibs, Tom Rodriguez, Roland Schatz, Chris Seaton, Doug Simon, Lukas Stadler, Michael Van de Vanter, Adam Welc, Till Westmann, Christian Wimmer, Christian Wirth, Paul Wögerer, Mario Wolczko, Andreas Wöß, Thomas Würthinger. Oracle Labs Interns: Shams Imam, Stephen Kell, Gero Leinemann, Julian Lettner, Gregor Richards, Robert Seilbeck, Rifat Shariyar. Oracle Labs Alumni: Erik Eckstein, Christos Kotselidi. JKU Linz: Prof. Hanspeter Mössenböck, Josef Eisl, Thomas Feichtinger, Matthias Grimmer, Christian Häub, Josef Haider, Christian Hube, David Leopoltsederr, Manuel Rigger, Stefan Rumzucker, Bernhard Urban. University of Edinburgh: Christophe Dubach, Juan José Fumero Alfonso, Ranjeet Singh, Toomas Remmelg. LaBRI: Floréal Morandat. University of California, Irvine: Prof. Michael Franz, Codrut Stancu, Gulfem Savrun Yeniceri, Wei Zhang. Purdue University: Prof. Jan Vitek, Tomas Kalibera, Romand Tsegelskyi, Prahlad Joshi, Petr Maj, Lei Zhao. T. U. Dortmund: Prof. Peter Marwedel, Helena Kotthaus, Ingo Korb. University of California, Davis: Prof. Duncan Temple Lang, Nicholas Ulle.
  58.

    | Safe Harbor Statement The preceding is intended to provide some insight into a line of research in Oracle Labs. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. Oracle reserves the right to alter its development plans and practices at any time, and the development, release, and timing of any features or functionality described in connection with any Oracle product or service remains at the sole discretion of Oracle. Any views expressed in this presentation are my own and do not necessarily reflect the views of Oracle.