High Performance Template Engine

RubyKaigi 2015
http://rubykaigi.org/2015

High Performance Template Engine
A guide to optimizing your Ruby code

Kohei Suzuki, Takashi Kokubun

Takashi Kokubun

December 11, 2015

Transcript

  1. Self introduction • Kohei Suzuki • @eagletmt • Developer Productivity

    Group,
 Cookpad Inc. • Favorite library: pathname
  2. Self introduction • Takashi Kokubun • @k0kubun • Developer Productivity

    Group,
 Cookpad Inc. • Favorite library: ripper
  3. ✤ What is a template engine? ✤ Template Engine Examples

    • Template Engine Internals • Performance • How to optimize Ruby code? • What did we do for high performance template engines?
  4. What is a Template Engine? • Template engines render text

    (typically HTML) by combining data with a template written in a template language • ERB, Haml, Slim, ...
  5. ERB • ERB is a template engine included in the

    Ruby standard library <h1 class='title'><%= @title %></h1> <ul> <%- @items.each do |item| %> <li class='item'><%= item %></li> <% end %> </ul>
  6. Haml %h1{ class: 'title' }= @title %ul - @items.each do

    |item| %li.item= item • Haml is an elegant, structured (X)HTML/XML templating engine
  7. Haml <h1 class='title'>It works!</h1> <ul> <li class='item'>Item 1</li> <li class='item'>Item

    2</li> <li class='item'>Item 3</li> </ul> • Rendered output
  8. Slim h1 class='title' = @title ul - @items.each do |item|

    li.item= item • Slim is a fast, lightweight template engine for Ruby
  9. Slim <h1 class='title'>It works!</h1> <ul> <li class='item'>Item 1</li> <li class='item'>Item

    2</li> <li class='item'>Item 3</li> </ul> • Rendered output
  10. ✤ What is a template engine? • Template Engine Examples

    ✤ Template Engine Internals • Performance • How to optimize Ruby code? • What did we do for high performance template engines?
  11. Template Engine Internals • Template engines compile templates into Ruby

    code Template Ruby code compile %h1 It works! _hamlout.push_text( "<h1>It works!</h1>\n" , 0, false);
  12. Template Engine Internals • Ruby code renders HTML Ruby code

    HTML render _hamlout.push_text( "<h1>It works!</h1>\n" , 0, false); <h1>It works!</h1>
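The two steps on slides 11 and 12 can be tried with ERB from the standard library, used here as a stand-in for Haml. This is a minimal sketch: the helper names `compile` and `render` are made up for illustration, and `ERB#src` returns the Ruby code that ERB generates for a template.

```ruby
require 'erb'

# Step 1 (compile): turn a template into Ruby source code.
def compile(template)
  ERB.new(template).src
end

# Step 2 (render): run the generated Ruby code to produce HTML.
# `title` is visible to the template through this method's binding.
def render(ruby_code, title)
  eval(ruby_code, binding)
end

ruby_code = compile("<h1><%= title %></h1>\n")
puts ruby_code                       # the generated Ruby code
puts render(ruby_code, 'It works!')  # the rendered HTML
```

Printing `ruby_code` shows that the template has already become plain Ruby before any data is supplied, which is why the quality of the generated code decides rendering speed.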
  13. ✤ What is a template engine? • Template Engine Examples

    • Template Engine Internals ✤ Performance • How to optimize Ruby code? • What did we do for high performance template engines?
  14. Haml vs Slim • Haml has nice syntax, but its

    implementation is not very performant • Slim's syntax is not as nice, but it has a great, performant implementation
  15. Faster Haml Engine • We love the Haml language, so we

    both implemented faster Haml engines individually • https://github.com/eagletmt/faml • https://github.com/k0kubun/hamlit
  16. • What is a template engine? ✤ How to optimize

    Ruby code? • What did we do for high performance template engines?
  17. Optimize your Ruby code • YOUR CODE IS SLOW •

    if you don't know how to write fast code
  18. • What is a template engine? ✤ How to optimize

    Ruby code? ✤ Benchmark • Profiling • Improvement • What did we do for high performance template engines?
  19. Why is benchmarking necessary? • To measure performance accurately •

    Profilers have overhead • Even if code looks fast under a profiler, it may still be slow in a benchmark • For continuous improvement • You can't detect performance regressions without a benchmark
  20. How to benchmark? • Use benchmark-ips gem • Show a

    result in an easy-to-understand way Rendering of slim/benchmarks with HTML escaped hamlit v2.0.1: 122622.3 i/s faml v0.7.1: 94239.1 i/s - 1.30x slower slim v3.0.6: 89143.0 i/s - 1.38x slower erubis v2.7.0: 65047.8 i/s - 1.89x slower haml v5.0.0.beta.2: 14363.6 i/s - 8.54x slower
  21. What to measure? • Sometimes a problem has a trade-off

    • trade-off between compilation time and rendering time Rendering of haml/test/templates/standard.haml hamlit v2.0.1: 12351.8 i/s (0.081ms) faml v0.7.0: 9713.4 i/s (0.103ms) - 1.27x slower haml v5.0.0.beta.2: 2296.5 i/s (0.435ms) - 5.38x slower
  22. What to measure? • Sometimes a problem has a trade-off

    • trade-off between compilation time and rendering time Compilation of haml/test/templates/standard.haml haml v5.0.0.beta.2: 388.2 i/s (2.576ms) hamlit v2.0.1: 193.7 i/s (5.163ms) - 2.00x slower faml v0.7.0: 188.0 i/s (5.320ms) - 2.07x slower
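Slides 21 and 22 measure compilation and rendering separately. The sketch below does the same for ERB using only the standard library; it is an illustration, not the benchmark script used in the talk. The method names and the sample template are assumptions, and `Benchmark.realtime` stands in for the benchmark-ips gem.

```ruby
require 'benchmark'
require 'erb'

# A small sample template (hypothetical, for illustration only).
def sample_template
  '<ul><% items.each do |item| %><li><%= item %></li><% end %></ul>'
end

# Average time to turn the template into Ruby code.
def compilation_time(iterations)
  Benchmark.realtime { iterations.times { ERB.new(sample_template).src } } / iterations
end

# Compile once into a method, then time only the execution of that method.
def rendering_time(iterations, items)
  view_class = Class.new
  ERB.new(sample_template).def_method(view_class, 'render(items)')
  view = view_class.new
  Benchmark.realtime { iterations.times { view.render(items) } } / iterations
end

compile = compilation_time(1_000)
render = rendering_time(1_000, %w[a b c])
printf("compilation: %.4fms, rendering: %.4fms\n", compile * 1000, render * 1000)
```

Keeping the two timings apart matters because an engine that does more work at compile time, as Faml and Hamlit do, looks slower in one number and faster in the other.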
  23. • What is a template engine? ✤ How to optimize

    Ruby code? • Benchmark ✤ Profiling • Improvement • What did we do for high performance template engines?
  24. Fundamental Rule of Optimisation • Don't guess, measure • It's

    a waste of time to optimize trivial things • The bottleneck may change at any time
  25. stackprof usage in Hamlit repo • To search the entire

    stack to find the bottlenecks in template compilation $ bin/stackprof test/haml/templates/standard.haml ================================== Mode: wall(1) Samples: 8034 (70.35% miss rate) GC: 787 (9.80%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 498 (6.2%) 498 (6.2%) Temple::Mixins::CompiledDispatcher#disp 893 (11.1%) 319 (4.0%) Ripper::Lexer#lex 2999 (37.3%) 237 (2.9%) Hamlit::HTML#dispatcher 2070 (25.8%) 220 (2.7%) Temple::Filters::ControlFlow#dispatcher 4600 (57.3%) 189 (2.4%) Hamlit::Escapable#dispatcher 164 (2.0%) 164 (2.0%) Temple::Mixins::CompiledDispatcher::Dis 174 (2.2%) 160 (2.0%) block in Temple::ImmutableMap#[]
  26. rblineprof usage in Hamlit repo • To find bottlenecks in

    the compiled template code $ bin/lineprof test/haml/templates/standard.haml [Lineprof] ====================================================================== /private/var/folders/my/syd7zn_d495dmjm7_y8lqby80000gp/T/ compiled20151204-39353-9l8fvy | 16 ; _hamlit_compiler1 = ( 1 + 9 + 8 + 2 #numbers should work and this should be ignored; 0.2ms 200 | 17 ; ); _buf << (::Hamlit::Utils.escape_html(((_hamlit_compiler1).to_s))); _buf << ("\n</div>\n<div id='body'> Quotes should be loved! Just like people!</div>\n".freeze); 57.5ms 100 | 18 ; 120.times do |number|; | 19 ; _hamlit_compiler2 = ( number; 31.5ms 24000 | 20 ; ); _buf <<
  27. • What is a template engine? ✤ How to optimize

    Ruby code? • Benchmark • Profiling ✤ Improvement • What did we do for high performance template engines?
  28. How to improve 1. Don't guess, measure (again) • Profiler

    tells you what to optimize • Benchmark tells you which code is faster
  29. How to improve 3. Learn from others • We'll show

    you examples of template engine optimization
  30. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? ✤ Faml side • Hamlit side
  31. Faml • @eagletmt started faml development as a complete replacement

    of haml • High compatibility with improved performance • Basic ideas for high performance: • Follow Slim • Perform optimization at compile time
  32. Slim's Benchmark • [Bar chart: compiled benchmark (i/s) for erb,

    slim ugly and haml ugly] • https://travis-ci.org/slim-template/slim/jobs/94130074#L188-L195
  33. Why does Slim perform well? • Slim uses Temple gem

    as backend • Temple performs generic optimization automatically • I decided to use Temple as backend • https://github.com/judofyr/temple
  34. • Haml generates naive Ruby code Haml %a{ href: 'http://rubykaigi.org/2015'

    } _hamlout.buffer << "<a#{ _hamlout.attributes( {}, nil, href: 'http://rubykaigi.org/2015' ) }></a>\n";
  35. Slim a href='http://rubykaigi.org/2015' _buf = []; _buf << ("<a href=\"http://rubykaigi.org/2015\"></a>".freeze);

    ; _buf = _buf.join • Slim generates a static string literal at compile time
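The difference between slides 34 and 35 is where the attribute work happens. As a rough sketch of the compile-time approach (not Slim's or Temple's actual implementation), the hypothetical `compile_static_tag` below does all attribute formatting while generating code, so the generated code only appends one frozen literal. `run_compiled` is also a made-up helper that executes the generated code.

```ruby
require 'erb'

# Compile a tag whose attributes are all known at compile time into Ruby code
# that appends a single frozen string literal to _buf.
def compile_static_tag(name, attrs)
  html = String.new("<#{name}")
  attrs.each do |key, value|
    html << " #{key}=\"#{ERB::Util.html_escape(value)}\""
  end
  html << "></#{name}>"
  "_buf << (#{html.inspect}.freeze);"
end

# Run generated code the way a compiled template would be run.
def run_compiled(code)
  _buf = []
  eval(code, binding)
  _buf.join
end

code = compile_static_tag('a', href: 'http://rubykaigi.org/2015')
puts code                # generated Ruby code, one literal append
puts run_compiled(code)  # rendered HTML
```

Escaping and string building are paid once per compilation instead of once per render.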
  36. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? ✤ Faml side ✤ Attribute Optimization • Faster Runtime Attribute Builder • Hamlit side
  37. Static Analysis • Haml should also be compiled into a static

    string literal like Slim • But a Ruby parser is required to achieve it, because the same attribute can be written in several ways %a{ href: 'http://rubykaigi.org/2015' } %a{ :href=>'http://rubykaigi.org/2015' } %a{ 'href'=>'http://rubykaigi.org/2015' } %a{ 'href': 'http://rubykaigi.org/2015' }
  38. parser gem • https://github.com/whitequark/parser • Ruby parser, used by RuboCop,

    Transpec, ... • Easy to use • AST with rich source code information
  39. Attribute Optimization • Faml categorizes attributes into 3 types by

    parsing Ruby code • Static • Dynamic • Runtime
  40. Static Attribute • Both key and value are static •

    Fastest • No operations at runtime %a{ href: 'http://rubykaigi.org/2015' } <a href='http://rubykaigi.org/2015'></a>
  41. Dynamic Attribute %a{ href: url } • Key is static,

    but the value is dynamic • Relatively fast • Escape url and concatenate it at runtime <a href='http://rubykaigi.org/2015'></a>
  42. Runtime Attribute • Key and value are dynamic • Slow

    • Build the whole attribute list at runtime %a{ key => url } <a href='http://rubykaigi.org/2015'></a>
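The three categories from slides 39 to 42 can be told apart by inspecting the parsed attribute hash. The sketch below is an illustration only: it uses Ripper from the standard library rather than the parser gem that Faml uses, the method names are hypothetical, and it handles only plain string values and simple keys.

```ruby
require 'ripper'

# True when the node is a string literal without interpolation.
def static_string?(node)
  node[0] == :string_literal &&
    node[1][0] == :string_content &&
    node[1].drop(1).all? { |part| part[0] == :@tstring_content }
end

# True for keys written as `href:`, `:href` or 'href'.
def static_key?(node)
  [:@label, :symbol_literal].include?(node[0]) || static_string?(node)
end

# Returns :static, :dynamic or :runtime for the source of a Hash literal.
def classify_attributes(hash_source)
  sexp = Ripper.sexp(hash_source)
  raise ArgumentError, 'not valid Ruby' if sexp.nil?
  hash = sexp[1][0]
  raise ArgumentError, 'expected a non-empty Hash literal' unless hash[0] == :hash && hash[1]
  pairs = hash[1][1]
  # Any key that is not a literal forces the whole list to be built at runtime.
  return :runtime unless pairs.all? { |pair| pair[0] == :assoc_new && static_key?(pair[1]) }
  pairs.all? { |pair| static_string?(pair[2]) } ? :static : :dynamic
end

p classify_attributes("{ href: 'http://rubykaigi.org/2015' }")
p classify_attributes('{ href: url }')
p classify_attributes('{ key => url }')
```

Only the static case can be folded into a string literal at compile time; the other two still need some work on every render.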
  43. Line Numbers • We have to keep line numbers •

    for correct backtrace • (for correct __LINE__ value)
  44. Line Numbers • Multi-line attributes have to be compiled as runtime attributes

    1 %a{ class: 'link', 2 href: url } 1 _buf << ("<a".freeze); _buf << (::Faml::AttributeBuilder.build("'", true, nil, class: 'link', 2 href: url )); _buf << ("></a>\n".freeze); 3 ; _buf = _buf.join
  45. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? ✤ Faml side • Attribute Optimization ✤ Faster Runtime Attribute Builder • Hamlit side
  46. C extension • C is faster than Ruby! • If

    performance is really important, writing a C extension is a good choice.
  47. C extension • I wrote the runtime attribute builder in C++

    • Ruby version (before v0.1.0) • 41889.8 i/s • C++ version (v0.7.1) • 90168.6 i/s
  48. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? • Faml side ✤ Hamlit side
  49. Hamlit • Designed to defeat Slim • I've heard many

    people say they are “migrating from Haml to Slim because it's faster.” • Hamlit means “Haml it” (write it with Haml)
  50. Slim's compiled benchmark with HTML-escaping (i/s) • [Bar chart comparing Hamlit, Faml,

    Slim and Haml] • Hamlit is faster than Slim • https://travis-ci.org/k0kubun/hamlit/jobs/93928561#L247-L251
  51. Hamlit’s strategy • Reduce string allocation and concatenation by: •

    compiling string interpolation • dropping unused behaviors
  52. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? • Faml side ✤ Hamlit side ✤ Compiling string interpolation • Dropping unused behaviors
  53. How to compile template? • We should care about: 1.

    String allocation 2. String concatenation
  54. 1. String allocation • Utilize frozen string literal • Thanks

    to Temple::Generator, static string is frozen automatically! • Slim, Faml and Hamlit use this
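The effect of a frozen literal can be observed directly in plain Ruby. In the sketch below (method names are illustrative), a literal followed by `.freeze` returns the same String object on every call, whereas `String.new` allocates a new object each time.

```ruby
# A string literal followed by .freeze is reused across calls,
# so calling this method repeatedly does not allocate new Strings.
def frozen_literal
  "<a href='".freeze
end

# For comparison: this allocates a new String on every call.
def fresh_string
  String.new("<a href='")
end

puts frozen_literal.equal?(frozen_literal) # same object both times
puts fresh_string.equal?(fresh_string)     # two different objects
```

Generated template code appends many such static fragments per render, so avoiding one allocation per fragment adds up.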
  55. 2. String concatenation • String interpolation is fast Benchmark.ips do

    |x| x.report("Array#join") { ['hello', 1234].join } x.report("interpolation") { "#{'hello'}#{1234}" } x.compare! end
  56. 2. String concatenation • String interpolation is fast $ ruby

    bench.rb Comparison: interpolation: 1115751.8 i/s Array#join: 507283.5 i/s - 2.20x slower
  57. How should we compile interpolated String? • Suppose that you

    are a Ruby interpreter, what code would be pleasant? - year = 2015 %a{ href: "http://rubykaigi.org/#{year}" } RubyKaigi #{year}
  58. How should we compile interpolated String? year = 2015 _hamlout.buffer

    << "<a#{_hamlout.attributes({}, nil, href: "http://rubykaigi.org/#{year}" )}>#{ "RubyKaigi #{Haml::Helpers.html_escape((year))}" }</a>\n"; Haml - year = 2015 %a{ href: "http://rubykaigi.org/#{year}" } RubyKaigi #{year}
  60. How should we compile interpolated String? _buf = []; year

    = 2015; ; _buf << ("<a".freeze); _faml_html1 = ("http://rubykaigi.org/#{year}"); case (_faml_html1); when true; _buf << (" href".freeze); when false, nil; else; _buf << (" href='".freeze); _buf << (::Temple::Utils.escape_html((_faml_html1))); _buf << ("'".freeze); end; _buf << (">RubyKaigi ".freeze); _buf << (::Temple::Utils.escape_html((year))); _buf << ("</a>\n".freeze); ; _buf = _buf.join Faml - year = 2015 %a{ href: "http://rubykaigi.org/#{year}" } RubyKaigi #{year}
  62. How should we compile interpolated String? _buf = []; year

    = 2015; ; _buf << ("<a href='http://rubykaigi.org/".freeze); _buf << (::Hamlit::Utils.escape_html((year))); _buf << ("'>RubyKaigi ".freeze); _buf << (::Hamlit::Utils.escape_html((year))); ; _buf << ("</a>\n".freeze); _buf = _buf.join Hamlit - year = 2015 %a{ href: "http://rubykaigi.org/#{year}" } RubyKaigi #{year}
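The Hamlit output on slide 62 treats the interpolated string as a sequence of static and dynamic parts. One way to obtain such parts, sketched here with Ripper from the standard library, is to walk the lexer tokens of the literal. This is an illustration under that assumption and is not claimed to be Hamlit's implementation; `split_interpolation` is a hypothetical name.

```ruby
require 'ripper'

# Split the source of a double-quoted string literal into
# [:static, text] and [:dynamic, ruby_code] parts.
# Static text is returned as written in the source (escapes are not processed).
def split_interpolation(literal_source)
  parts = []
  depth = 0
  code = String.new
  Ripper.lex(literal_source).each do |_pos, type, token, _state|
    if depth.zero?
      case type
      when :on_tstring_content then parts << [:static, token]
      when :on_embexpr_beg
        depth = 1
        code = String.new
      end
    elsif type == :on_embexpr_beg
      # Nested interpolation inside the embedded expression.
      depth += 1
      code << token
    elsif type == :on_embexpr_end
      depth -= 1
      depth.zero? ? parts << [:dynamic, code] : code << token
    else
      code << token
    end
  end
  parts
end

p split_interpolation('"http://rubykaigi.org/#{year}"')
```

A compiler can then emit the static parts as frozen literals, merged with the surrounding tag text, and wrap only the dynamic parts in an escape call.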
  64. How should we compile interpolated String? Comparison: hamlit v2.0.1: 301640.2

    i/s faml v0.7.1: 199001.5 i/s - 1.52x slower haml v5.0.0.beta.2: 14714.4 i/s - 20.50x slower
  65. Tips to write faster code • Don't allocate strings •

    Reduce string concatenation • The fastest way to concatenate strings is not to concatenate them
  66. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? • Faml side ✤ Hamlit side • Compiling string interpolation ✤ Dropping unused behaviors
  67. Dropping unused behavior • Since Haml and Slim have rich

    syntax and behaviors in attributes, rendering attributes is a bottleneck • In other words, it is an opportunity for optimization
  68. Dropping unused behavior • To optimize attribute rendering, Faml and

    Hamlit drop some unused behavior • Let's see how they are different!
  69. Dropped features in Hamlit • Hamlit supports the following features only for

    limited attributes • Data attribute hyphenation • Boolean attribute
  70. Data attribute hyphenation • In Haml, nested Hash is expanded

    with hyphen for all attributes • %div{ foo: { bar: 'baz' } } renders as <div foo-bar='baz'></div> • %div{ data: { bar: 'baz' } } renders as <div data-bar='baz'></div>
  71. Data attribute hyphenation • In Faml and Hamlit, hyphenation is supported

    only for data attributes • %div{ foo: { bar: 'baz' } } renders as <div foo-bar='baz'></div> in Haml, and as <div foo='{:bar=&gt;&quot;baz&quot;}'></div> in Faml and Hamlit • %div{ data: { bar: 'baz' } } renders as <div data-bar='baz'></div> in Haml, Faml and Hamlit
  72. Data attribute hyphenation • Hyphenating data attributes is expensive •

    So we dropped it to generate faster code for non-data attributes
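To see why hyphenation costs something at runtime, consider what it involves. The sketch below is a simplified illustration, not code from Haml, Faml or Hamlit, and `hyphenate` is a hypothetical helper: it walks a nested Hash recursively and builds a new key string for every leaf.

```ruby
# Expand a nested Hash into flat, hyphen-joined attribute names.
# Every leaf costs a recursive call and a newly built key string.
def hyphenate(key, value, out = {})
  if value.is_a?(Hash)
    value.each { |child_key, child| hyphenate("#{key}-#{child_key}", child, out) }
  else
    out[key.to_s] = value
  end
  out
end

p hyphenate(:data, { bar: 'baz' })
p hyphenate(:data, { user: { id: 1, name: 'k0kubun' } })
```

If every attribute may hold a nested Hash, the generated code has to be prepared to do this for each of them; restricting it to `data` lets other attributes skip the check entirely.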
  73. ; _buf << ("<input".freeze); case ((_hamlit_compiler1 = (disabled))); when true;

    _buf << (" disabled".freeze); when false, nil; else; _buf << (" disabled='".freeze); _buf << (::Hamlit::Utils.escape_html((_hamlit_compiler1))); _buf << ("'".freeze); end ; _buf << (">\n".freeze); Data attribute hyphenation • No code to hyphenate Hash
  74. Data attribute hyphenation https://travis-ci.org/k0kubun/hamlit/jobs/96207038#L257-L260 - disabled = false %input{ disabled:

    disabled } - disabled = true %input{ disabled: disabled } Comparison: hamlit v2.0.1: 819212.4 i/s (0.001ms) faml v0.7.1: 614993.4 i/s (0.002ms) - 1.33x slower haml v5.0.0.beta.2: 15073.2 i/s (0.066ms) - 54.35x slower • Benchmark for non-data attribute
  75. Boolean support • Only with Hamlit, non-boolean attributes are not

    deleted by falsey values (nil, false) • %a{ href: false } renders as <a></a> in Haml and Faml, and as <a href=''></a> in Hamlit • %a{ disabled: false } renders as <a></a> in Haml, Faml and Hamlit
  76. Boolean support • Only with Hamlit, non-boolean attributes are not

    deleted by falsey values (nil, false) • It means that Hamlit doesn't need to check and concatenate the value at runtime for non-boolean attributes
  77. Boolean support _buf = []; url = 'http://rubykaigi.org/2015'; ; _buf

    << ("<a".freeze); _faml_html1 = (url); case (_faml_html1); when true; _buf << (" href".freeze); when false, nil; else; _buf << (" href='".freeze); _buf << (::Temple::Utils.escape_html((_faml_html1))); _buf << ("'".freeze); end; _buf << ("></a>\n".freeze); • Faml compilation for non-boolean attribute
  78. Boolean support _buf = []; url = 'http://rubykaigi.org/2015'; ; _buf

    << ("<a href='".freeze); _buf << (::Hamlit::Utils.escape_html((url))); _buf << ("'></a>\n".freeze); _buf = _buf.join • Hamlit compilation for non-boolean attribute
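The two compilation results on slides 77 and 78 differ in how much branching runs per render. The sketch below restates them as two plain methods, with hypothetical names and with `ERB::Util.html_escape` standing in for the engines' own escape helpers. It mirrors the structure of the generated code rather than reproducing either engine.

```ruby
require 'erb'

# With boolean support: every render has to inspect the value first.
def render_with_boolean_check(name, value)
  case value
  when true then " #{name}"
  when false, nil then ''
  else " #{name}='#{ERB::Util.html_escape(value)}'"
  end
end

# Without boolean support: the value is always escaped and appended,
# so the surrounding text can stay a static literal.
def render_without_boolean_check(name, value)
  " #{name}='#{ERB::Util.html_escape(value)}'"
end

puts render_with_boolean_check('disabled', true)
puts render_with_boolean_check('href', 'http://rubykaigi.org/2015')
puts render_without_boolean_check('href', 'http://rubykaigi.org/2015')
```

For a non-boolean attribute such as `href`, the second form produces the same HTML for ordinary values while skipping the `case` on every render.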
  79. Comparison: hamlit v2.0.1: 407851.9 i/s (0.002ms) faml v0.7.1: 223612.4 i/s

    (0.004ms) - 1.82x slower haml v5.0.0.beta.2: 21823.1 i/s (0.046ms) - 18.69x slower Boolean support
  80. But does it really work? • Also in Rails tag

    helpers, false is not deleted for non-boolean attributes = content_tag :input, '', value: false <input value='false'></input>
  81. • It could pass 20,000+ tests in the World's largest

    Rails application! But does it really work? https://speakerdeck.com/a_matsuda/the-recipe-for-the-worlds-largest-rails-monolith
  82. Why is Hamlit the fastest? • Faml and Slim have

    boolean support for all attributes • So Hamlit is faster for non-boolean attributes • Give up trivial things to make things better!
  83. Comparison of Haml engines • Haml • Slow and rarely

    maintained now • I sent a patch to replace the backend, but it was not merged • Faml • Fast and highly compatible • Hamlit • Fastest and slightly incompatible
  84. Conclusion • How to improve performance • Benchmark, Profiling, Improvement

    • Real examples of improvements • Faml and Hamlit • Try our faster Haml engines!