High Performance Template Engine

RubyKaigi 2015
http://rubykaigi.org/2015

High Performance Template Engine
A guide to optimizing your Ruby code

Kohei Suzuki, Takashi Kokubun

Takashi Kokubun

December 11, 2015

Transcript

  1. Self introduction • Kohei Suzuki • @eagletmt • Developer Productivity

    Group, Cookpad Inc. • Favorite library: pathname
  2. Self introduction • Takashi Kokubun • @k0kubun • Developer Productivity

    Group, Cookpad Inc. • Favorite library: ripper
  3. ✤ What is a template engine? ✤ Template Engine Examples

    • Template Engine Internals • Performance • How to optimize Ruby code? • What did we do for high performance template engines?
  4. What is a Template Engine? • Template engines render text

    (typically HTML) by combining data with a template written in a template language • ERB, Haml, Slim, ...
  5. ERB • ERB is a template engine included in the

    Ruby standard library <h1 class='title'><%= @title %></h1> <ul> <%- @items.each do |item| %> <li class='item'><%= item %></li> <% end %> </ul>
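As a minimal sketch of the ERB slide, the template can be rendered with the standard library alone. The `Page` class and its sample data are assumptions made for illustration; the `-` trim mode is what enables the `<%-` and `-%>` tags.

```ruby
require 'erb'

# Hypothetical view object; its instance variables match the slide's template.
class Page
  TEMPLATE = <<~TEMPLATE
    <h1 class='title'><%= @title %></h1>
    <ul>
    <%- @items.each do |item| -%>
      <li class='item'><%= item %></li>
    <%- end -%>
    </ul>
  TEMPLATE

  def initialize(title, items)
    @title = title
    @items = items
  end

  # Render the template against this object's instance variables.
  def render
    ERB.new(TEMPLATE, trim_mode: '-').result(binding)
  end
end

html = Page.new('It works!', ['Item 1', 'Item 2', 'Item 3']).render
puts html
```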
  6. Haml %h1{ class: 'title' }= @title %ul - @items.each do

    |item| %li.item= item • Haml is an elegant, structured (X)HTML/XML templating engine
  7. Haml <h1 class='title'>It works!</h1> <ul> <li class='item'>Item 1</li> <li class='item'>Item

    2</li> <li class='item'>Item 3</li> </ul> • Rendered output
  8. Slim h1 class='title' = @title ul - @items.each do |item|

    li.item= item • Slim is a fast, lightweight template engine for Ruby
  9. Slim <h1 class='title'>It works!</h1> <ul> <li class='item'>Item 1</li> <li class='item'>Item

    2</li> <li class='item'>Item 3</li> </ul> • Rendered output
  10. ✤ What is a template engine? • Template Engine Examples

    ✤ Template Engine Internals • Performance • How to optimize Ruby code? • What did we do for high performance template engines?
  11. Template Engine Internals • Template engines compile templates into Ruby

    code Template Ruby code compile %h1 It works! _hamlout.push_text( "<h1>It works!</h1>\n" , 0, false);
  12. Template Engine Internals • Ruby code renders HTML Ruby code

    HTML render _hamlout.push_text( "<h1>It works!</h1>\n" , 0, false); <h1>It works!</h1>
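The two steps above (compile the template into Ruby code, then run that code to get HTML) can be sketched with ERB from the standard library, which exposes the compiled code through `ERB#src`. ERB stands in for Haml here only so that the example needs no gems; the `message` variable is an assumption for illustration.

```ruby
require 'erb'

# Step 1: compile the template into Ruby source code (a plain String).
ruby_code = ERB.new("<h1><%= message %></h1>\n").src

# Step 2: evaluate the generated Ruby code to render the HTML.
message = 'It works!'
html = eval(ruby_code, binding)

puts ruby_code
puts html
```

Because the compiled code is just Ruby, an engine can compile once, cache the result, and pay only the rendering cost on each request.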
  13. ✤ What is a template engine? • Template Engine Examples

    • Template Engine Internals ✤ Performance • How to optimize Ruby code? • What did we do for high performance template engines?
  14. Haml vs Slim • Haml has nice syntax, but its

    implementation is not very performant • Slim's syntax is not as nice, but it has a great, performant implementation
  15. Faster Haml Engine • We love the Haml language, so we

    both implemented faster Haml engines independently • https://github.com/eagletmt/faml • https://github.com/k0kubun/hamlit
  16. • What is a template engine? ✤ How to optimize

    Ruby code? • What did we do for high performance template engines?
  17. Optimize your Ruby code • YOUR CODE IS SLOW •

    if you don't know how to write fast code
  18. • What is a template engine? ✤ How to optimize

    Ruby code? ✤ Benchmark • Profiling • Improvement • What did we do for high performance template engines?
  19. Why is benchmarking necessary? • To measure performance accurately •

    Profilers have overhead • Code that looks fast under the profiler may still be slow in a benchmark • For continuous improvement • You can't detect performance regressions without a benchmark
  20. How to benchmark? • Use the benchmark-ips gem • It shows

    results in an easy-to-understand way Rendering of slim/benchmarks with HTML escaped hamlit v2.0.1: 122622.3 i/s faml v0.7.1: 94239.1 i/s - 1.30x slower slim v3.0.6: 89143.0 i/s - 1.38x slower erubis v2.7.0: 65047.8 i/s - 1.89x slower haml v5.0.0.beta.2: 14363.6 i/s - 8.54x slower
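To make the "i/s" unit concrete, here is a rough stdlib-only sketch of measuring iterations per second. The `iterations_per_second` helper is hypothetical; benchmark-ips additionally does warm-up and statistical analysis, which this sketch omits, so treat its numbers as indicative only.

```ruby
# Count how many times the block runs within roughly `seconds`,
# then return the rate in iterations per second.
def iterations_per_second(seconds = 0.2)
  clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }
  iterations = 0
  started = clock.call
  while clock.call - started < seconds
    yield
    iterations += 1
  end
  iterations / (clock.call - started)
end

results = {
  'Array#join'    => iterations_per_second { ['hello', 1234].join },
  'interpolation' => iterations_per_second { "#{'hello'}#{1234}" },
}

# Print the fastest entry first, with each rate relative to it.
fastest = results.values.max
results.sort_by { |_name, ips| -ips }.each do |name, ips|
  printf("%-14s %12.1f i/s (fastest is %.2fx this rate)\n", name, ips, fastest / ips)
end
```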
  21. What to measure? • Sometimes a problem has a trade-off

    • trade-off between compilation time and rendering time Rendering of haml/test/templates/standard.haml hamlit v2.0.1: 12351.8 i/s (0.081ms) faml v0.7.0: 9713.4 i/s (0.103ms) - 1.27x slower haml v5.0.0.beta.2: 2296.5 i/s (0.435ms) - 5.38x slower
  22. What to measure? • Sometimes a problem has a trade-off

    • trade-off between compilation time and rendering time Compilation of haml/test/templates/standard.haml haml v5.0.0.beta.2: 388.2 i/s (2.576ms) hamlit v2.0.1: 193.7 i/s (5.163ms) - 2.00x slower faml v0.7.0: 188.0 i/s (5.320ms) - 2.07x slower
  23. • What is a template engine? ✤ How to optimize

    Ruby code? • Benchmark ✤ Profiling • Improvement • What did we do for high performance template engines?
  24. Fundamental Rule of Optimisation • Don't guess, measure • It's

    a waste of time to optimize trivial things • The bottleneck may change at any time
  25. stackprof usage in Hamlit repo • To search the entire

    stack to find the bottlenecks in template compilation $ bin/stackprof test/haml/templates/standard.haml ================================== Mode: wall(1) Samples: 8034 (70.35% miss rate) GC: 787 (9.80%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 498 (6.2%) 498 (6.2%) Temple::Mixins::CompiledDispatcher#disp 893 (11.1%) 319 (4.0%) Ripper::Lexer#lex 2999 (37.3%) 237 (2.9%) Hamlit::HTML#dispatcher 2070 (25.8%) 220 (2.7%) Temple::Filters::ControlFlow#dispatcher 4600 (57.3%) 189 (2.4%) Hamlit::Escapable#dispatcher 164 (2.0%) 164 (2.0%) Temple::Mixins::CompiledDispatcher::Dis 174 (2.2%) 160 (2.0%) block in Temple::ImmutableMap#[]
  26. rblineprof usage in Hamlit repo • To find bottlenecks in

    the compiled template code $ bin/lineprof test/haml/templates/standard.haml [Lineprof] ====================================================================== /private/var/folders/my/syd7zn_d495dmjm7_y8lqby80000gp/T/ compiled20151204-39353-9l8fvy | 16 ; _hamlit_compiler1 = ( 1 + 9 + 8 + 2 #numbers should work and this should be ignored; 0.2ms 200 | 17 ; ); _buf << (::Hamlit::Utils.escape_html(((_hamlit_compiler1).to_s))); _buf << ("\n</div>\n<div id='body'> Quotes should be loved! Just like people!</div>\n".freeze); 57.5ms 100 | 18 ; 120.times do |number|; | 19 ; _hamlit_compiler2 = ( number; 31.5ms 24000 | 20 ; ); _buf <<
  27. • What is a template engine? ✤ How to optimize

    Ruby code? • Benchmark • Profiling ✤ Improvement • What did we do for high performance template engines?
  28. How to improve 1. Don't guess, measure (again) • Profiler

    tells you what to optimize • Benchmark tells you which code is faster
  29. How to improve 3. Learn from others • We'll show

    you examples of template engine optimization
  30. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? ✤ Faml side • Hamlit side
  31. Faml • @eagletmt started faml development as a complete replacement

    of haml • High compatibility with improved performance • Basic ideas for high performance: • Follow Slim • Perform optimization at compile time
  32. Slim's Benchmark Compiled benchmark (i/s) 0 20000 40000 60000 80000

    erb slim ugly haml ugly https://travis-ci.org/slim-template/slim/jobs/94130074#L188-L195
  33. Why does Slim perform well? • Slim uses the Temple gem

    as its backend • Temple performs generic optimizations automatically • I decided to use Temple as the backend too • https://github.com/judofyr/temple
  34. • Haml generates naive Ruby code Haml %a{ href: 'http://rubykaigi.org/2015'

    } _hamlout.buffer << "<a#{ _hamlout.attributes( {}, nil, href: 'http://rubykaigi.org/2015' ) }></a>\n";
  35. Slim a href='http://rubykaigi.org/2015' _buf = []; _buf << ("<a href=\"http://rubykaigi.org/2015\"></a>".freeze);

    ; _buf = _buf.join • Slim generates a static string literal at compile time
  36. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? ✤ Faml side ✤ Attribute Optimization • Faster Runtime Attribute Builder • Hamlit side
  37. Static Analysis • Haml should also be compiled into a static

    string literal, like Slim • But a Ruby parser is required to achieve this %a{ href: 'http://rubykaigi.org/2015' } %a{ :href=>'http://rubykaigi.org/2015' } %a{ 'href'=>'http://rubykaigi.org/2015' } %a{ 'href': 'http://rubykaigi.org/2015' }
  38. parser gem • https://github.com/whitequark/parser • Ruby parser, used by RuboCop,

    Transpec, ... • Easy to use • AST with rich source code information
  39. Attribute Optimization • Faml categorizes attributes into 3 types by

    parsing Ruby code • Static • Dynamic • Runtime
  40. Static Attribute • Both key and value are static •

    Fastest • No operations at runtime %a{ href: 'http://rubykaigi.org/2015' } <a href='http://rubykaigi.org/2015'></a>
  41. Dynamic Attribute %a{ href: url } • Key is static,

    but the value is dynamic • Relatively fast • Escape url and concatenate it at runtime <a href='http://rubykaigi.org/2015'></a>
  42. Runtime Attribute • Key and value are dynamic • Slow

    • Build the whole attribute list at runtime %a{ key => url } <a href='http://rubykaigi.org/2015'></a>
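The three categories above can be sketched as a small classifier. This is not Faml's implementation: Faml uses the parser gem, while this sketch uses Ripper from the standard library to stay self-contained, and it only recognizes symbols and plain strings as static. The helper names are hypothetical.

```ruby
require 'ripper'

# Return the single top-level expression node, or nil if `code` is not
# exactly one parsable expression.
def single_expression(code)
  sexp = Ripper.sexp(code)
  return nil if sexp.nil?
  statements = sexp[1]
  statements.size == 1 ? statements.first : nil
end

# True when `code` is a literal whose value is fully known at compile time.
def static_literal?(code)
  node = single_expression(code)
  return false if node.nil?
  case node.first
  when :symbol_literal
    true
  when :string_literal
    # node[1] is [:string_content, *parts]; interpolation appears as
    # :string_embexpr, plain text as :@tstring_content.
    node[1].drop(1).all? { |part| part.first == :@tstring_content }
  else
    false
  end
end

# Classify one attribute from the Ruby source of its key and value.
def classify_attribute(key_code, value_code)
  return :runtime unless static_literal?(key_code)
  static_literal?(value_code) ? :static : :dynamic
end

p classify_attribute(":href", "'http://rubykaigi.org/2015'")
p classify_attribute(":href", "url")
p classify_attribute("key", "url")
```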
  43. Line Numbers • We have to keep line numbers •

    for correct backtrace • (for correct __LINE__ value)
  44. Line Numbers • Attributes written across multiple lines have to be compiled as runtime attributes

    1 %a{ class: 'link', 2 href: url } 1 _buf << ("<a".freeze); _buf << (::Faml::AttributeBuilder.build("'", true, nil, class: 'link', 2 href: url )); _buf << ("></a>\n".freeze); 3 ; _buf = _buf.join
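A small sketch of why generated code must keep the template's line breaks: the backtrace line comes from the position inside the generated code. The file name `example.haml` and the generated snippets are assumptions for illustration.

```ruby
# Generated code that keeps the template's line breaks: the error is
# raised on line 2 of the (hypothetical) template file.
generated = "title = 'RubyKaigi'\nraise 'broken helper'\n"

begin
  eval(generated, binding, 'example.haml', 1)
rescue RuntimeError => e
  location = e.backtrace.first
end

# Joining everything onto one line loses that information: the same
# error is now reported on line 1.
flattened = generated.lines.map(&:chomp).join('; ')

begin
  eval(flattened, binding, 'example.haml', 1)
rescue RuntimeError => e
  flattened_location = e.backtrace.first
end

puts location
puts flattened_location
```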
  45. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? ✤ Faml side • Attribute Optimization ✤ Faster Runtime Attribute Builder • Hamlit side
  46. C extension • C is faster than Ruby! • If

    performance is really important, writing a C extension is a good choice.
  47. C extension • I wrote the runtime attribute builder in C++

    • Ruby version (before v0.1.0) • 41889.8 i/s • C++ version (v0.7.1) • 90168.6 i/s
  48. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? • Faml side ✤ Hamlit side
  49. Hamlit • Designed to defeat Slim • I've heard many

    people say they are “migrating from Haml to Slim because it's faster.” • Hamlit means “Haml it” (write it with Haml)
  50. Slim's compiled benchmark with HTML-escaping (i/s) 0 35000 70000 105000

    140000 Hamlit Faml Slim Haml https://travis-ci.org/k0kubun/hamlit/jobs/93928561#L247-L251 Hamlit is faster than Slim
  51. Hamlit’s strategy • Reduce string allocation and concatenation by: •

    compiling string interpolation • dropping unused behaviors
  52. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? • Faml side ✤ Hamlit side ✤ Compiling string interpolation • Dropping unused behaviors
  53. How to compile template? • We should care about: 1.

    String allocation 2. String concatenation
  54. 1. String allocation • Use frozen string literals • Thanks

    to Temple::Generator, static strings are frozen automatically! • Slim, Faml and Hamlit all rely on this
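A minimal sketch of what freezing buys: on CRuby 2.1 and later, as far as I know, calling `.freeze` directly on a string literal is resolved at compile time, so every evaluation returns the same object instead of allocating a new String. The method names are hypothetical.

```ruby
# A literal with .freeze: the same String object is reused on every call.
def frozen_literal
  "<h1>It works!</h1>\n".freeze
end

# A plain literal: each call normally allocates a new String (unless the
# file opts into frozen string literals).
def plain_literal
  "<h1>It works!</h1>\n"
end

same_object = frozen_literal.equal?(frozen_literal)
puts "frozen literal reused: #{same_object}"
puts "plain literal reused:  #{plain_literal.equal?(plain_literal)}"
```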
  55. 2. String concatenation • String interpolation is fast Benchmark.ips do

    |x| x.report("Array#join") { ['hello', 1234].join } x.report("interpolation") { "#{'hello'}#{1234}" } x.compare! end
  56. 2. String concatenation • String interpolation is fast $ ruby

    bench.rb Comparison: interpolation: 1115751.8 i/s Array#join: 507283.5 i/s - 2.20x slower
  57. How should we compile interpolated String? • Suppose that you

    are the Ruby interpreter: what code would you prefer to run? - year = 2015 %a{ href: "http://rubykaigi.org/#{year}" } RubyKaigi #{year}
  58. How should we compile interpolated String? year = 2015 _hamlout.buffer

    << "<a#{_hamlout.attributes({}, nil, href: "http://rubykaigi.org/#{year}" )}>#{ "RubyKaigi #{Haml::Helpers.html_escape((year))}" }</a>\n"; Haml - year = 2015 %a{ href: "http://rubykaigi.org/#{year}" } RubyKaigi #{year}
  60. How should we compile interpolated String? _buf = []; year

    = 2015; ; _buf << ("<a".freeze); _faml_html1 = ("http://rubykaigi.org/#{year}"); case (_faml_html1); when true; _buf << (" href".freeze); when false, nil; else; _buf << (" href='".freeze); _buf << (::Temple::Utils.escape_html((_faml_html1))); _buf << ("'".freeze); end; _buf << (">RubyKaigi ".freeze); _buf << (::Temple::Utils.escape_html((year))); _buf << ("</a>\n".freeze); ; _buf = _buf.join Faml - year = 2015 %a{ href: "http://rubykaigi.org/#{year}" } RubyKaigi #{year}
  62. How should we compile interpolated String? _buf = []; year

    = 2015; ; _buf << ("<a href='http://rubykaigi.org/".freeze); _buf << (::Hamlit::Utils.escape_html((year))); _buf << ("'>RubyKaigi ".freeze); _buf << (::Hamlit::Utils.escape_html((year))); ; _buf << ("</a>\n".freeze); _buf = _buf.join Hamlit - year = 2015 %a{ href: "http://rubykaigi.org/#{year}" } RubyKaigi #{year}
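The idea behind the Hamlit output above (split an interpolated string into static text and dynamic expressions at compile time, so only the dynamic parts are escaped at runtime) can be sketched with Ripper. This is an illustration under simplifying assumptions, not Hamlit's actual compiler: the helper names are hypothetical, `ERB::Util.html_escape` stands in for `Hamlit::Utils.escape_html`, and escape sequences inside the literal are not handled.

```ruby
require 'erb'
require 'ripper'

# Split a double-quoted string literal into [:static, text] and
# [:dynamic, ruby_code] parts by walking the lexer tokens.
def split_interpolation(literal)
  parts = []
  depth = 0
  code = +''
  Ripper.lex(literal).each do |_pos, type, token|
    if depth.zero?
      case type
      when :on_tstring_content
        parts << [:static, token]
      when :on_embexpr_beg
        depth = 1
        code = +''
      end
    elsif type == :on_embexpr_beg
      depth += 1
      code << token
    elsif type == :on_embexpr_end
      depth -= 1
      if depth.zero?
        parts << [:dynamic, code]
      else
        code << token
      end
    else
      code << token
    end
  end
  parts
end

# Emit buffer-append code: frozen literals for static parts, an escape
# call for dynamic parts.
def compile_interpolation(literal)
  appends = split_interpolation(literal).map do |kind, text|
    if kind == :static
      "_buf << #{text.dump}.freeze"
    else
      "_buf << ::ERB::Util.html_escape((#{text}))"
    end
  end
  (['_buf = []'] + appends + ['_buf.join']).join('; ')
end

year = 2015
code = compile_interpolation('"http://rubykaigi.org/#{year}"')
puts code
puts eval(code, binding)
```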
  64. How should we compile interpolated String? Comparison: hamlit v2.0.1: 301640.2

    i/s faml v0.7.1: 199001.5 i/s - 1.52x slower haml v5.0.0.beta.2: 14714.4 i/s - 20.50x slower
  65. Tips to write faster code • Don't allocate strings •

    Reduce string concatenation • The fastest way to concatenate strings is not to concatenate them at all
  66. • What is a template engine? • How to optimize

    Ruby code? ✤ What did we do for high performance template engines? • Faml side ✤ Hamlit side • Compiling string interpolation ✤ Dropping unused behaviors
  67. Dropping unused behavior • Since Haml and Slim have rich

    syntax and behaviors in attributes, rendering attributes is a bottleneck • In other words, it is a chance for optimization
  68. Dropping unused behavior • To optimize attribute rendering, Faml and

    Hamlit drop some unused behavior • Let's see how they are different!
  69. Dropped features in Hamlit • Hamlit supports the following features only for

    a limited set of attributes • Data attribute hyphenation • Boolean attribute
  70. Data attribute hyphenation • In Haml, a nested Hash is expanded

    with hyphens for all attributes <div foo-bar='baz'></div> <div data-bar='baz'></div> %div{ foo: { bar: 'baz' } } %div{ data: { bar: 'baz' } } Haml
  71. Data attribute hyphenation • In Faml and Hamlit, hyphenation is supported

    only for the data attribute %div{ foo: { bar: 'baz' } } %div{ data: { bar: 'baz' } } Haml: <div foo-bar='baz'></div> <div data-bar='baz'></div> Faml, Hamlit: <div foo='{:bar=&gt;&quot;baz&quot;}'></div> <div data-bar='baz'></div>
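The difference between the two behaviors can be sketched as follows. Both helpers are hypothetical and return key/value pairs rather than HTML; they only illustrate which attributes get the nested-Hash expansion.

```ruby
# Haml-style: expand nested Hashes with hyphens, for any attribute.
def hyphenate(key, value)
  return [[key.to_s, value]] unless value.is_a?(Hash)
  value.flat_map { |k, v| hyphenate("#{key}-#{k}", v) }
end

# Faml/Hamlit-style: only the data attribute is expanded; every other
# attribute keeps its value as-is, which skips the Hash check and recursion.
def expand_data_only(attributes)
  attributes.flat_map do |key, value|
    key.to_s == 'data' ? hyphenate(key, value) : [[key.to_s, value]]
  end
end

attributes = { foo: { bar: 'baz' }, data: { bar: 'baz' } }
p attributes.flat_map { |key, value| hyphenate(key, value) }
p expand_data_only(attributes)
```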
  72. Data attribute hyphenation • Hyphenating data attributes is expensive •

    So we dropped it for non-data attributes in order to generate faster code
  73. Data attribute hyphenation • No code to hyphenate Hash

    ; _buf << ("<input".freeze); case ((_hamlit_compiler1 = (disabled))); when true; _buf << (" disabled".freeze); when false, nil; else; _buf << (" disabled='".freeze); _buf << (::Hamlit::Utils.escape_html((_hamlit_compiler1))); _buf << ("'".freeze); end ; _buf << (">\n".freeze);
  74. Data attribute hyphenation https://travis-ci.org/k0kubun/hamlit/jobs/96207038#L257-L260 - disabled = false %input{ disabled:

    disabled } - disabled = true %input{ disabled: disabled } Comparison: hamlit v2.0.1: 819212.4 i/s (0.001ms) faml v0.7.1: 614993.4 i/s (0.002ms) - 1.33x slower haml v5.0.0.beta.2: 15073.2 i/s (0.066ms) - 54.35x slower • Benchmark for non-data attribute
  75. Boolean support • Only with Hamlit, non-boolean attributes are not

    deleted by falsey values (nil, false) Haml, Faml Hamlit %a{ href: false } %a{ disabled: false } <a></a> <a></a> <a href=''></a> <a></a>
  76. Boolean support • Only with Hamlit, non-boolean attributes are not

    deleted by falsey values (nil, false) • This means that Hamlit doesn't need to check and concatenate the value at runtime for non-boolean attributes
  77. Boolean support _buf = []; url = 'http://rubykaigi.org/2015'; ; _buf

    << ("<a".freeze); _faml_html1 = (url); case (_faml_html1); when true; _buf << (" href".freeze); when false, nil; else; _buf << (" href='".freeze); _buf << (::Temple::Utils.escape_html((_faml_html1))); _buf << ("'".freeze); end; _buf << ("></a>\n".freeze); • Faml compilation for non-boolean attribute
  78. Boolean support _buf = []; url = 'http://rubykaigi.org/2015'; ; _buf

    << ("<a href='".freeze); _buf << (::Hamlit::Utils.escape_html((url))); _buf << ("'></a>\n".freeze); _buf = _buf.join • Hamlit compilation for non-boolean attribute
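The two compilation strategies shown on the slides above can be restated as plain methods. This is a sketch, not either engine's code: the method names are hypothetical and `ERB::Util.html_escape` stands in for the engines' own escape helpers.

```ruby
require 'erb'

# Strategy in the Faml output: every dynamic value is checked at runtime,
# so true renders a bare attribute and false/nil removes it.
def attribute_with_boolean_check(name, value)
  case value
  when true then " #{name}"
  when false, nil then ''
  else " #{name}='#{ERB::Util.html_escape(value)}'"
  end
end

# Strategy in the Hamlit output for non-boolean attributes: no check, so
# the static text around the value can be merged into neighboring literals.
def attribute_without_check(name, value)
  " #{name}='#{ERB::Util.html_escape(value)}'"
end

url = 'http://rubykaigi.org/2015'
puts "<a#{attribute_with_boolean_check('href', url)}></a>"
puts "<a#{attribute_without_check('href', url)}></a>"
puts "<input#{attribute_with_boolean_check('disabled', true)}>"
```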
  79. Comparison: hamlit v2.0.1: 407851.9 i/s (0.002ms) faml v0.7.1: 223612.4 i/s

    (0.004ms) - 1.82x slower haml v5.0.0.beta.2: 21823.1 i/s (0.046ms) - 18.69x slower Boolean support
  80. But does it really work? • Also in Rails tag

    helpers, false is not deleted for non-boolean attributes = content_tag :input, '', value: false <input value='false'></input>
  81. But does it really work? • It could pass 20,000+ tests in the world's largest

    Rails application! https://speakerdeck.com/a_matsuda/the-recipe-for-the-worlds-largest-rails-monolith
  82. Why is Hamlit the fastest? • Faml and Slim have

    boolean support for all attributes • So Hamlit is faster for non-boolean attributes • Give up trivial things to make things better!
  83. Comparison of Haml engines • Haml • Slow and rarely

    maintained now • I sent a patch to replace the backend, but it was not merged • Faml • Fast and highly compatible • Hamlit • Fastest and slightly incompatible
  84. Conclusion • How to improve performance • Benchmark, Profiling, Improvement

    • Real examples of improvements • Faml and Hamlit • Try our faster Haml engines!