Ruby as a Glue Language

jeg2
September 07, 2007

An attempt to defend the merits of "shelling out" in Ruby code.

Transcript

  1. Ruby as a
    Glue Language
    Claiming Your Super Powers

  6. James Edward
    Gray II
    ‣I run the Ruby Quiz
    ‣I wrote some open source libraries
    ‣FasterCSV
    ‣HighLine
    ‣I authored a couple of Pragmatic books
    with Ruby in them
    ‣I maintain the Ruby bundle for TextMate

  9. What is Heroes?
    ‣A weekly TV
    show on NBC
    ‣The premise is
    that a few
    ordinary people
    realize they
    have super
    powers

  12. Good Programmers
    are Heroes
    ‣They are seemingly ordinary people
    ‣They constantly do what seems impossible
    ‣They use their super powers

  15. Ruby Makes A
    Great Sidekick
    ‣Ruby has many
    powers of her own
    ‣Including the much
    desired power to
    borrow the powers
    of others

  16. Ruby Glue
    Good or bad?

  19. Glue Languages
    ‣A design goal of Perl was to make it a
    good “glue language”
    ‣Glue languages are used to join a set of
    external tools together to get work done
    ‣Ruby copied this Perlism

  22. Evil Experts
    ‣Multiple books warn
    programmers away
    from glue features
    ‣Experts claim
    ‣Using these features
    hurts portability
    ‣Using these features
    adds failure points

  26. I Have a
    Super Power
    ‣I’m immune to the word “can’t”
    ‣We, as an industry, sometimes struggle
    with that word
    ‣MJD once said: Programming is a young
    field and when alchemy was as young as
    we are now, they were still trying to
    turn lead into gold

  28. My Opinion of the
    Expert Advice
    BS

  31. We May not Need/
    Want Portability
    ‣If we know where the code will run,
    there’s no problem
    ‣TextMate uses Mac OS X glue code
    ‣Rails applications deployed to a company
    server have a known platform
    ‣We may be accessing platform-specific
    features like AppleScript, Spotlight, or
    Plist APIs

  34. Libraries Fail Too
    ‣C extensions can have non-trivial or
    non-portable installs
    ‣Dependencies make this even worse
    ‣Libraries throw errors you must handle
    as well

  37. Counter Argument:
    It’s Fast!
    ‣At work, I investigated options for an
    HTML to PDF conversion job
    ‣The Good Way: PDF::Writer
    ‣The Evil Way: wrap `html2ps | ps2pdf`
    ‣I gave each approach three hours of my
    time
    ‣I estimated PDF::Writer would take weeks
    ‣I basically finished the job with glue code

  38. Shelling Out
    Using Backticks

  42. Example:
    A Unique ID
    ‣A common need
    ‣Asked a lot on Ruby
    Talk
    ‣The last thread
    included ideas from a
    lot of smart people
    ‣There are multiple
    libraries for this

  44. A UUID from
    Glue Code
    id = `uuidgen`

  48. Alternate Syntax
    ‣Use this syntax when
    you need backticks in
    your command
    ‣Any non-alphanumeric symbol can be a
    delimiter
    ‣You can also use the
    matching pairs: (…),
    […], {…}, and <…>
    ‣These nest properly
    id = %x{uuidgen}
    id = %x@uuidgen@
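
The delimiter rules above can be tried with a harmless command. This is only a sketch: `echo` stands in for `uuidgen` so the output is predictable, and the method name `delimiter_demo` is made up for the illustration.

```ruby
# Every form below hands the same kind of command to the shell and
# returns whatever the command wrote to STDOUT.
def delimiter_demo
  {
    backticks: `echo hello`,
    braces:    %x{echo hello},
    at_signs:  %x@echo hello@,
    # Paired delimiters nest, so balanced braces inside the command are fine.
    nested:    %x{echo {nested}},
    # Literal backticks inside %x{} reach the shell untouched, so the
    # shell performs the inner command substitution itself.
    inner:     %x{echo `echo from the shell`}
  }
end

delimiter_demo.each { |name, output| puts "#{name}: #{output}" }
```

Note that every result keeps the trailing newline the command printed, so a real ID would probably want a `chomp`.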

  49. No Output
    Needed
    Using system()

  53. Example:
    The Pasteboard
    ‣I want to put a search string on OS X’s
    find “pasteboard” (clipboard)
    ‣I don’t need any output for this
    operation
    ‣I just need to know if the operation
    succeeded
    ‣A simple true or false will do

  55. Ran or Didn’t Run
    if system "pbcopy -pboard find <<< 'New Search String'"
      puts "Search string set."
    else
      puts "Could not set search string."
    end
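
A "simple true or false" hides one wrinkle worth knowing. On Ruby 1.9 and later, `system()` returns `nil` rather than `false` when the command could not be started at all. The sketch below assumes a Unix-like machine with `true` and `false` on the path; `run_status` is a hypothetical helper, not part of the talk.

```ruby
# Classify the result of Kernel#system for a command given as separate words.
def run_status(*command)
  case system(*command)
  when true  then :succeeded      # the command exited with status zero
  when false then :failed         # the command ran but exited non-zero
  else            :could_not_run  # nil: the command could not be executed
  end
end

p run_status("true")
p run_status("false")
p run_status("surely-no-such-command-here")
```

Since both `false` and `nil` are falsy, the `if`/`else` on the slide still behaves correctly. The distinction only matters when a missing program should be reported differently from a failed one.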

  59. Shell Expansion
    ENV["MY_VAR"] = "Set from Ruby"

    system "echo $MY_VAR"
    # >> Set from Ruby

    system "echo", "$MY_VAR"
    # >> $MY_VAR
    ‣A single argument
    goes through shell
    expansion
    ‣File glob patterns
    ‣Environment variables
    ‣Multiple arguments
    are passed without
    going through
    expansion
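
The same split applies when the output is needed. As a sketch, assuming Ruby 1.9 or later (where `IO.popen` accepts an array of words), backticks always see a single string, while the array form skips the shell. Both helper names are invented for this example.

```ruby
ENV["MY_VAR"] = "Set from Ruby"

# One string: the shell sees the text and expands $MY_VAR before echo runs.
def echo_with_shell(text)
  `echo #{text}`
end

# Array form: echo receives the text exactly as given, no shell involved.
def echo_without_shell(text)
  IO.popen(["echo", text]) { |pipe| pipe.read }
end

puts echo_with_shell("$MY_VAR")
puts echo_without_shell("$MY_VAR")
```

The array form is the safer default whenever the text comes from a user, because nothing in it can be interpreted as shell syntax.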

  60. Handling
    Errors
    Mind the Expert Warnings

  64. When Trouble
    Strikes
    ‣Remember to handle
    STDERR
    ‣Check process exit
    status
    ‣Use popen3() when
    things get
    complicated
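
Those three points can be covered in one small wrapper. This is a sketch under assumptions: it uses `Open3.capture3` from the standard library (Ruby 1.9.2 and later) rather than `popen3()` directly, it relies on `sh` being available, and the name `run_or_raise` is hypothetical.

```ruby
require "open3"

# Run a command without a shell. Return its STDOUT on success, or raise
# with the exit status and STDERR text attached on failure.
def run_or_raise(*command)
  stdout, stderr, status = Open3.capture3(*command)
  unless status.success?
    raise "#{command.first} exited with #{status.exitstatus}: #{stderr.strip}"
  end
  stdout
end

puts run_or_raise("sh", "-c", "echo fine")

begin
  run_or_raise("sh", "-c", "echo broken >&2; exit 3")
rescue RuntimeError => error
  puts "Caught: #{error.message}"
end
```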

  68. Example:
    Backups
    ‣I want to back up a directory as part of
    a larger automation
    ‣The rsync program can do what I need
    ‣I need to watch for problems and
    handle them gracefully
    ‣Possibly emailing a warning to the user

  69. STDERR,
    The Problem Child

  76. Taming STDERR
    dir = ARGV.shift or
      abort "USAGE: #{File.basename($PROGRAM_NAME)} DIR"

    # 2>&1 folds STDERR into the captured output
    results = `rsync -av --exclude '*.DS_Store' #{dir} #{dir}_backup 2>&1`
    if $?.success?  # require "English"; $CHILD_STATUS.success?
      # String is not Enumerable in Ruby 1.9+, so grep over its lines
      puts results.lines.grep(/\A#{Regexp.escape(dir)}/)
    else
      puts "Error: Couldn't back up #{dir}"
      # …
    end

  81. Proper Shell
    Escaping
    # escape text to make it useable in a shell script as
    # one “word” (string)
    def escape_for_shell(str)
      str.to_s.gsub( /(?=[^a-zA-Z0-9_.\/\-\x7F-\xFF\n])/, '\\' ).
               gsub( /\n/, "'\n'" ).
               sub( /^$/, "''" )
    end
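
On current Ruby versions the standard library offers `Shellwords.escape`, which does the same job of turning arbitrary text into one shell "word". The round trip below is a sketch; `shell_echo` is a made-up helper and `echo` is only there to show that the shell hands the text back unchanged.

```ruby
require "shellwords"

# Escape the text, pass it through the shell, and return what echo printed.
def shell_echo(text)
  `echo #{Shellwords.escape(text)}`
end

# A space, a quote, and a command separator, all defused by the escaping.
tricky = "my file's; name.txt"
puts Shellwords.escape(tricky)
puts shell_echo(tricky)
```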

  86. Tips for
    Avoiding Errors
    ‣Use full paths to programs and files
    whenever possible
    ‣Send data to STDIN when you can
    ‣If you can’t send it to STDIN, dump the
    data to a Tempfile and send that path
    ‣Remember to shell escape any command-
    line arguments that could contain
    dangerous characters (even spaces)
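
The Tempfile and escaping tips combine naturally. In this sketch, which assumes Ruby 2.1 or later for `Tempfile.create` and a Unix `wc` on the path, untrusted text never appears on the command line; only an escaped file path does. The helper name `word_count` is hypothetical, and a production script would also spell out the full path to `wc`.

```ruby
require "tempfile"
require "shellwords"

# Count the words in arbitrary text by handing wc a file path.
def word_count(text)
  Tempfile.create("glue") do |file|
    file << text
    file.flush  # make sure the data is on disk before wc reads it
    # wc prints "<count> <path>"; to_i keeps just the leading number.
    `wc -w #{Shellwords.escape(file.path)}`.to_i
  end
end

puts word_count("one two three\n")
puts word_count("it's $HOME; rm -rf")
```

The second call shows the point of the exercise: text that would be dangerous as an argument is just data once it lives in a file.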

  87. Full Control
    Using popen(), popen3(),
    and popen4()

  90. Managing Streams
    ‣Use popen() to
    manage STDIN and
    STDOUT
    ‣Use popen3() to
    manage STDIN,
    STDOUT, and
    STDERR
    ‣Use popen4() if you also
    need the PID

  94. Example:
    Formatting Prose
    ‣I want to rewrap some prose provided by
    the user
    ‣Command-line arguments are not
    appropriate here
    ‣Complex shell escaping
    ‣Size limit
    ‣I need to send the prose to fmt via
    STDIN

  96. Reading and
    Writing
    prose = <<END_PROSE
    Lorem ipsum dolor sit amet, consectetur adipisicing elit,
    sed do eiusmod tempor incididunt ut labore et dolore magna
    aliqua. Ut enim ad minim veniam, quis nostrud exercitation
    ullamco laboris nisi ut aliquip ex ea commodo consequat.
    Duis aute irure dolor in reprehenderit in voluptate velit
    esse cillum dolore eu fugiat nulla pariatur. Excepteur
    sint occaecat cupidatat non proident, sunt in culpa qui
    officia deserunt mollit anim id est laborum.
    END_PROSE

    formatted = IO.popen("fmt -w 30", "r+") do |pipe|
      # open("| fmt -w 30", "r+") do |pipe|
      pipe << prose
      pipe.close_write
      pipe.read
    end

  105. Example:
    A Ruby Session
    ‣I want to run some Ruby code
    ‣I don’t want that code to affect my
    current Ruby process
    ‣I may also need to do some special
    setup, hacking Ruby’s core, before this
    code is run
    ‣I need to format STDOUT and STDERR
    differently

  107. With Error
    Handling
    require "open3"

    Open3.popen3("ruby") do |stdin, stdout, stderr|
      stdin << %Q{puts "I am a puppet."; oops!()}
      stdin.close
      puts "Output:"
      puts stdout.read
      puts "Errors:"
      puts stderr.read
    end

  113. When you Also
    Need a PID
    ‣Install the POpen4 gem
    ‣Unix version
    ‣Windows versions
    ‣popen4() works like popen3() but it also
    passes you the PID for the child
    process
    ‣The PID is useful for sending the child process
    signals, possibly to kill the process
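
On Ruby 1.9 and later, the standard library's `Open3.popen3` passes a fourth block argument, a thread that waits on the child, and that thread exposes the PID. The sketch below uses it in place of the POpen4 gem to signal a child process. It assumes a Unix `sleep` command, and `start_and_stop` is a made-up name.

```ruby
require "open3"

# Start a long-running child, signal it by PID, and report how it ended.
def start_and_stop(*command)
  Open3.popen3(*command) do |stdin, stdout, stderr, wait_thread|
    pid = wait_thread.pid
    Process.kill("TERM", pid)   # ask the child to terminate
    status = wait_thread.value  # Process::Status, once the child has exited
    { pid: pid, signaled: status.signaled?, signal: status.termsig }
  end
end

p start_and_stop("sleep", "30")
```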

  114. “If it’s on the
    Web, it has
    an API.”
    James Britt
    Using Web Tools

  120. Don’t Forget
    the Web
    If you need to…      Use the tool…
    Read Content         open-uri
    Write Form Data      Net::HTTP
    Emulate a Browser    Mechanize
    Scrape HTML          Hpricot

  124. Example:
    Tracking Ruby
    ‣I want to download
    the latest version
    of Ruby as part of a
    larger automation
    ‣I want to verify the
    contents of the
    download
    ‣I want to expand the
    compressed archive

  126. Simple Scraping
    require "open-uri"
    require "digest/md5"

    require "rubygems"
    require "hpricot"

    dl  = Hpricot(open("http://www.ruby-lang.org/en/downloads/"))
    li  = (dl / "div#content" / "ul" / "li").first
    url = (li / "a").first.attributes["href"]
    md5 = li.inner_html[/md5:.+?([A-Za-z0-9]{32})/, 1]

    rb = open(url) { |ftp| ftp.read }
    if Digest::MD5.hexdigest(rb) == md5
      IO.popen("tar xvz", "wb") { |tar| tar << rb }
    else
      abort "Corrupt download"
    end

  132. Use Caution
    ‣These scraping techniques see wider use
    than talking to external processes
    ‣Ironically, they really do seem to be
    more fragile
    ‣Tips for managing scraping code:
    ‣Abstract out the scraping code
    ‣Use more aggressive error handling
    ‣Make sure the maintenance is worth it
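
The first two tips might look like this in practice. It is a minimal sketch: the regular expression is the one from the scraping slide, while `ScrapeError` and `extract_md5` are names invented here, and the HTML strings are stand-ins for a real page.

```ruby
# Raised when the page no longer looks the way the scraper expects.
class ScrapeError < StandardError; end

# All knowledge of the page layout lives in this one method, so a site
# redesign means one fix in one place.
def extract_md5(html)
  html[/md5:.+?([A-Za-z0-9]{32})/, 1] or
    raise ScrapeError, "no MD5 checksum found; has the page layout changed?"
end

puts extract_md5("<li>md5: #{'a' * 32}</li>")

begin
  extract_md5("<li>nothing useful here</li>")
rescue ScrapeError => error
  puts "Scrape failed: #{error.message}"
end
```

Failing loudly with a dedicated error class beats quietly carrying a `nil` checksum into the comparison further down.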

  133. Summary
    Remain Strong

  136. Pop Quiz
    Out of the box, can Ruby…
    Apply a difference algorithm to the
    contents of two Strings?
    Efficiently read a file
    line by line in reverse?

  138. YES!
    ‣Don’t be afraid
    to use your
    powers
    ‣You will
    literally be able
    to accomplish
    anything

  139. String Diff
    require "tempfile"

    class String
      def diff(other)
        st = Tempfile.new("diff_self")
        ot = Tempfile.new("diff_other")
        st << self
        ot << other
        [st, ot].each { |t| t.flush }
        `diff -u #{st.path} #{ot.path}`[/^@.+\z/m]
      end
    end

    puts "one\ntwo\n".diff("one\nthree\n")
    # >> @@ -1,2 +1,2 @@
    # >>  one
    # >> -two
    # >> +three

  141. Reading
    Backwards
    unless ARGV.size == 1 and File.exist? ARGV.first
      abort "Usage: #{File.basename($PROGRAM_NAME)} FILE"
    end

    last_five_lines = Array.new

    IO.popen("tail -r #{ARGV.shift}") do |tail|
      tail.each do |line|
        last_five_lines << line
        break if last_five_lines.size == 5
      end
    end
    last_five_lines.reverse!

    puts last_five_lines
