Ruby as a Glue Language

jeg2
September 07, 2007

An attempt to defend the merits of "shelling out" in Ruby code.

Transcript

  1. Ruby as a
    Glue Language
    Claiming Your Super Powers

    View Slide

  2. James Edward
    Gray II

    View Slide

  3. James Edward
    Gray II
    ‣I run the Ruby Quiz

    View Slide

  4. James Edward
    Gray II
    ‣I run the Ruby Quiz
    ‣I wrote some open source libraries
    ‣FasterCSV
    ‣HighLine

    View Slide

  5. James Edward
    Gray II
    ‣I run the Ruby Quiz
    ‣I wrote some open source libraries
    ‣FasterCSV
    ‣HighLine
    ‣I authored a couple of Pragmatic books
    with Ruby in them

    View Slide

  6. James Edward
    Gray II
    ‣I run the Ruby Quiz
    ‣I wrote some open source libraries
    ‣FasterCSV
    ‣HighLine
    ‣I authored a couple of Pragmatic books
    with Ruby in them
    ‣I maintain the Ruby bundle for TextMate

    View Slide

  7. View Slide

  8. View Slide

  9. What is Heroes?

    View Slide

  10. What is Heroes?
    ‣A weekly TV
    show on NBC

    View Slide

  11. What is Heroes?
    ‣A weekly TV
    show on NBC
    ‣The premise is
    that a few
    ordinary people
    realize they
    have super
    powers

    View Slide

  12. Good Programmers
    are Heroes

    View Slide

  13. Good Programmers
    are Heroes
    ‣They are seemingly ordinary people

    View Slide

  14. Good Programmers
    are Heroes
    ‣They are seemingly ordinary people
    ‣They constantly do what seems impossible
    ‣They use their super powers

    View Slide

  15. Ruby Makes A
    Great Sidekick

    View Slide

  16. Ruby Makes A
    Great Sidekick
    ‣Ruby has many
    powers of her own

    View Slide

  17. Ruby Makes A
    Great Sidekick
    ‣Ruby has many
    powers of her own
    ‣Including the much
    desired power to
    borrow the powers
    of others

    View Slide

  18. Ruby Glue
    Good or bad?

    View Slide

  19. Glue Languages

    View Slide

  20. Glue Languages
    ‣A design goal of Perl was to make it a
    good “glue language”
    ‣Glue languages are used to join a set of
    external tools together to get work done

    View Slide

  21. Glue Languages
    ‣A design goal of Perl was to make it a
    good “glue language”
    ‣Glue languages are used to join a set of
    external tools together to get work done
    ‣Ruby copied this Perlism

    View Slide

  22. Evil Experts

    View Slide

  23. Evil Experts
    ‣Multiple books warn
    programmers away
    from glue features

    View Slide

  24. Evil Experts
    ‣Multiple books warn
    programmers away
    from glue features
    ‣Experts claim
    ‣Using these features
    hurts portability
    ‣Using these features
    adds failure points

    View Slide

  25. I Have a
    Super Power

    View Slide

  26. I Have a
    Super Power
    ‣I’m immune to the word “can’t”

    View Slide

  27. I Have a
    Super Power
    ‣I’m immune to the word “can’t”
    ‣We, as an industry, sometimes struggle
    with that word

    View Slide

  28. I Have a
    Super Power
    ‣I’m immune to the word “can’t”
    ‣We, as an industry, sometimes struggle
    with that word
    ‣MJD once said: Programming is a young
    field and when alchemy was as young as
    we are now, they were still trying to
    turn lead into gold

    View Slide

  29. My Opinion of the
    Expert Advice

    View Slide

  30. BS
    My Opinion of the
    Expert Advice

    View Slide

  31. We May not Need/
    Want Portability

    View Slide

  32. We May not Need/
    Want Portability
    ‣If we know where the code will run,
    there’s no problem
    ‣TextMate uses Mac OS X glue code
    ‣Rails applications deployed to a company
    server have a known platform

    View Slide

  33. We May not Need/
    Want Portability
    ‣If we know where the code will run,
    there’s no problem
    ‣TextMate uses Mac OS X glue code
    ‣Rails applications deployed to a company
    server have a known platform
    ‣We may be accessing platform specific
    features like AppleScript, Spotlight, or
    Plist APIs

    View Slide

  34. Libraries Fail Too

    View Slide

  35. Libraries Fail Too
    ‣C extensions can have non-trivial or
    non-portable installs
    ‣Dependencies make this even worse

    View Slide

  36. Libraries Fail Too
    ‣C extensions can have non-trivial or
    non-portable installs
    ‣Dependencies make this even worse
    ‣Libraries throw errors you must handle
    as well

    View Slide

  37. Counter Argument:
    It’s Fast!

    View Slide

  38. Counter Argument:
    It’s Fast!
    ‣At work, I investigated options for an
    HTML to PDF conversion job
    ‣The Good Way: PDF::Writer
    ‣The Evil Way: wrap `html2ps | ps2pdf`

    View Slide

  39. Counter Argument:
    It’s Fast!
    ‣At work, I investigated options for an
    HTML to PDF conversion job
    ‣The Good Way: PDF::Writer
    ‣The Evil Way: wrap `html2ps | ps2pdf`
    ‣I gave each approach three hours of my
    time
    ‣I estimated PDF::Writer would take weeks
    ‣I basically finished the job with glue code

    View Slide

  40. Shelling Out
    Using Backticks

    View Slide

  41. Example:
    A Unique ID

    View Slide

  42. Example:
    A Unique ID
    ‣A common need

    View Slide

  43. Example:
    A Unique ID
    ‣A common need
    ‣Asked a lot on Ruby
    Talk
    ‣The last thread
    included ideas from a
    lot of smart people

    View Slide

  44. Example:
    A Unique ID
    ‣A common need
    ‣Asked a lot on Ruby
    Talk
    ‣The last thread
    included ideas from a
    lot of smart people
    ‣There are multiple
    Libraries for this

    View Slide

  45. A UUID from
    Glue Code

    View Slide

  46. A UUID from
    Glue Code
    id = `uuidgen`

    View Slide
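
A minimal sketch of the backtick call above, assuming only that the command prints a single line. Backticks return the command's STDOUT as a String, trailing newline included, so most callers will want `chomp`. The helper name `capture` is hypothetical, and `echo` stands in for `uuidgen` so the example runs on machines without it.

```ruby
# Hypothetical helper: run a command through backticks and strip the
# trailing newline that nearly every command-line tool emits.
def capture(command)
  output = `#{command}`   # STDOUT of the child process, as a String
  output.chomp
end

id = capture("echo 1234-abcd")   # swap in "uuidgen" where it is installed
puts id
```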

  47. Alternate Syntax

    View Slide

  48. Alternate Syntax
    id = %x{uuidgen}
    id = %x@uuidgen@

    View Slide

  49. Alternate Syntax
    ‣Use this syntax when
    you need backticks in
    your command
    ‣Any symbol can be a
    delimiter
    id = %x{uuidgen}
    id = %x@uuidgen@

    View Slide

  50. Alternate Syntax
    ‣Use this syntax when
    you need backticks in
    your command
    ‣Any symbol can be a
    delimiter
    ‣You can also use the
    matching pairs: (…),
    […], {…}, and <…>
    ‣These nest properly
    id = %x{uuidgen}
    id = %x@uuidgen@

    View Slide
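
The spellings above can be sketched side by side. This is an illustration only: the method names are hypothetical, and `echo` stands in for whatever command you actually need.

```ruby
# All three methods shell out to the same command and return its STDOUT.
def with_backticks
  `echo hello`
end

def with_percent_x
  %x{echo hello}
end

def with_at_signs
  %x@echo hello@
end

# Paired delimiters nest, so braces inside the command need no escaping,
# and a literal backtick is fine because it no longer ends the command.
def with_nested_braces
  %x{echo {nested} 'back`tick'}
end

puts with_nested_braces
```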

  51. No Output
    Needed
    Using system()

    View Slide

  52. Example:
    The Pasteboard

    View Slide

  53. Example:
    The Pasteboard
    ‣I want to put a search string on OS X’s
    find “pasteboard” (clipboard)

    View Slide

  54. Example:
    The Pasteboard
    ‣I want to put a search string on OS X’s
    find “pasteboard” (clipboard)
    ‣I don’t need any output for this
    operation

    View Slide

  55. Example:
    The Pasteboard
    ‣I want to put a search string on OS X’s
    find “pasteboard” (clipboard)
    ‣I don’t need any output for this
    operation
    ‣I just need to know if the operation
    succeeded
    ‣A simple true or false will do

    View Slide

  56. Ran or Didn’t Run

    View Slide

  57. Ran or Didn’t Run
    if system "pbcopy -pboard find <<< 'New Search String'"
      puts "Search string set."
    else
      puts "Could not set search string."
    end

    View Slide
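
Strictly speaking, `system` has three possible answers rather than two. The sketch below, with a hypothetical helper name, maps each one to a symbol. It assumes the standard `true` and `false` utilities are on the PATH and that the made-up third command is not. When a command string goes through the shell, a missing program typically shows up as `false` (the shell's own failure status) rather than `nil`.

```ruby
# Hypothetical helper: translate system()'s return value into a symbol.
#   true  -> the command ran and exited with status 0
#   false -> the command ran but exited with a non-zero status
#   nil   -> the command could not be executed at all
def run_status(*command)
  case system(*command)
  when true  then :succeeded
  when false then :failed
  else            :not_run
  end
end

puts run_status("true")
puts run_status("false")
puts run_status("no-such-command-for-this-example")
```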

  58. Shell Expansion

    View Slide

  59. Shell Expansion
    ENV["MY_VAR"] = "Set from Ruby"

    system "echo $MY_VAR"
    # >> Set from Ruby

    system "echo", "$MY_VAR"
    # >> $MY_VAR

    View Slide

  60. Shell Expansion
    ENV["MY_VAR"] = "Set from Ruby"

    system "echo $MY_VAR"
    # >> Set from Ruby

    system "echo", "$MY_VAR"
    # >> $MY_VAR
    ‣A single argument
    goes through shell
    expansion
    ‣File glob patterns
    ‣Environment variables

    View Slide

  61. Shell Expansion
    ENV["MY_VAR"] = "Set from Ruby"

    system "echo $MY_VAR"
    # >> Set from Ruby

    system "echo", "$MY_VAR"
    # >> $MY_VAR
    ‣A single argument
    goes through shell
    expansion
    ‣File glob patterns
    ‣Environment variables
    ‣Multiple arguments
    are passed without
    going through
    expansion

    View Slide
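
Backticks only accept a single string, so they always involve the shell. One way to capture output while skipping expansion, sketched here under the assumption of a Ruby new enough to accept an Array command in `IO.popen` (1.9 or later), is shown below. The helper name is hypothetical.

```ruby
# Hypothetical helper: capture STDOUT like backticks do, but hand the
# arguments straight to the program so the shell never sees them.
def capture_without_shell(*command)
  IO.popen(command, &:read)
end

ENV["MY_VAR"] = "Set from Ruby"
puts `echo $MY_VAR`                            # the shell expands $MY_VAR
puts capture_without_shell("echo", "$MY_VAR")  # echo receives the literal text
```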

  62. Handling
    Errors
    Mind the Expert Warnings

    View Slide

  63. When Trouble
    strikes

    View Slide

  64. When Trouble
    strikes
    ‣Remember to handle
    STDERR

    View Slide

  65. When Trouble
    strikes
    ‣Remember to handle
    STDERR
    ‣Check process exit
    status

    View Slide

  66. When Trouble
    strikes
    ‣Remember to handle
    STDERR
    ‣Check process exit
    status
    ‣Use popen3() when
    things get
    complicated

    View Slide
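
The three points above can be sketched with `Open3.capture3`, a standard library call in current Ruby versions that is newer than this talk. The helper name and the stand-in child process are assumptions made for illustration.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: run a command, keeping STDOUT, STDERR, and the
# exit status separate so each can be handled on its own.
def run_checked(*command)
  output, errors, status = Open3.capture3(*command)
  { output: output, errors: errors, ok: status.success?, code: status.exitstatus }
end

# A stand-in child process: the current Ruby interpreter, told to write to
# both streams and exit with a failing status.
result = run_checked(RbConfig.ruby, "-e", 'puts "done"; warn "disk full"; exit 3')
warn "Child failed (#{result[:code]}): #{result[:errors]}" unless result[:ok]
```

Passing the command as separate arguments also keeps it away from shell expansion.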

  67. Example:
    Backups

    View Slide

  68. Example:
    Backups
    ‣I want to back up a directory as part of
    a larger automation

    View Slide

  69. Example:
    Backups
    ‣I want to back up a directory as part of
    a larger automation
    ‣The rsync program can do what I need

    View Slide

  70. Example:
    Backups
    ‣I want to back up a directory as part of
    a larger automation
    ‣The rsync program can do what I need
    ‣I need to watch for problems and
    handle them gracefully
    ‣Possibly emailing a warning to the user

    View Slide

  71. STDERR,
    The Problem Child

    View Slide

  77. Taming STDERR

    View Slide

  78. Taming STDERR
    dir = ARGV.shift or
      abort "USAGE: #{File.basename($PROGRAM_NAME)} DIR"
    results = `rsync -av --exclude '*.DS_Store' #{dir} #{dir}_backup 2>&1`
    if $?.success?  # require "English"; $CHILD_STATUS.success?
      # print only the output lines that name paths under dir
      puts results.lines.grep(/\A#{Regexp.escape(dir)}/)
    else
      puts "Error: Couldn't back up #{dir}"
      # …
    end

    View Slide
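
The `2>&1` and `$?` combination can be tried without rsync. This is a sketch with a hypothetical helper name; the subshell parentheses are an addition so that the redirect covers a multi-command line rather than only its last command.

```ruby
# Hypothetical helper: run a shell command with STDERR folded into STDOUT
# and report whether it exited cleanly.
def run_merged(command)
  output = `(#{command}) 2>&1`
  [output, $?.success?]
end

output, ok = run_merged("echo copied; echo 'permission denied' >&2; exit 1")
puts output
puts "Error: command failed" unless ok
```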

  82. Proper Shell
    Escaping

    View Slide

  83. Proper Shell
    Escaping
    # escape text to make it usable in a shell script as
    # one “word” (string)
    def escape_for_shell(str)
      str.to_s.gsub( /(?=[^a-zA-Z0-9_.\/\-\x7F-\xFF\n])/, '\\' ).
               gsub( /\n/, "'\n'" ).
               sub( /^$/, "''" )
    end

    View Slide
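
Current Ruby versions ship a comparable escaper in the standard library, `Shellwords.escape`. The sketch below is not the method from the slide. It assumes a `printf` command is available, and the helper name is hypothetical.

```ruby
require "shellwords"

# Hypothetical helper: pass arbitrary text to a command as exactly one
# argument. printf %s prints its argument back without a trailing newline.
def echo_back(text)
  `printf %s #{Shellwords.escape(text)}`
end

tricky = "it's $HOME; rm -rf *"
puts Shellwords.escape(tricky)   # the escaped form that reaches the shell
puts echo_back(tricky)           # the text comes back unchanged
```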

  84. Tips for
    Avoiding Errors

    View Slide

  85. Tips for
    Avoiding Errors
    ‣Use full paths to programs and files
    whenever possible

    View Slide

  86. Tips for
    Avoiding Errors
    ‣Use full paths to programs and files
    whenever possible
    ‣Send data to STDIN when you can

    View Slide

  87. Tips for
    Avoiding Errors
    ‣Use full paths to programs and files
    whenever possible
    ‣Send data to STDIN when you can
    ‣If you can’t send it to STDIN, dump the
    data to a Tempfile and send that path

    View Slide

  88. Tips for
    Avoiding Errors
    ‣Use full paths to programs and files
    whenever possible
    ‣Send data to STDIN when you can
    ‣If you can’t send it to STDIN, dump the
    data to a Tempfile and send that path
    ‣Remember to shell escape any command-
    line arguments that could contain
    dangerous characters (even spaces)

    View Slide
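
The Tempfile tip might look like the sketch below. It assumes a Ruby with `Tempfile.create` (2.1 or later) and uses `sort` as a stand-in for a program that insists on a file path. The helper name is hypothetical, and a bare program name is used only to keep the example portable.

```ruby
require "tempfile"
require "shellwords"

# Hypothetical helper: hand text to a program that wants a file path.
# The data goes into a temporary file and only the escaped path appears
# on the command line.
def sort_via_tempfile(text)
  Tempfile.create("glue") do |file|
    file.write(text)
    file.flush   # make sure the child process sees the data
    `sort #{Shellwords.escape(file.path)}`
  end
end

puts sort_via_tempfile("pear\napple\nfig\n")
```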

  89. Full Control
    Using popen(), popen3(),
    and popen4()

    View Slide

  90. Managing Streams

    View Slide

  91. Managing Streams
    ‣Use popen() to
    manage STDIN and
    STDOUT

    View Slide

  92. Managing Streams
    ‣Use popen() to
    manage STDIN and
    STDOUT
    ‣Use popen3() to
    manage STDIN,
    STDOUT, and
    STDERR
    ‣Use popen4() if you also
    need the PID

    View Slide

  93. Example:
    Formatting Prose

    View Slide

  94. Example:
    Formatting Prose
    ‣I want to rewrap some prose provided by
    the user

    View Slide

  95. Example:
    Formatting Prose
    ‣I want to rewrap some prose provided by
    the user
    ‣Command-line arguments are not
    appropriate here
    ‣Complex shell escaping
    ‣Size limit

    View Slide

  96. Example:
    Formatting Prose
    ‣I want to rewrap some prose provided by
    the user
    ‣Command-line arguments are not
    appropriate here
    ‣Complex shell escaping
    ‣Size limit
    ‣I need to send the prose to fmt via
    STDIN

    View Slide

  97. Reading and
    Writing

    View Slide

  98. Reading and
    Writing
    prose = <<END_PROSE
    Lorem ipsum dolor sit amet, consectetur adipisicing elit,
    sed do eiusmod tempor incididunt ut labore et dolore magna
    aliqua. Ut enim ad minim veniam, quis nostrud exercitation
    ullamco laboris nisi ut aliquip ex ea commodo consequat.
    Duis aute irure dolor in reprehenderit in voluptate velit
    esse cillum dolore eu fugiat nulla pariatur. Excepteur
    sint occaecat cupidatat non proident, sunt in culpa qui
    officia deserunt mollit anim id est laborum.
    END_PROSE

    formatted = IO.popen("fmt -w 30", "r+") do |pipe|
      # open("| fmt -w 30", "r+") do |pipe|
      pipe << prose
      pipe.close_write
      pipe.read
    end

    View Slide
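
The same write, close, read pattern works for any filter program. A minimal sketch, with a hypothetical helper name and `sort` and `tr` standing in for `fmt`. It suits modest inputs; with very large inputs, writing everything before reading anything can stall once the pipe buffers fill.

```ruby
# Hypothetical helper: send input to a filter program's STDIN and return
# whatever it writes to STDOUT.
def filter_through(command, input)
  IO.popen(command, "r+") do |pipe|
    pipe << input
    pipe.close_write   # signal end-of-input; filters like sort wait for it
    pipe.read
  end
end

puts filter_through("sort", "pear\napple\nfig\n")
puts filter_through(["tr", "a-z", "A-Z"], "glue\n")
```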

  103. Example:
    A Ruby Session

    View Slide

  104. Example:
    A Ruby Session
    ‣I want to run some Ruby code

    View Slide

  105. Example:
    A Ruby Session
    ‣I want to run some Ruby code
    ‣I don’t want that code to affect my
    current Ruby process

    View Slide

  106. Example:
    A Ruby Session
    ‣I want to run some Ruby code
    ‣I don’t want that code to affect my
    current Ruby process
    ‣I may also need to do some special
    setup, hacking Ruby’s core, before this
    code is run

    View Slide

  107. Example:
    A Ruby Session
    ‣I want to run some Ruby code
    ‣I don’t want that code to affect my
    current Ruby process
    ‣I may also need to do some special
    setup, hacking Ruby’s core, before this
    code is run
    ‣I need to format STDOUT and STDERR
    differently

    View Slide

  108. With Error
    Handling

    View Slide

  109. With Error
    Handling
    require "open3"

    Open3.popen3("ruby") do |stdin, stdout, stderr|
      stdin << %Q{puts "I am a puppet."; oops!()}
      stdin.close
      puts "Output:"
      puts stdout.read
      puts "Errors:"
      puts stderr.read
    end

    View Slide

  113. When you Also
    Need a PID

    View Slide

  114. When you Also
    Need a PID
    ‣Install the POpen4 gem
    ‣Unix version
    ‣Windows versions

    View Slide

  115. When you Also
    Need a PID
    ‣Install the POpen4 gem
    ‣Unix version
    ‣Windows versions
    ‣popen4() works like popen3() but it also
    passes you the PID for the child
    process
    ‣The PID is useful for sending the child process
    signals, possibly to kill the process

    View Slide
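
In Ruby 1.9 and later, `Open3.popen3` passes its block a fourth argument, a thread that waits on the child process. That thread exposes the PID and the exit status, so the gem may not be needed on newer Rubies. A sketch under that assumption, with a hypothetical helper name; it reads STDOUT fully before STDERR, which is fine for small outputs only.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: feed Ruby code to a separate interpreter and collect
# its PID, both output streams, and its exit status.
def run_ruby_snippet(code)
  Open3.popen3(RbConfig.ruby) do |stdin, stdout, stderr, wait_thread|
    stdin << code
    stdin.close
    output = stdout.read
    errors = stderr.read
    { pid: wait_thread.pid, output: output, errors: errors,
      status: wait_thread.value }   # value blocks until the child exits
  end
end

result = run_ruby_snippet(%Q{puts "I am a puppet."; oops!()})
puts "Child #{result[:pid]} succeeded? #{result[:status].success?}"
```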

  116. “If it’s on the
    Web, it has
    an API.”
    James Britt
    Using Web Tools

    View Slide

  117. Don’t Forget
    the Web

    View Slide

  118. Don’t Forget
    the Web
    If you need to… Use the tool…

    View Slide

  119. Don’t Forget
    the Web
    If you need to… Use the tool…
    Read Content open-uri

    View Slide

  120. Don’t Forget
    the Web
    If you need to… Use the tool…
    Read Content open-uri
    Write Form Data Net::HTTP

    View Slide

  121. Don’t Forget
    the Web
    If you need to… Use the tool…
    Read Content open-uri
    Write Form Data Net::HTTP
    Emulate a Browser Mechanize

    View Slide

  122. Don’t Forget
    the Web
    If you need to… Use the tool…
    Read Content open-uri
    Write Form Data Net::HTTP
    Emulate a Browser Mechanize
    Scrape HTML Hpricot

    View Slide

  123. Example:
    Tracking Ruby

    View Slide

  124. Example:
    Tracking Ruby
    ‣I want to download
    the latest version
    of Ruby as part of a
    larger automation

    View Slide

  125. Example:
    Tracking Ruby
    ‣I want to download
    the latest version
    of Ruby as part of a
    larger automation
    ‣I want to verify the
    contents of the
    download

    View Slide

  126. Example:
    Tracking Ruby
    ‣I want to download
    the latest version
    of Ruby as part of a
    larger automation
    ‣I want to verify the
    contents of the
    download
    ‣I want to expand the
    compressed archive

    View Slide

  127. Simple Scraping

    View Slide

  128. Simple Scraping
    require "open-uri"
    require "digest/md5"

    require "rubygems"
    require "hpricot"

    dl = Hpricot(open("http://www.ruby-lang.org/en/downloads/"))
    li = (dl / "div#content" / "ul" / "li").first
    url = (li / "a").first.attributes["href"]
    md5 = li.inner_html[/md5:.+?([A-Za-z0-9]{32})/, 1]

    rb = open(url) { |ftp| ftp.read }
    if Digest::MD5.hexdigest(rb) == md5
      IO.popen("tar xvz", "wb") { |tar| tar << rb }
    else
      abort "Corrupt download"
    end

    View Slide

  132. Use Caution

    View Slide

  133. Use Caution
    ‣These scraping techniques see wider use
    than talking to external processes

    View Slide

  134. Use Caution
    ‣These scraping techniques see wider use
    than talking to external processes
    ‣Ironically, they really do seem to be
    more fragile

    View Slide

  135. Use Caution
    ‣These scraping techniques see wider use
    than talking to external processes
    ‣Ironically, they really do seem to be
    more fragile
    ‣Tips for managing scraping code:
    ‣Abstract out the scraping code
    ‣Use more aggressive error handling
    ‣Make sure the maintenance is worth it

    View Slide

  136. Summary
    Remain Strong

    View Slide

  137. Pop Quiz

    View Slide

  138. Pop Quiz
    Out of the box, can Ruby…

    View Slide

  139. Pop Quiz
    Out of the box, can Ruby…
    Apply a difference algorithm to the
    contents of two Strings?

    View Slide

  140. Pop Quiz
    Out of the box, can Ruby…
    Apply a difference algorithm to the
    contents of two Strings?
    Efficiently read a file
    line by line in reverse?

    View Slide

  141. YES!

    View Slide

  142. YES!
    ‣Don’t be afraid
    to use your
    powers

    View Slide

  143. YES!
    ‣Don’t be afraid
    to use your
    powers
    ‣You will
    literally be able
    to accomplish
    anything

    View Slide

  144. String Diff

    View Slide

  145. String Diff
    require "tempfile"

    class String
      def diff(other)
        st = Tempfile.new("diff_self")
        ot = Tempfile.new("diff_other")
        st << self
        ot << other
        [st, ot].each { |t| t.flush }
        `diff -u #{st.path} #{ot.path}`[/^@.+\z/m]
      end
    end

    puts "one\ntwo\n".diff("one\nthree\n")
    # >> @@ -1,2 +1,2 @@
    # >> one
    # >> -two
    # >> +three

    View Slide

  146. Reading
    Backwards

    View Slide

  147. Reading
    Backwards
    unless ARGV.size == 1 and File.exist? ARGV.first
      abort "Usage: #{File.basename($PROGRAM_NAME)} FILE"
    end

    last_five_lines = Array.new

    IO.popen("tail -r #{ARGV.shift}") do |tail|
      tail.each do |line|
        last_five_lines << line
        break if last_five_lines.size == 5
      end
    end
    last_five_lines.reverse!

    puts last_five_lines

    View Slide

  148. Questions?

    View Slide