
Hidden Gems

jeg2
September 05, 2008

Hidden Gems

Some ways to coax some extra speed out of Ruby processes.


Transcript

  1. Ruby in the Fast Lane
    Hidden Gems

    View Slide

  7. Back at Lone Star
    ‣I am James Edward Gray II
    ‣Created the Ruby Quiz
    ‣Released FasterCSV, HighLine, and Elif
    ‣Wrote a couple of Pragmatic Programmer books
    with a lot of Ruby in them
    ‣I’ve given a talk at every LSRC so far

    View Slide

  8. View Slide

  9. View Slide

  12. Why BSG?
    ‣The old ship
    that was to be
    decommissioned
    becomes the
    only thing
    keeping the
    human race alive
    ‣Those are my kind
    of odds

    View Slide

  14. “Ruby is slow.”
    Ugly Rumor

    View Slide

  16. BS
    My Opinion of the
    Speed Rumor

    View Slide

  23. Ruby is as Fast as
    we Want her to Be!
    ‣We can always write C extensions for
    speed-critical portions
    ‣This is rarely actually needed though
    ‣We can make use of libraries, some
    written in C, that help with our problem
    ‣We can add more processing power
    ‣We can rework our data structures to
    better support the task at hand
    ‣Always the big win, in my opinion

    View Slide
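
To make the data structure point concrete, here is a minimal sketch that is not from the talk. It assumes a made-up list of 50,000 IDs and compares membership tests on an Array, which scans element by element, with the same tests on a Set from Ruby's standard library, which uses a hash lookup. The names `IDS_ARRAY`, `IDS_SET`, and `count_hits` are invented for the illustration.

```ruby
require "set"

# Hypothetical data: 50,000 string IDs, invented for this illustration.
IDS_ARRAY = (1..50_000).map { |n| "user-#{n}" }
IDS_SET   = IDS_ARRAY.to_set

# Array#include? walks the list; Set#include? does a hash lookup.
# The answers are identical, only the cost per lookup differs.
def count_hits(collection, probes)
  probes.count { |probe| collection.include?(probe) }
end

PROBES = %w[user-1 user-49999 user-50001 nobody]

puts "Array found #{count_hits(IDS_ARRAY, PROBES)} of #{PROBES.size}"
puts "Set found   #{count_hits(IDS_SET, PROBES)} of #{PROBES.size}"
```

Both collections give the same answers. Only the structure changed, which is the kind of rework the slide is pointing at.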

  31. Ruby in the
    Fast Lane
    ‣Let’s see how fast Ruby can run using:
    ‣NArray
    ‣SQLite
    ‣RBTree
    ‣FSDB
    ‣Rinda
    ‣Thinking outside the box

    View Slide

  33. Super Fast Number crunching
    NArray

    View Slide

  38. When the
    Numbers Count
    ‣Ruby’s Numeric
    family of objects
    was built for
    ease of use
    ‣This makes them a
    bit slower
    ‣C’s numbers were
    built for speed
    ‣Ruby can borrow
    them with NArray

    View Slide
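
A tiny sketch of that trade-off, using only core Ruby (NArray is not needed to run it). Ruby's Integer grows without overflowing, while a C-style unsigned byte holds exactly 8 bits. The helper `as_c_byte` is hypothetical and only emulates the C behavior with `Array#pack`.

```ruby
# Ruby Integers grow as needed: convenient, but more work per operation.
BIG = 2**64 + 1

# Emulate a C unsigned char with pack/unpack: only the low 8 bits survive.
def as_c_byte(n)
  [n].pack("C").unpack1("C")
end

puts BIG             # no overflow
puts as_c_byte(255)  # => 255
puts as_c_byte(300)  # => 44, because 300 wraps modulo 256
```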

  44. Problem:
    Faster Imaging
    ‣I use a trivial PPM library to generate
    images for one project
    ‣The PPM code was taking about 1.3 seconds
    for a 400 by 200 pixel image
    ‣I replaced a two dimensional Array of
    Color objects with a 3D NArray
    ‣I changed fewer than ten lines of code
    ‣The time for the same image dropped to about
    1/100th of a second

    View Slide

  46. Creating The
    Canvas
    def initialize(options = Hash.new)
      options     = DEFAULT_OPTIONS.merge(options)
      @width      = options[:width]
      @height     = options[:height]
      @background = options[:background]
      @foreground = options[:foreground]
      @mode       = options[:mode]
      @canvas     = Array.new(@height) { Array.new(@width) { @background } }
    end

    View Slide

  48. Creating The
    Canvas
    require "rubygems"
    require "narray"

    def initialize(options = Hash.new)
      options     = DEFAULT_OPTIONS.merge(options)
      @width      = options[:width]
      @height     = options[:height]
      @background = options[:background]
      @foreground = options[:foreground]
      @mode       = options[:mode]
      @canvas     = NArray.byte(@width, @height, 3)
    end

    View Slide

  51. Marking Pixels
    def draw_point(x, y, color = @foreground)
      return unless x.between? 0, @width  - 1
      return unless y.between? 0, @height - 1

      @canvas[y][x] = color
    end

    View Slide

  53. Marking Pixels
    def draw_point(x, y, color = @foreground)
      return unless x.between? 0, @width  - 1
      return unless y.between? 0, @height - 1

      @canvas[x, y, 0..2] = color.to_a
    end

    View Slide

  56. Drawing an Image
    def save(file)
      File.open(file.sub(/\.ppm$/i, "") + ".ppm", "w") do |image|
        image.puts @mode
        image.puts "#{@width} #{@height} 255"
        @canvas.each do |row|
          pixels = row.map { |pixel| pixel.to_s(@mode) }
          image.send( @mode == "P6" ? :print : :puts,
                      pixels.join(@mode == "P6" ? "" : " ") )
        end
      end
    end

    View Slide

  58. Drawing an Image
    def save(file)
      File.open(file.sub(/\.ppm$/i, "") + ".ppm", "w") do |image|
        image.puts @mode
        image.puts "#{@width} #{@height} 255"
        0.upto(@height - 1) do |y|
          # select every x in row y, with all three color bytes
          row = @canvas[0..(@width - 1), y, 0..2].transpose(-1, 0)
          image.send( @mode == "P6" ? :print : :puts,
                      @mode == "P6" ? row.to_s : row.to_a.join(" ") )
        end
      end
    end

    View Slide
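
The idea behind the swap can be sketched without NArray: instead of one Ruby object per pixel, keep every color byte in a single packed buffer and compute offsets. The class below is a hypothetical stand-in that uses a binary String from core Ruby. `FlatCanvas`, `pixel`, and `to_ppm` are names invented for this sketch, and it is not the PPM library from the talk.

```ruby
# A flat byte buffer standing in for NArray.byte(width, height, 3).
class FlatCanvas
  attr_reader :width, :height

  def initialize(width, height)
    @width  = width
    @height = height
    @bytes  = ("\x00" * (width * height * 3)).b  # all-black image
  end

  # Write one RGB triple at (x, y), ignoring out-of-range points.
  def draw_point(x, y, rgb)
    return unless x.between?(0, @width  - 1)
    return unless y.between?(0, @height - 1)

    @bytes[offset(x, y), 3] = rgb.pack("C3")
  end

  # Read the RGB triple back as three Integers.
  def pixel(x, y)
    @bytes[offset(x, y), 3].unpack("C3")
  end

  # Binary PPM (P6): a text header followed by the raw bytes, row by row.
  def to_ppm
    "P6\n#{@width} #{@height} 255\n".b + @bytes
  end

  private

  def offset(x, y)
    (y * @width + x) * 3
  end
end

canvas = FlatCanvas.new(4, 2)
canvas.draw_point(1, 1, [255, 128, 0])
p canvas.pixel(1, 1)  # => [255, 128, 0]
```

Because the buffer is already laid out in file order, saving is a single concatenation rather than a walk over pixel objects.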

  63. Other Nice
    Features
    ‣NArray supports integers, floats, and
    even complex numbers in various sizes
    ‣Aside from indexing and iteration,
    NArray supports data generation,
    arithmetic operations, comparisons,
    bitwise manipulations, statistical
    calculations, and more
    ‣View the large API with examples at:
    http://narray.rubyforge.org/

    View Slide

  65. Conway’s Game
    of Life
    #!/usr/bin/env ruby -wKU

    require "rubygems"
    require "narray"

    # build cells
    life       = NArray.byte(5, 5)
    life[1, 1] = NArray.byte(3, 3).random!(2)
    p life

    # count neighbors
    counts = NArray.byte(*life.shape)
    counts[1..-2, 1..-2] =
      life[0..-3, 0..-3] + life[0..-3, 1..-2] + life[0..-3, 2..-1] +
      life[1..-2, 0..-3] +                      life[1..-2, 2..-1] +
      life[2..-1, 0..-3] + life[2..-1, 1..-2] + life[2..-1, 2..-1]
    p counts

    # one step of the game
    life[] = counts.eq(3) | (counts.eq(2) & life)
    p life

    View Slide

  71. Conway’s Game
    of Life
    NArray.byte(5,5):
    [ [ 0, 0, 0, 0, 0 ],
      [ 0, 1, 0, 1, 0 ],
      [ 0, 0, 0, 0, 0 ],
      [ 0, 1, 1, 0, 0 ],
      [ 0, 0, 0, 0, 0 ] ]
    NArray.byte(5,5):
    [ [ 0, 0, 0, 0, 0 ],
      [ 0, 0, 2, 0, 0 ],
      [ 0, 3, 4, 2, 0 ],
      [ 0, 1, 1, 1, 0 ],
      [ 0, 0, 0, 0, 0 ] ]
    NArray.byte(5,5):
    [ [ 0, 0, 0, 0, 0 ],
      [ 0, 0, 0, 0, 0 ],
      [ 0, 1, 0, 0, 0 ],
      [ 0, 0, 0, 0, 0 ],
      [ 0, 0, 0, 0, 0 ] ]

    View Slide
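
For comparison, here is a rough plain-Ruby version of the same step that uses nested Arrays instead of NArray. It is a sketch for illustration, not code from the talk. It starts from the grid printed above rather than a random one, and like the NArray version it only counts neighbors for the inner cells. The names `LIFE`, `neighbor_counts`, and `step` are invented here.

```ruby
# The starting grid shown in the printed output above.
LIFE = [
  [0, 0, 0, 0, 0],
  [0, 1, 0, 1, 0],
  [0, 0, 0, 0, 0],
  [0, 1, 1, 0, 0],
  [0, 0, 0, 0, 0]
]

# All eight (row, column) offsets around a cell.
OFFSETS = [-1, 0, 1].product([-1, 0, 1]) - [[0, 0]]

# Count live neighbors for every inner cell; border cells stay at 0.
def neighbor_counts(grid)
  counts = Array.new(grid.size) { Array.new(grid.first.size, 0) }
  (1..grid.size - 2).each do |row|
    (1..grid[row].size - 2).each do |col|
      counts[row][col] = OFFSETS.sum { |dr, dc| grid[row + dr][col + dc] }
    end
  end
  counts
end

# One generation: born with 3 neighbors, survives with 2 or 3.
def step(grid)
  counts = neighbor_counts(grid)
  grid.each_with_index.map do |cells, row|
    cells.each_with_index.map do |cell, col|
      count = counts[row][col]
      count == 3 || (count == 2 && cell == 1) ? 1 : 0
    end
  end
end

neighbor_counts(LIFE).each { |row| p row }
step(LIFE).each { |row| p row }
```

The explicit loops do per cell what the NArray slices express as a handful of whole-array additions.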

  73. A Data DSL
    SQLite

    View Slide

  76. Thinking About
    Data can be Hard
    ‣SQLite has
    already solved
    many hard
    problems for
    data storage
    and retrieval
    ‣It gives you an
    entire language
    to express your
    data needs

    View Slide

  81. Problem:
    IP to Country
    ‣Given an IP address,
    return the country
    for that IP
    ‣This was a Ruby Quiz
    ‣Solutions were also
    expected to be efficient
    in memory and speed
    ‣This is a real world
    task I’ve had to do
    for my job

    View Slide

  83. The Data
    # © 2002-2008 Webnet77.com
    #
    #
    #
    #
    "0","16777215","IANA","410227200","ZZ","ZZZ","RESERVED"
    "50331648","67108863","ARIN","572572800","US","USA","UNITED STATES"
    "67108864","83886079","ARIN","0","US","USA","UNITED STATES"
    "100663296","117440511","ARIN","0","US","USA","UNITED STATES"
    "117440512","134217727","ARIN","880329600","US","USA","UNITED STATES"

    View Slide

  91. Solutions
    ‣Many solved the problem with a binary
    search on the file
    ‣Most of those preprocessed the file to make
    that search easier
    ‣I’m going to show a SQLite solution
    ‣It’s very close to the same speed (about 1/3rd
    of a second to look up an IP)
    ‣I didn’t have to be clever or even add an index
    ‣It was easier for me to use full country names

    View Slide
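
For readers who have not seen the binary search approach, here is a minimal in-memory sketch of it. It is not one of the quiz solutions, which searched the file itself. It assumes the ranges are sorted and do not overlap, uses only the five sample rows from the data slide, and relies on `Array#bsearch` from core Ruby. `RANGES` and `country_for` are hypothetical names.

```ruby
# [low_ip, high_ip, country], copied from the sample rows of the data file.
RANGES = [
  [0,           16_777_215,  "RESERVED"],
  [50_331_648,  67_108_863,  "UNITED STATES"],
  [67_108_864,  83_886_079,  "UNITED STATES"],
  [100_663_296, 117_440_511, "UNITED STATES"],
  [117_440_512, 134_217_727, "UNITED STATES"]
]

# Find the first range whose high end reaches the address, then check that
# the address is not sitting in a gap below that range's low end.
def country_for(ip_int, ranges = RANGES)
  row = ranges.bsearch { |_low, high, _country| high >= ip_int }
  row && row[0] <= ip_int ? row[2] : "Unknown"
end

puts country_for(50_331_648)  # => UNITED STATES
puts country_for(16_777_216)  # => Unknown (falls in a gap between ranges)
```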

  93. Setup
    #!/usr/bin/env ruby -KU

    require "open-uri"
    require "zlib"

    require "rubygems"
    require "faster_csv"
    require "sqlite3"

    REMOTE_DB = "http://software77.net/cgi-bin/" +
                "ip-country/geo-ip.pl?action=download"
    LOCAL_DB  = "country_ips.sqlite"

    File.unlink(LOCAL_DB) if ARGV.delete("-r") and
                             File.exist? LOCAL_DB

    # ...

    View Slide

  95. Build The Database
    # ...

    unless File.exist? LOCAL_DB
      db = SQLite3::Database.new(LOCAL_DB)
      db.execute(<<-END_TABLE.strip)
      CREATE TABLE ips ( low_ip INTEGER,
                         high_ip INTEGER,
                         country TEXT )
      END_TABLE
      open(REMOTE_DB) do |url|
        Zlib::GzipReader.new(url).each do |line|
          next if line =~ /\A\s*(?:#|\z)/
          args = FCSV.parse_line(line).values_at(0..1, 6)
          db.execute(<<-END_INSERT.strip, *args)
          INSERT INTO ips( low_ip, high_ip, country)
          VALUES( ?, ?, ?)
          END_INSERT
        end
      end
      db.close
    end

    # ...

    View Slide

  99. Query
    # ...

    ip = ARGV.shift or
      abort "Usage: #{File.basename($PROGRAM_NAME)} IP"
    ip_int = ip.split(".").map { |n| Integer(n) }.
                pack("C*").unpack("N").first
    db = SQLite3::Database.new(LOCAL_DB)

    puts db.get_first_value(<<-END_SELECT.strip, :ip => ip_int) || "Unknown"
    SELECT country FROM ips WHERE low_ip <= :ip AND
                                  :ip <= high_ip
    END_SELECT

    View Slide
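
The `pack`/`unpack` pair in the query is the only non-obvious step, so here it is pulled out on its own, together with the reverse conversion. The helper names `ip_to_int` and `int_to_ip` are made up for this sketch; the expressions inside them follow the slide code. Four octets are packed as bytes and read back as one 32-bit big-endian integer, which is the form the data file stores.

```ruby
# "3.0.0.0" -> four bytes -> one unsigned 32-bit big-endian integer.
def ip_to_int(ip)
  ip.split(".").map { |n| Integer(n) }.pack("C*").unpack("N").first
end

# The reverse trip: integer -> four bytes -> dotted quad.
def int_to_ip(int)
  [int].pack("N").unpack("C*").join(".")
end

puts ip_to_int("3.0.0.0")   # => 50331648, the low end of a sample row
puts int_to_ip(67_108_863)  # => 3.255.255.255, the high end of that row
```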

  106. Did You Know?
    ‣SQLite is totally free
    ‣You can receive query results in a Hash to
    index by column name and/or convert column
    values to Ruby objects based on type
    ‣It can work with in-memory databases
    ‣It can run queries across tables in multiple
    database files
    ‣You can define SQL functions and
    aggregates for it in Ruby code

    View Slide

  108. Ruby Friendly Data
    #!/usr/bin/env ruby -KU

    require "rubygems"
    require "sqlite3"

    country = ARGV.shift or
      abort "Usage: #{File.basename($PROGRAM_NAME)} COUNTRY"
    db = SQLite3::Database.new("country_ips.sqlite")
    db.results_as_hash  = true
    db.type_translation = true

    db.execute( "SELECT * FROM ips WHERE country LIKE ?",
                "%#{country}%" ) do |match|
      low, high = match.values_at("low_ip", "high_ip").
                        map { |i| [i].pack("N").unpack("C*").join(".") }
      puts "%s: %15s - %15s" % [match["country"], low, high]
    end

    View Slide

  112. In-Memory
    #!/usr/bin/env ruby -KU

    # ... requires unchanged ...

    REMOTE_DB = "http://software77.net/cgi-bin/" +
                "ip-country/geo-ip.pl?action=download"

    db = SQLite3::Database.new(":memory:")

    # ... database loading unchanged ...

    stmt = db.prepare(<<-END_SELECT.strip)
    SELECT country FROM ips WHERE low_ip <= :ip AND :ip <= high_ip
    END_SELECT
    loop do
      print "IP address? "
      ip = gets.to_s.strip
      if ip =~ /\S/
        ip_int = ip.split(".").map { |n| Integer(n) }.
                    pack("C*").unpack("N").first
        puts stmt.execute(:ip => ip_int).each { |c| break c[0] } || "Unknown"
      else
        break
      end
    end

    View Slide

  117. Attach and
    Functions
    #!/usr/bin/env ruby -KU

    require "rubygems"
    require "sqlite3"

    user = ARGV.shift or
      abort "Usage: #{File.basename($PROGRAM_NAME)} USER"
    db = SQLite3::Database.new("users.sqlite")
    db.execute("ATTACH DATABASE 'country_ips.sqlite' AS country_ips")

    db.create_function("IP2INT", 1) do |func, ip|
      func.result = ip.to_s.split(".").map { |n| Integer(n) }.
                       pack("C*").unpack("N").first
    end

    sql = <<-END_SQL
    SELECT users.name, users.ip, ips.country
    FROM users INNER JOIN ips
    ON ips.low_ip <= IP2INT(users.ip) AND
       IP2INT(users.ip) <= ips.high_ip
    WHERE users.name LIKE ? LIMIT 1
    END_SQL
    puts db.get_first_row(sql, "%#{user}%").join(", ")

    View Slide

  118. Attach and
    Functions
    #!/usr/bin/env ruby -KU
    !
    require "rubygems"
    require "sqlite3"
    !
    user = ARGV.shift or
    abort "Usage: #{File.basename($PROGRAM_NAME)} USER"
    db = SQLite3::Database.new("users.sqlite")
    db.execute("ATTACH DATABASE 'country_ips.sqlite' AS country_ips")
    !
    db.create_function("IP2INT", 1) do |func, ip|
    func.result = ip.to_s.split(".").map { |n| Integer(n) }.
    pack("C*").unpack("N").first
    end
    !
    sql = <SELECT users.name, users.ip, ips.country
    FROM users INNER JOIN ips
    ON ips.low_ip <= IP2INT(users.ip) AND
    IP2INT(users.ip) <= ips.high_ip
    WHERE users.name LIKE ? LIMIT 1
    END_SQL
    puts db.get_first_row(sql, "%#{user}%").join(", ")

    View Slide

  119. Attach and
    Functions
    #!/usr/bin/env ruby -KU
    !
    require "rubygems"
    require "sqlite3"
    !
    user = ARGV.shift or
    abort "Usage: #{File.basename($PROGRAM_NAME)} USER"
    db = SQLite3::Database.new("users.sqlite")
    db.execute("ATTACH DATABASE 'country_ips.sqlite' AS country_ips")
    !
    db.create_function("IP2INT", 1) do |func, ip|
    func.result = ip.to_s.split(".").map { |n| Integer(n) }.
    pack("C*").unpack("N").first
    end
    !
    sql = <SELECT users.name, users.ip, ips.country
    FROM users INNER JOIN ips
    ON ips.low_ip <= IP2INT(users.ip) AND
    IP2INT(users.ip) <= ips.high_ip
    WHERE users.name LIKE ? LIMIT 1
    END_SQL
    puts db.get_first_row(sql, "%#{user}%").join(", ")

    View Slide

  120. Attach and
    Functions
    #!/usr/bin/env ruby -KU

    require "rubygems"
    require "sqlite3"

    user = ARGV.shift or
      abort "Usage: #{File.basename($PROGRAM_NAME)} USER"
    db = SQLite3::Database.new("users.sqlite")
    db.execute("ATTACH DATABASE 'country_ips.sqlite' AS country_ips")

    db.create_function("IP2INT", 1) do |func, ip|
      func.result = ip.to_s.split(".").map { |n| Integer(n) }.
                       pack("C*").unpack("N").first
    end

    sql = <<-END_SQL
      SELECT users.name, users.ip, ips.country
      FROM users INNER JOIN ips
      ON ips.low_ip <= IP2INT(users.ip) AND
      IP2INT(users.ip) <= ips.high_ip
      WHERE users.name LIKE ? LIMIT 1
    END_SQL
    puts db.get_first_row(sql, "%#{user}%").join(", ")
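
The IP2INT function above packs the four octets into bytes and reads them back as one big-endian 32-bit integer. A minimal standalone sketch of that conversion follows, with a shift-and-add version for comparison. The method names `ip_to_int` and `ip_to_int_shift` are illustrative and not from the talk, and `Integer(n, 10)` is used here (an assumption on my part) so an octet such as "08" is not rejected as bad octal.

```ruby
# Convert a dotted-quad IPv4 string to a 32-bit integer.
# pack("C*") writes each octet as one unsigned byte;
# unpack("N") reads those four bytes as a big-endian unsigned 32-bit integer.
def ip_to_int(ip)
  ip.split(".").map { |n| Integer(n, 10) }.pack("C*").unpack("N").first
end

# The same arithmetic written out by hand: shift left 8 bits per octet.
def ip_to_int_shift(ip)
  ip.split(".").inject(0) { |total, n| (total << 8) + Integer(n, 10) }
end

puts ip_to_int("1.2.3.4")  # => 16909060
```

Storing addresses as integers is what lets the range comparison (`low_ip <= x AND x <= high_ip`) work in SQL.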

  122. A Binary Tree
    RBTree

  126. Sometimes you Just
    Need the big Guns
    ‣Binary search
    and binary trees
    are pretty big
    guns in
    computing
    ‣RBTree provides
    a super efficient
    binary tree
    implementation
    ‣It’s written in C

  132. Another Solution:
    IP to Country
    ‣This time we will use a real binary
    search
    ‣But we don’t have to write it
    ‣This drops the search time below 1/1,000th of
    a second
    ‣We will Marshal RBTree to build our
    persistent database
    ‣We will use RBTree’s bounds search
    methods to perform the search

  134. The Same Setup
    #!/usr/bin/env ruby -wKU

    require "open-uri"
    require "zlib"

    require "rubygems"
    require "faster_csv"
    require "rbtree"

    REMOTE_DB = "http://software77.net/cgi-bin/" +
                "ip-country/geo-ip.pl?action=download"
    LOCAL_DB = "country_ips.marshal"

    File.unlink(LOCAL_DB) if ARGV.delete("-r") and
                             File.exist? LOCAL_DB

    # ...

  138. An Easier Load
    # ...

    unless File.exist? LOCAL_DB
      ips = RBTree.new
      open(REMOTE_DB) do |url|
        Zlib::GzipReader.new(url).each do |line|
          next if line =~ /\A\s*(?:#|\z)/
          low, high, country = FCSV.parse_line(line).values_at(0..1, 6)
          ips[Integer(low)] = [Integer(high), country]
        end
      end
      File.open(LOCAL_DB, "wb") { |file| Marshal.dump(ips, file) }
    end

    # ...
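
The persistence step above is just `Marshal.dump` into a file and `Marshal.load` back out. Here is a minimal sketch of that round trip using only the standard library, with a plain Hash standing in for the RBTree. The helper name `marshal_round_trip` and the sample values are made up for illustration.

```ruby
require "tmpdir"

# Dump an object to a file and load it back, as the slide does for the tree.
def marshal_round_trip(object)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "country_ips.marshal")
    # "wb"/"rb" keep the byte stream intact on platforms that translate newlines.
    File.open(path, "wb") { |file| Marshal.dump(object, file) }
    File.open(path, "rb") { |file| Marshal.load(file) }
  end
end

# low IP => [high IP, country], the same shape the slide stores.
ips = { 10 => [19, "AA"], 30 => [39, "BB"] }
p marshal_round_trip(ips) == ips  # => true
```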

  142. A Much
    Faster Search
    # ...

    ips = File.open(LOCAL_DB, "rb") { |file| Marshal.load(file) }
    loop do
      print "IP address? "
      ip = gets.to_s.strip
      if ip =~ /\S/
        ip_int = ip.split(".").map { |n| Integer(n) }.
                    pack("C*").unpack("N").first
        match = ips.upper_bound(ip_int)
        puts match && ip_int <= match.last.first ? match.last.last :
                                                   "Unknown"
      else
        break
      end
    end
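
As the code above uses it, `upper_bound` hands back the entry with the greatest low address not above the one being looked up. The script then only has to check the stored high address. The sketch below imitates that lookup without the rbtree gem, swapping in a sorted Array and `Array#bsearch_index`. It is a stand-in for the idea, not RBTree's implementation, and `RANGES` and `country_for` are hypothetical names with made-up data.

```ruby
# [low, high, country] rows sorted by low; the values are invented.
RANGES = [
  [10, 19, "AA"],
  [30, 39, "BB"],
  [40, 49, "CC"]
].freeze

def country_for(ranges, ip_int)
  # Index of the first range that starts after ip_int (nil if there is none).
  after = ranges.bsearch_index { |range| range[0] > ip_int }
  # The range just before it has the greatest low <= ip_int.
  candidate =
    if after.nil?
      ranges.last
    elsif after.zero?
      nil
    else
      ranges[after - 1]
    end
  # A gap between ranges means the candidate's high is below ip_int.
  candidate && ip_int <= candidate[1] ? candidate[2] : "Unknown"
end

puts country_for(RANGES, 35)  # => BB
puts country_for(RANGES, 25)  # => Unknown
```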

  145. Other Nice
    Features
    ‣RBTree is pretty much a drop-in
    replacement for a Hash you want to
    keep ordered by keys
    ‣Just having RBTree available magically
    speeds up Ruby’s SortedSet (over 15
    times faster for simple iteration) in the
    standard library

  147. An Ordered Hash
    #!/usr/bin/env ruby -wKU

    require "rubygems"
    require "rbtree"

    ordered_hash = RBTree.new

    ordered_hash[2] = "two"
    ordered_hash[1] = "one"
    ordered_hash[3] = "three"

    ordered_hash.each do |key, value|
      puts "#{key}: #{value}"
    end
    # >> 1: one
    # >> 2: two
    # >> 3: three
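
For contrast, here is roughly what the same inserts look like with a plain Hash, assuming a Ruby (1.9 or later) where Hash iterates in insertion order. Getting key order then means sorting on demand, which is the work a tree avoids by staying ordered as it grows. The helper name `in_key_order` is illustrative.

```ruby
# Sort a Hash's pairs by key each time ordered iteration is needed.
def in_key_order(hash)
  hash.sort
end

plain = {}
plain[2] = "two"
plain[1] = "one"
plain[3] = "three"

p plain.keys                       # => [2, 1, 3]
p in_key_order(plain).map(&:first) # => [1, 2, 3]
```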

  149. Magically
    Improving
    SortedSet
    #!/usr/bin/env ruby -wKU

    # pass an argument, like --rbtree, and this sets up the load
    require "rubygems" unless ARGV.empty?

    require "set"
    dictionary = SortedSet.new
    File.foreach("/usr/share/dict/words") do |word|
      dictionary << word.strip if word =~ /\S/
    end

    start = Time.now
    dictionary.to_a # force the set into order
    puts "Time to order: #{Time.now - start}"

  151. The Filesystem as a Hash
    FSDB

  155. Stay Flexible
    ‣Data can be in
    many different
    formats and
    related in many
    different ways
    ‣FSDB gives you
    a lot of
    flexibility in
    these areas
    ‣Get from: http://redshift.sourceforge.net/fsdb/

  160. Problem:
    Server Monitoring
    ‣At my job we do a
    lot of server
    monitoring
    ‣We collect various
    statistics from servers
    at regular intervals
    ‣We later analyze this
    data for spikes and
    trends
    ‣Time series data is
    one thing RDBMSs
    don’t do well

  166. A Solution
    ‣Store the data so it is easy to focus in
    on the parts that matter to you now
    ‣FSDB is essentially a Hash backed by
    the filesystem
    ‣This allows you to use paths to drill down to
    subsets of the data
    ‣It avoids irrelevant data and even the need for
    an index in some cases
    ‣Techniques like this improved our
    graphing speed from almost four
    seconds to well under one
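
To make "a Hash backed by the filesystem" concrete, here is a small sketch of the idea using only the standard library. `TinyPathStore` is a hypothetical stand-in written for this example. It is not FSDB and does not reproduce FSDB's API, locking, or formats. It only shows how path-shaped keys let a query touch one directory instead of the whole data set.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical illustration of a path-keyed store; not FSDB itself.
class TinyPathStore
  def initialize(root)
    @root = root
  end

  # Store any Marshal-able object at a slash-separated path.
  def []=(path, object)
    file = File.join(@root, path)
    FileUtils.mkdir_p(File.dirname(file))
    File.open(file, "wb") { |f| Marshal.dump(object, f) }
  end

  # A file path loads the object; a directory path lists its children.
  def [](path)
    file = File.join(@root, path)
    if File.directory?(file)
      (Dir.entries(file) - %w[. ..]).sort
    elsif File.file?(file)
      File.open(file, "rb") { |f| Marshal.load(f) }
    end
  end
end

Dir.mktmpdir do |dir|
  db = TinyPathStore.new(dir)
  db["2008/09/04/12/00.obj"] = { :load_average => 70 }
  db["2008/09/04/13/00.obj"] = { :load_average => 20 }
  p db["2008/09/04"]  # => ["12", "13"]
end
```

Asking for one day lists only that day's hours. Nothing stored under other dates is read.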

  168. FSDB Structure

  172. Creating a
    Database
    #!/usr/bin/env ruby -wKU

    require "fsdb"

    module TimeSeries
      DB = FSDB::Database.new("server_stats/")

      module_function
      def record(data, time = Time.now)
        DB[time.strftime("%Y/%m/%d/%H/%M.obj")] = data
      end
      # ...
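
The key passed to `DB[...]` above is just a formatted timestamp, so each level of the date becomes a directory. A quick standalone check of what that format string produces (the helper name `stats_key` is illustrative):

```ruby
# Build the storage key for a reading taken at the given time.
def stats_key(time)
  time.strftime("%Y/%m/%d/%H/%M.obj")
end

key = stats_key(Time.local(2008, 9, 4, 12, 0))
puts key                # => 2008/09/04/12/00.obj
puts File.dirname(key)  # => 2008/09/04/12
```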

  177. Query
      # ...
      def sum(year, *args, &block)
        total = 0
        path = [year, *args][0..5].join("/").gsub(/\b\d\b/, '0\0')
        if File.extname(path) == ".obj"
          total += block[DB[path]]
        else
          (DB[path] || []).each do |new_path|
            total += sum(File.join(path, new_path), &block)
          end
        end
        total
      end
      def average(*args)
        count = 0
        sum(*args) { |data| count += 1; yield data } / count.to_f
      end
    end

    # ...
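
The `gsub(/\b\d\b/, '0\0')` call in `sum` zero-pads any single-digit path component so that arguments like `2008, 9, 4` line up with the `%m/%d` directories written by `record`. A small sketch of just that step, under the hypothetical name `stats_path`:

```ruby
# Join date parts into a path, padding lone digits to two characters.
# \b\d\b matches a digit with a word boundary on both sides, so "9" is
# padded but the digits inside "2008" or "12" are left alone.
def stats_path(*parts)
  parts.join("/").gsub(/\b\d\b/, '0\0')
end

puts stats_path(2008, 9, 4)       # => 2008/09/04
puts stats_path(2008, 12, 25, 7)  # => 2008/12/25/07
```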

  179. Sample Usage
    # ...

    if __FILE__ == $PROGRAM_NAME
      include TimeSeries
      record( { :load_average => 70,
                :disk_free    => 50 },
              Time.local(2008, 9, 4, 12, 0) )
      record( { :load_average => 10,
                :disk_free    => 50 },
              Time.local(2008, 9, 4, 12, 30) )
      record( { :load_average => 20,
                :disk_free    => 51 },
              Time.local(2008, 9, 4, 13, 0) )
      p average(2008, 9, 4) { |data| data[:load_average] }
      # >> 33.3333333333333
      p average(2008, 9, 4, 12) { |data| data[:load_average] }
      # >> 40.0
    end
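
The two averages above can be checked by hand: the whole day holds 70, 10 and 20, and the 12 o'clock hour holds only 70 and 10. The sketch below reproduces the arithmetic in memory, with a Hash keyed by the same paths standing in for the database. `SAMPLES` and `average_under` are illustrative names, and selecting by string prefix is a simplification of the recursive directory walk in `sum`.

```ruby
SAMPLES = {
  "2008/09/04/12/00.obj" => { :load_average => 70, :disk_free => 50 },
  "2008/09/04/12/30.obj" => { :load_average => 10, :disk_free => 50 },
  "2008/09/04/13/00.obj" => { :load_average => 20, :disk_free => 51 }
}.freeze

# Average the block's value over every sample stored under the prefix.
def average_under(samples, prefix)
  values = samples.select { |path, _data| path.start_with?(prefix + "/") }
                  .map { |_path, data| yield data }
  values.inject(0) { |total, value| total + value } / values.size.to_f
end

p average_under(SAMPLES, "2008/09/04/12") { |data| data[:load_average] }  # => 40.0
```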

  183. Other Nice
    Features
    ‣FSDB is multi-thread and multi-process
    safe on most platforms
    ‣It supports read-only and read/write
    transactions, and they can even be
    nested
    ‣You can define your own formats for
    files

  187. Transactions
    #!/usr/bin/env ruby -wKU

    require "fsdb"
    db = FSDB::Database.new("server_stats/")

    # a read only transaction (shared lock)
    db.browse "2008/09/04/12/00.obj" do |data|
      p data[:load_average]
      # >> 70
      p data[:disk_free]
      # >> 50
    end

    # a read/write transaction (exclusive lock)
    db.replace "2008/09/04/12/00.obj" do |data|
      data.merge(:uptime => 21 * 60)
    end

    p db["2008/09/04/12/00.obj"][:uptime]
    # >> 1260
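
The shared versus exclusive distinction in the comments above is the usual reader/writer locking pattern. As a rough illustration of that pattern only, here is a sketch using Ruby's own `File#flock`. It makes no claim about how FSDB implements its locks, and `locked_read` and `locked_update` are hypothetical helpers.

```ruby
require "tempfile"

# Read under a shared lock: other readers may hold it at the same time.
def locked_read(path)
  File.open(path, "r") do |file|
    file.flock(File::LOCK_SH)
    file.read
  end # closing the file releases the lock
end

# Rewrite under an exclusive lock: no other reader or writer may hold it.
def locked_update(path)
  File.open(path, "r+") do |file|
    file.flock(File::LOCK_EX)
    updated = yield file.read
    file.rewind
    file.truncate(0)
    file.write(updated)
  end
end

Tempfile.create("load_average") do |tmp|
  tmp.write("70")
  tmp.flush
  locked_update(tmp.path) { |text| (text.to_i + 5).to_s }
  puts locked_read(tmp.path)  # => 75
end
```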

  191. A Custom Format
    #!/usr/bin/env ruby -wKU

    require "fsdb"
    db = FSDB::Database.new("images/")

    PNG_FORMAT = FSDB::Format.new(
      /\.png\z/i, :binary,
      :name => "PNG_FORMAT",
      :load => lambda { |f| f.seek(16)
                            w, h = f.read(8).unpack("N2")
                            {:width => w, :height => h} },
      :dump => lambda { raise "Read only format." }
    )
    db.formats = [PNG_FORMAT]

    db.browse_each_child "/" do |image, details|
      puts "%s: %p" % [image, details]
    end
    # >> /fsdb_tree.png: {:width=>727, :height=>249}
    # >> /ip_to_country_quiz.png: {:width=>798, :height=>732}
    # >> /scout.png: {:width=>865, :height=>713}
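
The `:load` lambda works because of how a PNG file begins: an 8-byte signature, then the IHDR chunk's 4-byte length and 4-byte type, which puts the big-endian width and height at byte offset 16. The sketch below applies the same seek-and-unpack to a hand-built header in memory, so no image file is needed. `png_dimensions` is an illustrative name, and the 24 bytes built here are only a header fragment, not a valid image.

```ruby
require "stringio"

PNG_SIGNATURE = "\x89PNG\r\n\x1a\n".b  # the fixed 8-byte PNG signature

# Read width and height from the start of a PNG stream.
def png_dimensions(io)
  io.seek(16)                                # skip signature + IHDR length/type
  width, height = io.read(8).unpack("N2")    # two big-endian 32-bit integers
  { :width => width, :height => height }
end

# signature (8) + chunk length (4) + "IHDR" (4) + width and height (8)
header  = PNG_SIGNATURE + [13].pack("N") + "IHDR".b + [727, 249].pack("N2")
details = png_dimensions(StringIO.new(header))
puts "#{details[:width]}x#{details[:height]}"  # => 727x249
```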

  193. Dirt Simple IPC
    Rinda

  196. The Importance
    of Networking
    ‣When you need
    more processing
    power, you have
    to start hooking
    CPUs together
    ‣Rinda can make
    the communication
    between
    processes a snap

  200. Problem:
    Descrambling
    ‣Given some scrambled letters, find all
    dictionary words that can be formed by
    rearranging those letters
    ‣This task represents any task that just needs
    some processing time to sort out
    ‣All this requires is some I/O and simple
    comparisons

  203. The Trivial
    Solution
    #!/usr/bin/env ruby -wKU

    class String
      def signature
        strip.downcase.delete("^a-z").split("").sort.join
      end
    end

    pattern = ARGV.shift.signature
    descrambled = [ ]

    File.foreach("/usr/share/dict/words") do |word|
      descrambled << word if word.signature == pattern
    end

    puts descrambled
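
The whole solution rests on the signature: two words are rearrangements of each other exactly when their sorted letters match. Here is a small sketch of that property on a hand-picked word list, written as a plain method rather than a `String` extension. Grouping the list by signature up front is a variation added for illustration, not something the slide does.

```ruby
# Letters of the word, lowercased and sorted; anagrams share this string.
def signature(word)
  word.strip.downcase.delete("^a-z").chars.sort.join
end

words = %w[listen silent enlist google tinsel banana]
by_signature = words.group_by { |word| signature(word) }

p signature("Listen\n")               # => "eilnst"
p by_signature[signature("inlets")]   # => ["listen", "silent", "enlist", "tinsel"]
```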

  208. Dividing the Work
    ‣Heavy processing almost always
    benefits from more processes doing the
    work
    ‣This is true multiprocessing, unlike Ruby’s
    thread model
    ‣Rinda’s TupleSpace makes the Inter-
    Process Communication super easy
    ‣This task is I/O bound, but putting
    four processes on it still halved
    the time

  210. Setup
    #!/usr/bin/env ruby -wKU

    require "rinda/tuplespace"

    class String
      def signature
        strip.downcase.delete("^a-z").split("").sort.join
      end
    end

    DICT = "/usr/share/dict/words"
    workers = ARGV.first =~ /\A\d+\z/ ? ARGV.shift.to_i : 4
    pattern = ARGV.shift.signature
    chunk_size = File.stat(DICT).size / workers

    # ...

  213. Spawn Workers
    # ...

    workers.times do |n|
      fork do
        descrambled = [ ]
        File.open(DICT) do |words|
          my_start = chunk_size * n
          my_end = my_start + chunk_size
          words.seek(my_start)
          words.gets unless my_start.zero?
          words.each do |word|
            descrambled << word if word.signature == pattern
            break if words.pos > my_end
          end
        end

        results = Rinda::TupleSpaceProxy.new(
          DRbObject.new_with_uri("druby://localhost:61676")
        )
        results.write([pattern, descrambled])
      end
    end

    # ...
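
The interesting part of each worker is how it claims lines from a byte range that almost never starts on a line boundary: seek to the offset, throw away the partial line, then read until the position passes the end of the chunk. The sketch below isolates that logic so it can be checked without forking. It is a variation on the slide rather than a copy. The loop tests the position before reading each line, and the last chunk runs to the end of the file. Both are my own choices, made so every line is claimed exactly once even when the file size does not divide evenly. `lines_in_chunk` is a hypothetical name.

```ruby
require "tempfile"

# Return the lines whose first byte falls inside chunk n of the file.
def lines_in_chunk(path, n, workers)
  size       = File.size(path)
  chunk_size = size / workers
  my_start   = chunk_size * n
  my_end     = n == workers - 1 ? size : my_start + chunk_size
  lines = []
  File.open(path) do |file|
    file.seek(my_start)
    # The line in progress at my_start belongs to the previous chunk.
    file.gets unless my_start.zero?
    while file.pos <= my_end && (line = file.gets)
      lines << line
    end
  end
  lines
end

Tempfile.create("words") do |tmp|
  tmp.write(%w[a bee sea dee eee eff gee aitch eye jay].join("\n") + "\n")
  tmp.flush
  claimed = (0...4).map { |n| lines_in_chunk(tmp.path, n, 4) }
  p claimed.flatten == File.readlines(tmp.path)  # => true
end
```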

  217. Collect Results
    # ...

    results = Rinda::TupleSpace.new
    DRb.start_service("druby://localhost:61676", results)
    workers.times do
      descrambled = results.take([/\b#{Regexp.escape(pattern)}\b/, Array])
      puts descrambled.last
    end
    Process.waitall
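
The write and take handshake can be tried in a single process, without DRb or `fork`, which makes the template matching easier to see. In this sketch threads stand in for the worker processes. `gather` is a hypothetical helper and the word lists are made up. It assumes the `rinda` library is available to `require`, as it is in a standard Ruby install.

```ruby
require "rinda/tuplespace"

# Have one thread per chunk write its findings, then collect them all.
def gather(chunks, pattern)
  space = Rinda::TupleSpace.new
  writers = chunks.map do |found|
    Thread.new { space.write([pattern, found]) }
  end
  # take blocks until a tuple matching the template arrives, then removes it.
  # Template elements are compared with ===, so a Regexp and a Class both work.
  results = chunks.size.times.map do
    space.take([/\A#{Regexp.escape(pattern)}\z/, Array]).last
  end
  writers.each(&:join)
  results
end

# "aet" is the sorted-letter signature shared by "ate", "eat" and "tea".
puts gather([["ate"], [], ["eat", "tea"]], "aet").flatten.sort
```

Because `take` blocks, the collector needs no polling or explicit synchronization. It simply asks for as many tuples as there are workers.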

  220. Other Nice
    Features
    ‣You can set expiration times for tuples
    added to a TupleSpace
    ‣Rinda also comes with a RingServer for
    zero configuration networking
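
Expiration is set with the optional second argument to `TupleSpace#write`, given in seconds. Below is a brief in-process sketch of a tuple that disappears on its own while another stays. The tuple contents and the helper name `live_count` are invented for the example, and the timings are deliberately generous.

```ruby
require "rinda/tuplespace"

# Count the live tuples whose first element is the given kind.
def live_count(space, kind)
  space.read_all([kind, nil]).size
end

space = Rinda::TupleSpace.new
space.write(["heartbeat", "worker-1"], 0.2)  # expires after about 0.2 seconds
space.write(["result", 42])                  # no expiration given

fresh = live_count(space, "heartbeat")
sleep 0.5
stale = live_count(space, "heartbeat")
p [fresh, stale, live_count(space, "result")]
```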

  224. Using RingServer
    #!/usr/bin/env ruby -wKU

    require "rinda/ring"       # for RingServer
    require "rinda/tuplespace" # for TupleSpace

    # start a RingServer
    DRb.start_service
    Rinda::RingServer.new(Rinda::TupleSpace.new)

    # ...

    #!/usr/bin/env ruby -wKU

    require "rinda/ring"       # for RingFinger
    require "rinda/tuplespace" # for TupleSpace

    # find a RingServer
    DRb.start_service
    ring_server = Rinda::RingFinger.primary

    # ...

  225. Questions?
