‣Created the Ruby Quiz
‣Released FasterCSV, HighLine, and Elif
‣Wrote a couple of Pragmatic Programmer books with a lot of Ruby in them
‣I’ve given a talk at every LSRC so far
‣We can always write C extensions for speed-critical portions
‣This is rarely actually needed, though
‣We can make use of libraries, some written in C, that help with our problem
‣We can add more processing power
‣We can rework our data structures to better support the task at hand
‣Always the big win, in my opinion
generate images for one project
‣The PPM code was taking about 1.3 seconds for a 400 by 200 pixel image
‣I replaced a two-dimensional Array of Color objects with a 3D NArray
‣I changed less than ten lines of code
‣The speed on the same image dropped to about 1/100th of a second
numbers in various sizes
‣Aside from indexing and iteration, NArray supports data generation, arithmetic operations, comparisons, bitwise manipulations, statistical calculations, and more
‣View the large API, with examples, at: http://narray.rubyforge.org/
country for that IP
‣This was a Ruby Quiz
‣Solutions were also expected to be efficient in memory and speed
‣This is a real-world task I’ve had to do for my job
the file
‣Most of those preprocessed the file to make that search easier
‣I’m going to show a SQLite solution
‣It’s very close to the same speed (about 1/3rd of a second to look up an IP)
‣I didn’t have to be clever or even add an index
‣It was easier for me to use full country names
db = SQLite3::Database.new(LOCAL_DB)
db.execute(<<-END_TABLE.strip)
  CREATE TABLE ips (
    low_ip  INTEGER,
    high_ip INTEGER,
    country TEXT
  )
END_TABLE
open(REMOTE_DB) do |url|
  Zlib::GzipReader.new(url).each do |line|
    next if line =~ /\A\s*(?:#|\z)/
    args = FCSV.parse_line(line).values_at(0..1, 6)
    db.execute(<<-END_INSERT.strip, *args)
      INSERT INTO ips(low_ip, high_ip, country)
      VALUES(?, ?, ?)
    END_INSERT
  end
end
db.close

# ...
query results in a Hash to index by column name and/or convert column values to Ruby objects based on type
‣It can work with in-memory databases
‣It can run queries across tables in multiple database files
‣You can define SQL functions and aggregates for it in Ruby code
a real binary search
‣But we don’t have to write it
‣This drops the search time below 1/1,000th of a second
‣We will Marshal the RBTree to build our persistent database
‣We will use RBTree’s bounds search methods to perform the search
ips = RBTree.new
open(REMOTE_DB) do |url|
  Zlib::GzipReader.new(url).each do |line|
    next if line =~ /\A\s*(?:#|\z)/
    low, high, country = FCSV.parse_line(line).values_at(0..1, 6)
    ips[Integer(low)] = [Integer(high), country]
  end
end
File.open(LOCAL_DB, "wb") { |file| Marshal.dump(ips, file) }

# ...
replacement for a Hash you want to keep ordered by keys
‣Just having RBTree available magically speeds up Ruby’s SortedSet (over 15 times faster for simple iteration) in the standard library
argument, like --rbtree, and this sets up the load

require "rubygems" unless ARGV.empty?

require "set"
dictionary = SortedSet.new
File.foreach("/usr/share/dict/words") do |word|
  dictionary << word.strip if word =~ /\S/
end

start = Time.now
dictionary.to_a  # force the set into order
puts "Time to order: #{Time.now - start}"
of server monitoring
‣We collect various statistics from servers at regular intervals
‣We later analyze this data for spikes and trends
‣Time-series data is one thing an RDBMS doesn’t handle well
focus in on the parts that matter to you now
‣FSDB is essentially a Hash backed by the filesystem
‣This allows you to use paths to drill down to subsets of the data
‣It avoids irrelevant data, and even the need for an index, in some cases
‣Techniques like this improved our graphing speed from almost four seconds to well under one
module TimeSeries
  DB = FSDB::Database.new("server_stats/")

  module_function

  def record(data, time = Time.now)
    DB[time.strftime("%Y/%m/%d/%H/%M.obj")] = data
  end
  # ...
  path = [year, *args][0..5].join("/").gsub(/\b\d\b/, '0\0')
  if File.extname(path) == ".obj"
    total += block[DB[path]]
  else
    (DB[path] || []).each do |new_path|
      total += sum(File.join(path, new_path), &block)
    end
  end
  total
end

def average(*args)
  count = 0
  sum(*args) { |data| count += 1; yield data } / count.to_f
end
end

# ...
# a read only transaction (shared lock)
db.browse "2008/09/04/12/00.obj" do |data|
  p data[:load_average]  # >> 70
  p data[:disk_free]     # >> 50
end

# a read/write transaction (exclusive lock)
db.replace "2008/09/04/12/00.obj" do |data|
  data.merge(:uptime => 21 * 60)
end

p db["2008/09/04/12/00.obj"][:uptime]  # >> 1260
that can be formed by rearranging those letters
‣This task represents any task that just needs some processing time to sort out
‣All this requires is some I/O and simple comparisons
  def signature
    strip.downcase.delete("^a-z").split("").sort.join
  end
end

pattern     = ARGV.shift.signature
descrambled = [ ]

File.foreach("/usr/share/dict/words") do |word|
  descrambled << word if word.signature == pattern
end

puts descrambled
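The signature trick works because two words are anagrams exactly when they reduce to the same canonical form; a self-contained sketch of the same String extension, with made-up sample words:

```ruby
class String
  # canonical form: lowercase letters only, sorted
  def signature
    strip.downcase.delete("^a-z").split("").sort.join
  end
end

p "Listen".signature                         # => "eilnst"
p "Silent".signature                         # => "eilnst"
p "Listen".signature == "Silent".signature   # => true
```

Sorting the letters means the dictionary scan needs only a single string comparison per word.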
processes doing the work
‣This is true multiprocessing, unlike Ruby’s thread model
‣Rinda’s TupleSpace makes the Inter-Process Communication super easy
‣This task is I/O bound, but putting four processes on it still halved the time
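A minimal in-process sketch of the TupleSpace pattern (with DRb, the same space can be shared across real worker processes; the tuple contents here are invented):

```ruby
require "rinda/tuplespace"

ts = Rinda::TupleSpace.new

# a producer writes work tuples into the space
%w[apple lemon].each { |word| ts.write([:task, word]) }

# workers take matching tuples; take removes the tuple atomically,
# so no two workers can grab the same job
results = [ ]
2.times do
  _, word = ts.take([:task, nil])
  results << word
end
p results.sort  # => ["apple", "lemon"]
```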