Understanding Unix pipes with Ruby

Sergio Gil
October 14, 2016

Slides from my talk at Conferencia Rails 2016. The examples can be found at https://github.com/porras/pipes

Transcript

  1. ls -l

    total 75096
    -rw-r--r-- 1 sergio staff 235 Sep 29 00:30 -.8x
    -rw-r--r-- 1 sergio staff 235 Sep 29 00:31 -rwxr-xr-x
    -rw-r--r-- 1 sergio staff 1282 Sep 4 2015 BSDL
    -rw-r--r-- 1 sergio staff 203 Sep 4 2015 CONTRIBUTING.md
    -rw-r--r-- 1 sergio staff 2502 Sep 29 00:29 COPYING
    -rw-r--r-- 1 sergio staff 2624 Sep 4 2015 COPYING.ja
    -rw-r--r-- 1 sergio staff 312287 Oct 11 08:05 ChangeLog
    -rw-r--r-- 1 sergio staff 218 Sep 29 00:39 GNUmakefile
    -rw-r--r-- 1 sergio staff 18092 Sep 4 2015 GPL
    -rw-r--r-- 1 sergio staff 196 Sep 29 00:29 KNOWNBUGS.rb
    -rw-r--r-- 1 sergio staff 29083 Sep 29 00:29 LEGAL
    -rw-r--r-- 1 sergio staff 17402 Sep 29 00:39 Makefile
    -rw-r--r-- 1 sergio staff 16520 Sep 29 00:29 Makefile.in
    -rw-r--r-- 1 sergio staff 7196 Oct 11 08:05 NEWS
    -rw-r--r-- 1 sergio staff 28 Sep 4 2015 README.EXT
    -rw-r--r-- 1 sergio staff 43 Sep 4 2015 README.EXT.ja
  2. ls -l | grep ^-........x

    -rwxr-xr-x 1 sergio staff 33271 Sep 29 00:39 config.status
    -rwxr-xr-x 1 sergio staff 731437 Sep 29 00:33 configure
    -rwxr-xr-x 1 sergio staff 3166548 Sep 29 00:40 miniruby
    -rwxr-xr-x 1 sergio staff 3164796 Sep 29 00:45 ruby
  3. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+
  4. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+

    prog1 | prog2
  5. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+

    ruby prog1.rb | ruby prog2.rb
  6. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+

    ruby prog1.rb | ruby prog2.rb

    (1..5).each do |i|
      $stderr.puts "Sending #{i}"
      $stdout.puts i
    end
    $stderr.puts "Done sending"
  7. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+

    ruby prog1.rb | ruby prog2.rb

    (1..5).each do |i|
      $stderr.puts "Sending #{i}"
      $stdout.puts i
    end
    $stderr.puts "Done sending"

    $stdin.each do |line|
      puts "Received #{line}"
    end
    puts "Done receiving"
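The wiring that `ruby prog1.rb | ruby prog2.rb` sets up can be approximated from Ruby itself. The following is a minimal sketch, assuming a Unix-like system: it uses `IO.pipe` and `spawn` to connect one child's standard output to another child's standard input, roughly what a shell does for `|`. The inline `-e` scripts and the second pipe (used only to capture the result) are illustrative choices, not part of the talk.

```ruby
require "rbconfig"

ruby = RbConfig.ruby              # path to the running Ruby interpreter

pipe_r, pipe_w = IO.pipe          # connects producer stdout to consumer stdin
out_r, out_w = IO.pipe            # hypothetical extra pipe, only to capture the consumer's output

producer = spawn(ruby, "-e", '(1..3).each { |i| $stdout.puts i }', out: pipe_w)
consumer = spawn(ruby, "-e", '$stdin.each { |line| puts "Received #{line}" }',
                 in: pipe_r, out: out_w)

# The parent must close its own copies of the pipe ends,
# otherwise the consumer would never see end-of-file.
pipe_w.close
pipe_r.close
out_w.close

output = out_r.read               # returns once the consumer has exited
Process.wait(producer)
Process.wait(consumer)

puts output
```

Closing the parent's copy of the write end matters: a reader only sees end-of-file once every writer has closed its end of the pipe.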
  8. Output:

    Sending 1
    Sending 2
    Sending 3
    Sending 4
    Sending 5
    Done sending
    Received 1
    Received 2
    Received 3
    Received 4
    Received 5
    Done receiving
  9. Output (two orderings):

    Sending 1
    Sending 2
    Sending 3
    Sending 4
    Sending 5
    Done sending
    Received 1
    Received 2
    Received 3
    Received 4
    Received 5
    Done receiving

    Sending 1
    Received 1
    Sending 2
    Received 2
    Sending 3
    Received 3
    Sending 4
    Received 4
    Sending 5
    Received 5
    Done sending
    Done receiving
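The two orderings differ only in when each line actually crosses the pipe. One factor that can influence this is output buffering (an assumption here; the slides do not state the cause): Ruby keeps written data in a userspace buffer unless the IO object's `sync` flag is set, while `$stderr` is unbuffered by default. A small sketch with `IO.pipe`, where `sync` is switched off on the write end to mimic a buffered `$stdout`:

```ruby
reader, writer = IO.pipe
writer.sync = false   # IO.pipe turns sync on for the write end; turn it off to mimic buffered output

writer.write("1\n")   # small write: stays in Ruby's userspace buffer
before_flush = IO.select([reader], nil, nil, 0.1)   # waits up to 0.1s for readable data

writer.flush          # hand the buffered bytes to the kernel pipe
after_flush = IO.select([reader], nil, nil, 0.1)

line = reader.gets

puts "readable before flush: #{!before_flush.nil?}"
puts "readable after flush:  #{!after_flush.nil?}"
puts "received: #{line.inspect}"

writer.close
reader.close
```

Setting `$stdout.sync = true` in the sending program, or calling `$stdout.flush` after each line, is the usual way to make every line visible to the reader as soon as it is written.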
  10. Code:

    def ls(options = {})
      Enumerator.new do |e|
        open("| ls #{'-l' if options[:long]}").each do |l|
          puts "Reading #{l.inspect}"
          e << l
        end
      end
    end
  11. Code:

    def grep(input, regex)
      Enumerator.new do |e|
        input.each do |l|
          puts "Filtering #{l.inspect}"
          e << l if l =~ regex
        end
      end
    end
  12. Code:

    def count(input)
      total = 0
      input.each do |l|
        puts "Counting #{l.inspect}"
        total += 1
      end
      total
    end
  13. count(grep(ls(long: true), /^-.{8}x/))

    Reading "total 24\n"
    Filtering "total 24\n"
    Reading "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
  14. count(grep(ls(long: true), /^-.{8}x/))

    Reading "total 24\n"
    Filtering "total 24\n"
    Reading "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
  15. count(grep(ls(long: true), /^-.{8}x/))

    Reading "total 24\n"
    Filtering "total 24\n"
    Reading "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
  16. count(grep(ls(long: true), /^-.{8}x/))

    Reading "total 24\n"
    Filtering "total 24\n"
    Reading "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
  17. count(grep(ls(long: true), /^-.{8}x/))

    Reading "total 24\n"
    Filtering "total 24\n"
    Reading "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
  18. count(grep(ls(long: true), /^-.{8}x/))

    Reading "total 24\n"
    Filtering "total 24\n"
    Reading "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Reading "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
  19. count(grep(ls(long: true), /^-.{8}x/))

    Reading "total 24\n"
    Filtering "total 24\n"
    Reading "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Reading "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
    Filtering "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
  20. count(grep(ls(long: true), /^-.{8}x/))

    Reading "total 24\n"
    Filtering "total 24\n"
    Reading "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Reading "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
    Filtering "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
    Counting "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
  21. count(grep(ls(long: true), /^-.{8}x/))

    Reading "total 24\n"
    Filtering "total 24\n"
    Reading "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Reading "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
    Filtering "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
    Counting "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
  22. count(grep(ls(long: true), /^-.{8}x/))

    Reading "total 24\n"
    Filtering "total 24\n"
    Reading "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 260 Oct 7 00:34 1.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 525 Oct 7 00:40 2.rb\n"
    Reading "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Filtering "-rw-r--r-- 1 sergio staff 0 Oct 7 00:22 3.rb\n"
    Reading "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
    Filtering "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
    Counting "-rwxr-xr-x 1 sergio staff 22 Oct 7 00:26 dummmy\n"
    1
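The trace above shows each line travelling through all stages before the next line is read. This can be reproduced without shelling out to `ls`. The sketch below is a variation on the slides' `ls`/`grep`/`count`, assuming an in-memory array as the source; the `source` helper, the `EVENTS` log and the shortened sample lines are hypothetical stand-ins.

```ruby
EVENTS = []   # hypothetical log, used instead of printing from inside each stage

# Stand-in for the slides' `ls`: yields lines from an array instead of a subprocess.
def source(lines)
  Enumerator.new do |e|
    lines.each do |l|
      EVENTS << "Reading #{l.inspect}"
      e << l
    end
  end
end

def grep(input, regex)
  Enumerator.new do |e|
    input.each do |l|
      EVENTS << "Filtering #{l.inspect}"
      e << l if l =~ regex
    end
  end
end

def count(input)
  total = 0
  input.each do |l|
    EVENTS << "Counting #{l.inspect}"
    total += 1
  end
  total
end

lines = ["-rw-r--r-- 1.rb\n", "-rwxr-xr-x dummmy\n"]
total = count(grep(source(lines), /^-.{8}x/))

puts EVENTS
puts total
```

Nothing runs until `count` starts iterating: each `Enumerator` only pulls from the previous stage on demand, which is why reading, filtering and counting interleave line by line.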
  23. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+

    ruby prog1.rb | ruby prog2.rb
  24. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+

    ruby prog1.rb | ruby prog2.rb

    (1..5).each do |i|
      $stderr.puts "‘Calculating’ #{i}"
      sleep 1
      $stderr.puts "Sending #{i}"
      $stdout.puts i
    end
    $stderr.puts "Done sending"
  25. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+

    ruby prog1.rb | ruby prog2.rb

    (1..5).each do |i|
      $stderr.puts "‘Calculating’ #{i}"
      sleep 1
      $stderr.puts "Sending #{i}"
      $stdout.puts i
    end
    $stderr.puts "Done sending"

    $stdin.each do |line|
      puts "‘Processing’ #{line}"
      sleep 1
      puts "Processed #{line}"
    end
    puts "Done receiving"
  26. 5 items

    2 steps per item
    1 second per step
    Total: 10 seconds if every step runs one after another; 6 seconds when the two processes overlap
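The 10 versus 6 second figures follow from simple arithmetic. A small sketch, assuming every step takes the same time and the stages overlap perfectly (the method names are hypothetical): run sequentially, every item pays for every step; in a pipeline, the first item pays for every step and each later item adds only one more step.

```ruby
# Every step of every item runs one after another.
def sequential_seconds(items, stages, seconds_per_step)
  items * stages * seconds_per_step
end

# Stages overlap: the first item takes `stages` steps to come out,
# then one more item completes per step.
def pipelined_seconds(items, stages, seconds_per_step)
  (items + stages - 1) * seconds_per_step
end

puts sequential_seconds(5, 2, 1)   # 5 * 2 * 1 = 10
puts pipelined_seconds(5, 2, 1)    # (5 + 2 - 1) * 1 = 6
```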
  27. Code:

    def ls(options = {})
      output_queue = Queue.new
      Thread.new do
        open("| ls #{'-l' if options[:long]}").each do |l|
          output_queue << l
        end
        output_queue.close
      end
      output_queue
    end
  28. Code:

    def grep(input_queue, regex)
      output_queue = Queue.new
      Thread.new do
        while l = input_queue.pop
          output_queue << l if l =~ regex
        end
        output_queue.close
      end
      output_queue
    end
  29. Code:

    def count(input_queue)
      total = 0
      output_queue = Queue.new
      Thread.new do
        while l = input_queue.pop
          total += 1
        end
        output_queue << total
        output_queue.close
      end
      output_queue.pop
    end
  30. Code:

    def count(input_queue)
      total = 0
      output_queue = Queue.new
      Thread.new do
        while l = input_queue.pop
          total += 1
        end
        output_queue << total
        output_queue.close
      end
      output_queue.pop
    end

    count(grep(ls(long: true), /^-.{8}x/))
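The thread-and-queue version can be tried without an `ls` subprocess. Below is a minimal sketch along the same lines, assuming Ruby 2.3 or later (for `Queue#close`) and an in-memory array as input; the `source` helper and the sample lines are hypothetical, and `count` is simplified to consume the queue in the calling thread rather than in its own thread as on the slide.

```ruby
# Stand-in for the slides' `ls`: pushes lines from an array in its own thread.
def source(lines)
  output_queue = Queue.new
  Thread.new do
    lines.each { |l| output_queue << l }
    output_queue.close            # after close, pop returns nil once the queue is empty
  end
  output_queue
end

def grep(input_queue, regex)
  output_queue = Queue.new
  Thread.new do
    while (l = input_queue.pop)   # nil signals that the upstream stage is done
      output_queue << l if l =~ regex
    end
    output_queue.close
  end
  output_queue
end

# Simplified: counts in the caller's thread.
def count(input_queue)
  total = 0
  total += 1 while input_queue.pop
  total
end

lines = ["total 24\n", "-rw-r--r-- 1.rb\n", "-rwxr-xr-x dummmy\n"]
result = count(grep(source(lines), /^-.{8}x/))
puts result
```

Each stage runs in its own thread and hands items downstream as soon as they are ready. Closing a queue plays the role that end-of-file plays for a Unix pipe.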
  31. Diagram:

    +-----------+     +-----------+     +-----------+
    |           |     |           |     |           |
    |   300/s   +----->   100/s   +----->   100/s   |
    |           |     |           |     |           |
    +-----------+     +-----------+     +-----------+
  32. Diagram:

                      +-----------+
                      |           |
                   +-->   100/s   +--+
                   |  |           |  |
                   |  +-----------+  |
    +-----------+  |  +-----------+  |  +-----------+
    |           |  |  |           |  |  |           |
    |   300/s   +--+-->   100/s   +--+-->   300/s   |
    |           |  |  |           |  |  |           |
    +-----------+  |  +-----------+  |  +-----------+
                   |  +-----------+  |
                   |  |           |  |
                   +-->   100/s   +--+
                      |           |
                      +-----------+
  33. Diagram:

    +-----------+     +-----------+     +-----------+
    |           |     |           |     |           |
    |   300/s   +----->   100/s   +----->   100/s   |
    |           |     |           |     |           |
    +-----------+     +-----------+     +-----------+
  34. Diagram:

    +-----------+     +-----------+     +-----------+
    |           |     |           |     |           |
    |   300/s   +----->   100/s   +----->   100/s   |
    |           |     |           |     |           |
    +-----------+     +-----------+     +-----------+
  35. Diagram:

    +-----------+     +-----------+     +-----------+
    |           |     |           |     |           |
    |   300/s   +----->   100/s   +----->   100/s   |
    |           |     |           |     |           |
    +-----------+     +-----------+     +-----------+

    BACK PRESSURE
  36. Diagram:

    +-----------+     +-----------+     +-----------+
    |           |     |           |     |           |
    |   300/s   +----->   100/s   +----->   100/s   |
    |           |     |           |     |           |
    +-----------+     +-----------+     +-----------+

    BACK PRESSURE 100
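Back pressure can be illustrated in-process with a bounded queue. The sketch below uses `SizedQueue` from Ruby's standard library, which is a substitution made here for illustration: the slides' threaded example uses an unbounded `Queue`. The capacity of 3, the item count and the sleep time are arbitrary values. A fast producer blocks whenever the queue is full, so it ends up running at the pace of the slow consumer.

```ruby
CAPACITY = 3
queue = SizedQueue.new(CAPACITY)   # push blocks while the queue already holds CAPACITY items

producer = Thread.new do
  20.times { |i| queue << i }      # fast stage: wants to push as quickly as possible
  queue.close
end

received = []
max_seen = 0
loop do
  max_seen = [max_seen, queue.size].max   # how many items were waiting at this moment
  item = queue.pop                        # nil once the queue is closed and drained
  break if item.nil?
  sleep 0.005                             # slow stage
  received << item
end
producer.join

puts "received #{received.size} items, at most #{max_seen} waiting at once"
```

With an unbounded `Queue` the producer would finish immediately and all pending items would pile up in memory; the bound is what forces the fast stage to slow down to the rate of the slowest one.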
  37. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+

    ruby prog1.rb | ruby prog2.rb
  38. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+

    ruby prog1.rb | ruby prog2.rb

    (1..50000).each do |i|
      $stderr.puts "Sending #{i}" if i % 5000 == 0
      $stdout.puts i
    end
    $stderr.puts "Done sending"
  39. Diagram:

     +---------+
     |PROCESS 1|
     +---------+
    err|     |out
       |     |
       |     |
       |     |in
       |   +-v-------+
       |   |PROCESS 2|
       |   +---------+
       |  err|     |out
       |     |     |
       |     |     |
       |     |     |
     +-v-----v-----v-+
     |   TERMINAL    |
     +---------------+

    ruby prog1.rb | ruby prog2.rb

    (1..50000).each do |i|
      $stderr.puts "Sending #{i}" if i % 5000 == 0
      $stdout.puts i
    end
    $stderr.puts "Done sending"

    $stdin.each do |line|
      sleep 1.0 / 5000
      puts "Processed #{line}" if line.to_i % 5000 == 0
    end
    puts "Done receiving"
  40. Output:

    Sending 5000
    Sending 10000
    Sending 15000
    Processed 5000
    Sending 20000
    Processed 10000
    Sending 25000
    Processed 15000
    Sending 30000
    Processed 20000
    Sending 35000
    Processed 25000
    Sending 40000
    Processed 30000
    Sending 45000
    Processed 35000
    Sending 50000
    Done sending
    Processed 40000
    Processed 45000
    Processed 50000
    Done receiving
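In this output the sender gets roughly 10,000 to 15,000 lines ahead of the receiver and then stays in step with it. A plausible explanation (an assumption; the slide only shows the output) is that the kernel's pipe buffer has a limited capacity, so writes block once it is full, plus whatever Ruby buffers in userspace. The sketch below, assuming a Unix-like system, estimates that capacity by filling a pipe with non-blocking writes until the kernel refuses more; the chunk size is arbitrary and the result varies by operating system.

```ruby
reader, writer = IO.pipe

chunk = "x" * 1024
capacity = 0
begin
  # Nobody reads from the pipe, so every accepted byte stays in the kernel buffer.
  loop { capacity += writer.write_nonblock(chunk) }
rescue IO::WaitWritable
  # The pipe is full: a blocking write would now wait for the reader.
end

line_bytes = "50000\n".bytesize   # size of one line of the kind the sender emits
puts "pipe capacity: about #{capacity} bytes"
puts "roughly #{capacity / line_bytes} lines of #{line_bytes} bytes"

writer.close
reader.close
```

A blocking `$stdout.puts` in the sender simply waits at that point, which is how a slow reader slows down a fast writer without any extra code.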