Parallel testing world

by Shota Fukumori at RubyConf 2011, second day, from 14:10

Sorah Fukumori

October 02, 2011

Transcript

  1. Parallel Testing World Shota Fukumori (sora_h) at RubyConf 2011

  2. Hi • I’m from Japan.

  3. I’m Shota Fukumori • A Ruby committer.

  4. Also known as sora_h

  5. Committer ranking as of today

  6. Committer ranking as of today me

  7. Committer ranking as of today me The patch monster (also

    known as nobu)
  8. Committer ranking as of today me The patch monster (also

    known as nobu) Where is matz?
  9. Committer ranking as of today me The patch monster (also

    known as nobu) No matz.
  10. Twitter @sora_h

  11. Sarah Mei (not me)

  12. http://sorah.jp/

  13. DISCLAIMER • I’ll talk about parallel testing. • My English skill is unknown. • If you want to hear general news about Ruby 1.9.3, I strongly recommend moving to Room 2. • Room 2, 14:10: “Implementation of Ruby 1.9.3 and later” by ko1

  14. DISCLAIMER • Unless noted otherwise, “Ruby” in this presentation means “CRuby” (a.k.a. MRI).

  15. I misjudged my timing when I spoke at RubyKaigi • 15 minutes were left when I finished the talk

  16. I fell behind on preparing for today, too • I finished writing today’s presentation file last night (10 pm).

  17. Agenda • Why Parallel Testing • Multi-Thread or Multi-Process •

    test/unit parallelization • How it works • Performance
  19. Benefits • In most cases, tests run faster • Faster tests → faster development cycle • In TDD and BDD development • In Ruby, committers have to run `make test-all` before committing to the repository. • Faster tests make developers happy :-)

  20. Develop Test Push Deploy With slow test: Implemented!

  21. Develop Test Push Deploy With slow test: Umm Slow tests....

  22. Develop Test Push Deploy With slow test: --Break time--

  23. Develop Test Push Deploy With slow test: Ugh test failed...

  24. Develop Test Push Deploy With slow test: Fixed!

  25. Develop Test Push Deploy With slow test: Umm Slow tests....

  26. Develop Test Push Deploy With slow test: --Break time--

  27. lol

  28. :(

  29. Develop Test Push Deploy With fast test: Implemented!

  30. Develop Test Push Deploy With fast test: Um, some tests failed...

  31. Develop Test Push Deploy With fast test: Fixed!

  32. Develop Test Push Deploy With fast test: Tests passed, yay!

  33. Benefits • You can fix failures in a short time

  34. Benefits • Q. Shouldn’t you run only a few tests (such as those around the changed code) at a time? • A. Some changes cause failures somewhere else. In Ruby, running all tests for each commit is recommended.

  35. Example • Cookpad: http://cookpad.com/ • The most popular cooking recipe

    sharing site in Japan • 1.8.7 + Rails 2.3
  36. Example • https://github.com/grosser/parallel_tests • Parallel Testing for RSpec

  37. Example • Before (without parallel_tests): 16 minutes • After (with

    parallel_tests): 3.5 minutes
  38. Example • Cookpad wrote an in-house library • rake cookpad:spec:remote

  39. Example • rake cookpad:spec:remote • 4 servers: • Core i7

    (8 threads) + 16GB RAM + SSD • = 32 threads (!)
  40. Example • rake cookpad:spec:remote • Sends the local code to the remote servers with `rsync` • Runs the tests via ssh; the outputs are merged

  41. Example • parallel_tests (on a local machine): 3.5 minutes • cookpad:spec:remote: 50 seconds • 19.2x faster than the original 16 minutes

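
The speedups in this example can be checked with a few lines of Ruby. This is only an arithmetic sketch: the three timings come from the slides, while the helper name `speedup` is made up for illustration. Note that the 19.2x figure only comes out when the 50 seconds are compared with the 16 minute serial run.

```ruby
# Speedup = old elapsed time / new elapsed time (both in seconds).
def speedup(before_seconds, after_seconds)
  (before_seconds.to_f / after_seconds).round(2)
end

serial         = 16 * 60    # without parallel_tests: 16 minutes
parallel_tests = 3.5 * 60   # parallel_tests on a local machine: 3.5 minutes
remote         = 50         # rake cookpad:spec:remote: 50 seconds

puts speedup(serial, parallel_tests)  # serial run vs. parallel_tests
puts speedup(parallel_tests, remote)  # parallel_tests vs. remote run
puts speedup(serial, remote)          # serial run vs. remote run
```
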
  42. Example • This example is from: • http://www.slideshare.net/hotchpotch/ruby01 • Video: http://vimeo.com/22188723 • Sorry, it’s in Japanese...

  43. Benefits • Faster test, Faster development, Faster deployment.

  44. Benefits • So, how can we run tests faster?

  45. Benefits • Use a more powerful PC • Run tests in parallel

  47. Why parallel testing • Modern machines have multiple (cores|threads) • Using multiple (cores|threads) is the easiest way to run tests faster!

  48. Agenda • Why Parallel Testing • Multi-Thread or Multi-Process •

    test/unit parallelization • How it works • Performance
  49. Multi-(thread|process) • Currently, Ruby’s `Thread` doesn’t run in parallel. • So running multiple tests with the `Thread` class doesn’t make them faster.

  50. Multi-`Thread`? • Ruby can’t run multiple threads at once... (Diagram: Thread 1, Thread 2, and Thread 3 along a time axis; legend: running thread)

  51. Multi-Process? • Multiple processes are a good way to parallelize in Ruby. (Diagram: the process running on each of Core A and Core B along a time axis: Test 1, Test 2, Other 1, Other 2. Other: other process; Test: testing process)

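
The multi-process approach can be sketched in a few lines. This is an illustrative example, not the test/unit implementation: the helper `run_in_processes` is a made-up name, each lambda stands in for a test file, and it assumes a platform where `fork` is available (so not Windows).

```ruby
# Run every job in its own child process and collect the return values.
# Each child sends its result back through a pipe, serialized with Marshal.
def run_in_processes(jobs)
  children = jobs.map do |job|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.write(Marshal.dump(job.call))
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    writer.close
    [pid, reader]
  end

  children.map do |pid, reader|
    result = Marshal.load(reader.read)
    reader.close
    Process.wait(pid)
    result
  end
end

jobs = [
  -> { (1..100_000).reduce(:+) }, # CPU-bound stand-in for a test file
  -> { Process.pid }              # shows that the job ran in another process
]
sum, child_pid = run_in_processes(jobs)
puts sum
puts child_pid != Process.pid
```

Because every job runs in a separate process, the operating system is free to schedule the jobs on different cores at the same time.
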
  52. Agenda • Why Parallel Testing • Multi-Thread or Multi-Process •

    test/unit parallelization • How it works • Performance
  53. What’s test/unit • A unit testing library shipped with Ruby • Before 1.9: test/unit had no dependencies • Since 1.9: test/unit is a wrapper around minitest, kept for compatibility. • test/unit is used for Ruby’s own unit tests.

  54. test/unit parallelization • I wrote a patch that allows test/unit to run multiple tests at once • Because I wanted to make Ruby’s `make test-all` (the make target that tests Ruby) much faster! :p • Fast is important™

  55. How it works • I’ll use the following words to describe how it works: • Master: the process that is started first. It sends instructions to the workers. • Example: the ruby process started by `make test-all` • Worker: a process started by the master process. It runs tests.

  56. How it works 1. Start the master process (for example, with `make test-all`) 2. The master process starts worker processes 3. The master process sends test file names to the worker processes 4. Each worker process reads the file whose name the master passed to it

  57. How it works 5. The worker process runs the tests. 6. The worker process returns the test result to the master process. 7. The master process sends another test file name to the worker process 8. Repeat from step 4.

  58. How it works • This feature uses stdin/stdout for communication between the worker processes and the master process. • It parallelizes per test file. • So you have to split your tests and TestCases into separate files to benefit from it.

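
The master/worker loop described in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual test/unit code: the command words `run`, `done`, `quit`, and `bye`, and the names `worker_source` and `run_parallel`, are hypothetical. The worker here only pretends to run a file, and file names must not contain spaces. Only `IO.popen` and the use of stdin/stdout come from the slides.

```ruby
require "rbconfig"

# Hypothetical worker program: reads one command per line on stdin and
# answers on stdout. A real worker would load the file and run its tests.
def worker_source
  <<~'RUBY'
    $stdout.sync = true
    while (line = $stdin.gets)
      command, file = line.split
      break if command == "quit"
      puts "done #{file} pid=#{Process.pid}"
    end
    puts "bye"
  RUBY
end

def run_parallel(files, worker_count)
  queue = files.dup
  results = {}
  workers = Array.new(worker_count) do
    io = IO.popen([RbConfig.ruby, "-e", worker_source], "r+")
    io.sync = true
    io
  end

  # Hand one file to each worker, then refill whichever worker answers first.
  busy = workers.take(queue.size)
  busy.each { |io| io.puts "run #{queue.shift}" }
  until busy.empty?
    ready, = IO.select(busy)
    ready.each do |io|
      line = io.gets or raise "worker exited unexpectedly"
      _done, file, detail = line.split
      results[file] = detail
      if queue.empty?
        busy.delete(io)
      else
        io.puts "run #{queue.shift}"
      end
    end
  end

  # No files remain: tell every worker to quit and wait for its goodbye.
  workers.each do |io|
    io.puts "quit"
    io.gets
    io.close
  end
  results
end

results = run_parallel(%w[test_a.rb test_b.rb test_c.rb], 2)
results.each { |file, detail| puts "#{file}: #{detail}" }
```

Handing out one file at a time, instead of splitting the list up front, keeps all workers busy even when some test files take much longer than others.
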
  59. How it works Master Worker Worker Test files test_a.rb test_b.rb

    test_c.rb
  60. How it works • User starts master process Master Test

    files test_a.rb test_b.rb test_c.rb
  61. How it works • Master process starts worker processes using

    IO.popen Master Worker Worker Test files test_a.rb test_b.rb test_c.rb IO.popen
  62. How it works • Master process sends file names of

    tests to worker process Master Worker Worker Test files test_a.rb test_b.rb test_c.rb test_a.rb test_b.rb
  63. How it works • Worker process reads test file from

    file system. Master Worker Worker Test files test_a.rb test_b.rb test_c.rb
  64. How it works • Worker process runs test Master Worker

    Worker Test files test_c.rb result result
  65. How it works • Worker process sends result to Master

    process. master saves results from workers. Master Worker Worker Test files test_c.rb result result
  66. How it works • Master sends remaining test file name.

    Master Worker Worker Test files test_c.rb result result test_c.rb
  67. How it works • Worker process runs next test Master

    Worker Worker Test files test_c.rb result result
  68. How it works • If there are no remaining test

    files, send quit instruction to worker Master Worker Worker Test files result result result Quit
  69. How it works • If there are no remaining test

    files, send quit instruction to worker Master Worker Worker Test files result result result Bye
  70. How it works • If there are no remaining test

    files, send quit instruction to worker Master Test files result result result
  71. How it works • If any result contains at least one failure, the master retries by default Master result result result I have a failure!

  72. How it works • If any result contains at least one failure, the master process reruns that test file by default Master result result test_b.rb Retry

  73. How it works • If any result contains at least one failure, the master process reruns that test file by default Master result result result There are no failures.

  74. How it works • This is needed because some tests fail only because of parallelization. Master result result result There are no failures.

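
The retry step can be sketched like this. It is a simplified illustration: `retry_failures` is a made-up name, results are reduced to a failure count per file, and the block stands in for running one file serially inside the master process.

```ruby
# results maps a test file name to its failure count from the parallel run.
# Files that reported failures are run again through the block; the others
# keep their original result.
def retry_failures(results)
  results.map do |file, failures|
    [file, failures.zero? ? failures : yield(file)]
  end.to_h
end

parallel_results = { "test_a.rb" => 0, "test_b.rb" => 2, "test_c.rb" => 0 }
retried = []
final = retry_failures(parallel_results) do |file|
  retried << file
  0 # assumption: the failure was caused only by running in parallel
end

p retried
p final
```
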
  75. How it works • Then the master merges the results and shows them to the user Master result result result User via STDOUT

  76. That is • the whole mechanism of this feature. • It works well, but...

  77. In Ruby • Some tests weren’t written with parallelization in mind • That was predictable (I had guessed as much) • The issues are: • Network-related tests use the same port • Tests use the same directory

  78. Real World Issues • test/ruby/test_signal.rb • test/ruby/test_process.rb • test/net/test_{http,https}.rb •

    test/csv/* • tests of rubygems
  79. test_signal.rb • Tests Ruby’s signal handling. • This test calls Kernel#fork to check whether fork is implemented. • test/unit’s at_exit handler was called in every forked process. • I fixed test/unit so that this test can run. • Patch description: add a flag to test/unit, and stop in at_exit if the flag is true.

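
The underlying problem is that a forked child inherits the `at_exit` handlers of its parent. The sketch below demonstrates this and shows one possible guard. Note that it compares process IDs instead of using a flag as the patch description says, so it illustrates the idea rather than the actual patch. `at_exit_messages` is a made-up helper, and the example assumes `fork` is available.

```ruby
# Returns the messages written by at_exit handlers when a process that
# registered a handler forks once, the way test_signal.rb calls fork.
def at_exit_messages(guarded)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.sync = true
    owner = Process.pid
    at_exit do
      # The guard: do nothing in processes forked from the owner.
      next if guarded && Process.pid != owner
      writer.puts(Process.pid == owner ? "owner" : "forked child")
    end
    Process.wait(fork {}) # the forked child inherits the handler above
  end
  writer.close
  Process.wait(pid)
  messages = reader.read.split("\n")
  reader.close
  messages
end

p at_exit_messages(false) # handler also runs in the forked child
p at_exit_messages(true)  # handler runs only in the owner
```
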
  80. test_process.rb • This test modifies STDIN/OUT. • The parallel feature itself uses STDIN/OUT, so I changed the worker process to duplicate STDIN/OUT before running tests.

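
Duplicating a stream before the tests run can be sketched as follows. In this illustration a pipe stands in for the worker's STDOUT so that the effect can be observed; `protect_channel` and the message text are made-up names, not part of test/unit.

```ruby
# Duplicate the stream before any test runs. The copy keeps pointing at the
# original destination even if a test later reopens the stream.
def protect_channel(io)
  channel = io.dup
  channel.sync = true
  channel
end

reader, writer = IO.pipe            # stands in for the worker's STDOUT
channel = protect_channel(writer)

writer.reopen(File::NULL, "w")      # a test redirects "STDOUT" elsewhere
writer.puts "output produced by the test"
writer.close

channel.puts "done test_process.rb" # still reaches the master
channel.close
received = reader.read
reader.close
puts received
```
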
  81. test_{http,https}.rb • These tests used the same port number. • I fixed the port numbers, but Yui (nurse) committed a fix before I could commit mine.

  82. test/csv/* • The tests in this directory used the same temporary directory name • All test cases used the same directory name, and they deleted the temporary directory in the `teardown` phase. • So...

  83. test/csv/* Time test_csv_b.rb test_csv_a.rb Running Tests Test finished. Deleting the

    dir. Still Running...
  84. test/csv/* Time test_csv_b.rb test_csv_a.rb Running Tests Deleted! “No such file or directory”

  85. test/csv/* • Renamed the temporary directories so that each test in test/csv/* uses a unique name.

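
One way to get a unique directory per test is `Dir.mktmpdir` from Ruby's standard library. The slides only say that the directories were renamed to be unique, so this is a swapped-in technique shown for illustration; `setup_fixture` and `teardown_fixture` are hypothetical stand-ins for a TestCase's `setup` and `teardown`.

```ruby
require "tmpdir"
require "fileutils"

# Stand-in for `setup`: every call creates a directory with a unique name.
def setup_fixture
  dir = Dir.mktmpdir("csv_test")
  File.write(File.join(dir, "data.csv"), "a,b\n1,2\n")
  dir
end

# Stand-in for `teardown`: removes only the directory owned by this test.
def teardown_fixture(dir)
  FileUtils.remove_entry(dir)
end

dir_a = setup_fixture # "test_csv_a.rb"
dir_b = setup_fixture # "test_csv_b.rb", running at the same time

teardown_fixture(dir_a) # test_csv_a.rb finishes first
puts File.exist?(File.join(dir_b, "data.csv")) # test_csv_b.rb still has its file
teardown_fixture(dir_b)
```
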
  86. test/rubygems/* • Recently they have been forgetting to write the tests’ dependencies. • That causes problems in parallel testing. • Because... OK, let me show an example. Note: a “dependency” is a file that should be required.

  87. test/rubygems/* • Suppose there are test_a.rb, test_b.rb, and foo.rb. • Both test_a.rb and test_b.rb depend on foo.rb. • test_a.rb includes `require 'foo'`, • but test_b.rb doesn’t.

  88. test/rubygems/* • Without parallel testing: (Diagram: test_a.rb runs `require 'foo'`, then test_b.rb runs after it; foo.rb is already required, so there are no errors.)

  89. test/rubygems/* • With parallel testing: (Diagram: test_a.rb and test_b.rb run at the same time; test_a.rb runs `require 'foo'`, but for test_b.rb: Ugh, foo.rb is not required... An exception occurred...)

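
The missing-require problem can be reproduced with three tiny files. The sketch below writes foo.rb, test_a.rb, and test_b.rb into a temporary directory and loads them in fresh Ruby processes. The file contents and the helper names `load_in_one_process` and `missing_require_demo` are made up for illustration.

```ruby
require "rbconfig"
require "tmpdir"

# Load the given files, in order, inside one fresh Ruby process.
# Returns true when that process exits successfully.
def load_in_one_process(dir, *files)
  script = files.map { |file| "load #{File.join(dir, file).dump}" }.join("; ")
  system(RbConfig.ruby, "-I", dir, "-e", script, err: File::NULL)
end

def missing_require_demo
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "foo.rb"), "module Foo; end\n")
    File.write(File.join(dir, "test_a.rb"), "require 'foo'\nFoo\n")
    File.write(File.join(dir, "test_b.rb"), "Foo\n") # forgot require 'foo'
    {
      serial: load_in_one_process(dir, "test_a.rb", "test_b.rb"),
      separate_worker: load_in_one_process(dir, "test_b.rb")
    }
  end
end

p missing_require_demo
```

In the serial case test_b.rb silently relies on the `require` performed by test_a.rb. When it runs alone in its own worker process, the constant is undefined and the load fails.
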
  90. test/rubygems/* • I fix these whenever I find them. • This can happen in any test suite, but I have found it only in the rubygems tests. • I don’t know why. :p

  91. test/rubygems/* @rubygems_developers: please merge r33232 (in ruby-repo) https://github.com/rubygems/rubygems/issues/180 or http://bit.ly/rubyconf2011_rubygems_issue

  92. Agenda • Why Parallel Testing • Multi-Thread or Multi-Process •

    test/unit parallelization • How it works • Performance
  93. Performance • Kenta Murata (a.k.a. @mrkn) measured the performance and created the graphs for me. • He’ll talk in the last session in Room 2 about Ruby’s number system. • Thanks!

  94. Performance • Measured in the following environment: • OS: Mac OS X 10.6.6 • CPU: Intel Core i7 2.66 GHz (2 cores, 4 threads) • Memory: 8 GB 1067 MHz DDR3 • The test files are from Ruby’s repository

  95. Performance • (Chart: elapsed time in seconds, axis from 0 to 130.0, by number of workers: single, 1, 2, 3, 5, 8, 13)

  96. Performance • Without parallelization (“no -j”): 121.49 seconds • With parallelization, 5 workers: 43.41 seconds

  97. Performance • Without parallelization (“no -j”): 121.49 seconds • With parallelization, 5 workers: 43.41 seconds • 2.79x faster!

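
The speedup figure is the ratio of the two measurements. A quick check in Ruby (the helper name `speedup_ratio` is made up; the two timings are from the slide) gives a ratio of about 2.799, which the slide shows truncated to 2.79.

```ruby
# Speedup = elapsed time without -j / elapsed time with -j.
def speedup_ratio(serial_seconds, parallel_seconds)
  serial_seconds / parallel_seconds
end

serial_seconds   = 121.49 # without parallelization ("no -j")
parallel_seconds = 43.41  # with 5 workers

ratio = speedup_ratio(serial_seconds, parallel_seconds)
saved = serial_seconds - parallel_seconds
puts format("ratio: %.3f, seconds saved: %.2f", ratio, saved)
```
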
  98. How to use • Split your TestCases into multiple files. • Write a script that runs multiple test files using Test::Unit::AutoRunner. • A good example is test/runner.rb in Ruby’s repository. • Run the script with the argument -j N. • Running the script with --help gives more information.

  99. The patch made me • A committer! • I was

    very happy then.
  100. That’s all. Thank you!

  101. Announcement • Ruby 1.9.3 RC1 has been released! • You can use parallelization by installing it! • Try it and tell us if you find a bug. • More information: http://www.ruby-lang.org/

  102. Any Questions? If I can’t answer your question because of my English skill, please tweet to @sora_h or mail me: sorah@tubusu.net.