Parallel testing world

Parallel Testing World Shota Fukumori (sora_h) at RubyConf 2011

Hi • I’m from Japan.

I’m Shota Fukumori, a Ruby committer.

Also known as sora_h

Committer ranking as of today • me • The patch monster (also known as nobu) • Where is matz? • No matz.

Twitter @sora_h

Sarah Mei (not me)

http://sorah.jp/

DISCLAIMER • I’ll talk about parallel testing. • My English skill is unknown. • If you want to hear general news about Ruby 1.9.3, I strongly recommend moving to Room 2. • Room 2 14:10: “Implementation of Ruby 1.9.3 and later” by ko1

DISCLAIMER • Unless noted otherwise, “Ruby” in this presentation means “CRuby” (a.k.a. MRI).

I misjudged my talk schedule at RubyKaigi • 15 minutes were left when I finished the talk

I missed my preparation schedule for today • I finished writing today’s presentation file last night (10pm).

Agenda • Why Parallel Testing • Multi-Thread or Multi-Process •
test/unit parallelization • How it works • Performance

Benefits • In most cases, tests run faster • Faster tests → faster development cycle • Especially in TDD and BDD • In Ruby, committers have to run `make test-all` before committing to the repository. • Faster tests make developers happy :-)

Develop Test Push Deploy With slow test: Implemented!

Develop Test Push Deploy With slow test: Umm Slow tests....

Develop Test Push Deploy With slow test: --Break time--

Develop Test Push Deploy With slow test: Ugh test failed...

Develop Test Push Deploy With slow test: Fixed!

Develop Test Push Deploy With slow test: Umm Slow tests....

Develop Test Push Deploy With slow test: --Break time--

Develop Test Push Deploy With fast test: Implemented!

Develop Test Push Deploy With fast test: Um, some test
failed...

Develop Test Push Deploy With fast test: Fixed!

Develop Test Push Deploy With fast test: Tests passed, yay!

Benefits • You can fix failures in a short time

Benefits • Q. Shouldn’t you run only a few tests (e.g. around the changed code) at a time? • A. A change can cause failures somewhere else. In Ruby, running all tests for each commit is recommended.

Example • Cookpad: http://cookpad.com/ • The most popular cooking recipe
sharing site in Japan • 1.8.7 + Rails 2.3

Example • https://github.com/grosser/parallel_tests • Parallel Testing for RSpec

Example • Before (without parallel_tests): 16 minutes • After (with
parallel_tests): 3.5 minutes

Example • Cookpad wrote an in-house library • rake cookpad:spec:remote

Example • rake cookpad:spec:remote • 4 servers: • Core i7
(8 threads) + 16GB RAM + SSD • = 32 threads (!)

Example • rake cookpad:spec:remote • Send local code to the remote servers with `rsync` • Run the tests via ssh; the outputs are merged

Example • parallel_tests (on the local machine): 3.5 minutes • cookpad:spec:remote: 50 seconds • 19.2x faster than the original 16 minutes (960 s / 50 s)

Example • This example is from: • http://www.slideshare.net/hotchpotch/ruby01 • Video: http://vimeo.com/22188723 • Sorry, it’s in Japanese...

Beneﬁts • Faster test, Faster development, Faster deployment.

Benefits • So, how can we run tests faster?

Benefits • Use a more powerful PC • Run tests in parallel

Why parallel testing • Modern machines have multiple (cores|threads) • Using multiple (cores|threads) is the easiest way to run tests faster!

Multi-(thread|process) • Currently, Ruby’s `Thread`s don’t run in parallel. • So running multiple tests with Ruby’s `Thread` class doesn’t help.

Multi-`Thread`? • Ruby’s Thread can’t run multiple threads at once... [Figure: timeline of Thread 1, Thread 2, Thread 3; the running thread is marked, one at a time]

Multi-Process? • Multiple processes are a good way to parallelize in Ruby. [Figure: timeline of Core A and Core B, showing the process running on each core; Test 1 and Test 2 are testing processes, Other 1 and Other 2 are other processes]
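
A minimal sketch of the difference, assuming a Unix-like system where `fork` is available. `cpu_work` is a hypothetical stand-in for "running one test file"; the method names are made up for illustration and are not part of test/unit.

```ruby
require "benchmark"

# Hypothetical CPU-bound stand-in for running one test file.
def cpu_work(n)
  (1..n).reduce(0) { |sum, i| sum + i * i }
end

# One Thread per job. On CRuby the threads take turns holding the
# interpreter lock, so CPU-bound work does not overlap.
def run_in_threads(jobs)
  jobs.map { |n| Thread.new { cpu_work(n) } }.map(&:value)
end

# One forked process per job; each result comes back through a pipe.
# Separate processes can be scheduled on separate cores.
def run_in_processes(jobs)
  children = jobs.map do |n|
    reader, writer = IO.pipe
    reader.binmode
    writer.binmode
    pid = fork do
      reader.close
      writer.write(Marshal.dump(cpu_work(n)))
      writer.close
      exit!(0) # leave without running at_exit hooks inherited from the parent
    end
    writer.close
    [pid, reader]
  end
  children.map do |pid, reader|
    result = Marshal.load(reader.read)
    reader.close
    Process.wait(pid)
    result
  end
end

if __FILE__ == $PROGRAM_NAME
  jobs = [300_000] * 4
  puts format("threads:   %.2fs", Benchmark.realtime { run_in_threads(jobs) })
  puts format("processes: %.2fs", Benchmark.realtime { run_in_processes(jobs) })
end
```

Both methods return the same results; only the elapsed time should differ, and by how much depends on the machine.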

What’s test/unit • The unit testing library shipped with Ruby • Before 1.9: test/unit has no dependencies • Since 1.9: test/unit is a wrapper around minitest, kept for compatibility. • test/unit is used for Ruby’s own unit tests.

test/unit parallelization • I wrote a patch that allows test/unit to run multiple tests at once • Because I wanted to make Ruby’s `make test-all` (the make target that tests Ruby) much faster! :p • Fast is important™

How it works • I’ll use the following words to describe how it works: • Master: the process that is started first. This process sends instructions to workers. • Example: the ruby process started by `make test-all` • Worker: a process started by the master process. This process runs tests.

How it works 1. Start the master process (e.g. with `make test-all`) 2. The master process starts worker processes 3. The master process sends test file names to the worker processes 4. Each worker process reads the file named by the master process

How it works 5. The worker process runs the test. 6. The worker process returns the test result to the master process. 7. The master process sends another test file name to the worker process. 8. Repeat from step 4.

How it works • This feature uses stdin/stdout for communication between the worker processes and the master process. • It parallelizes per test file. • So you have to split your tests and TestCases into multiple files to benefit from it.
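
The master/worker flow can be sketched roughly as below. This is not the actual test/unit code: the line protocol (`run`, `quit`, `ok`, `fail`, `bye`), the constant `WORKER_SCRIPT`, and the method `run_parallel` are all invented for illustration, and "running a test" is reduced to loading a Ruby file and checking whether it raises.

```ruby
require "rbconfig"
require "tmpdir"

# Worker (hypothetical protocol): read "run <path>" from stdin, load the
# file, answer "ok <path>" or "fail <path>" on stdout; "quit" ends it.
WORKER_SCRIPT = <<~'RUBY'
  $stdout.sync = true
  while (line = $stdin.gets)
    command, path = line.chomp.split(" ", 2)
    case command
    when "run"
      begin
        load path
        puts "ok #{path}"
      rescue Exception
        puts "fail #{path}"
      end
    when "quit"
      puts "bye"
      break
    end
  end
RUBY

# Master: start workers with IO.popen, hand out one file at a time,
# collect the answers, and send "quit" when no files remain.
def run_parallel(files, worker_count)
  queue = files.dup
  results = {}
  workers = Array.new(worker_count) do
    IO.popen([RbConfig.ruby, "-e", WORKER_SCRIPT], "r+")
  end

  dispatch = lambda do |io|
    file = queue.shift
    io.puts(file ? "run #{file}" : "quit")
    io.flush
  end

  workers.each { |io| dispatch.call(io) }
  busy = workers.dup
  until busy.empty?
    ready, = IO.select(busy)
    ready.each do |io|
      status, path = io.gets.to_s.chomp.split(" ", 2)
      if %w[ok fail].include?(status)
        results[path] = status
        dispatch.call(io)
      else # "bye", or EOF if the worker died
        io.close
        busy.delete(io)
      end
    end
  end
  results
end

if __FILE__ == $PROGRAM_NAME
  Dir.mktmpdir do |dir|
    files = %w[test_a.rb test_b.rb test_c.rb].map do |name|
      path = File.join(dir, name)
      File.write(path, name == "test_b.rb" ? "raise 'boom'\n" : "1 + 1\n")
      path
    end
    p run_parallel(files, 2)
  end
end
```

Because the protocol shares stdout with the loaded file, a test that prints to or redirects stdout would corrupt it; the sketch ignores that problem.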

How it works Master Worker Worker Test ﬁles test_a.rb test_b.rb
test_c.rb

How it works • User starts master process Master Test
ﬁles test_a.rb test_b.rb test_c.rb

How it works • Master process starts worker processes using
IO.popen Master Worker Worker Test ﬁles test_a.rb test_b.rb test_c.rb IO.popen

How it works • Master process sends ﬁle names of
tests to worker process Master Worker Worker Test ﬁles test_a.rb test_b.rb test_c.rb test_a.rb test_b.rb

How it works • Worker process reads test file from
file system. Master Worker Worker Test files test_a.rb test_b.rb test_c.rb

How it works • Worker process runs test Master Worker
Worker Test ﬁles test_c.rb result result

How it works • Worker process sends result to Master
process. master saves results from workers. Master Worker Worker Test ﬁles test_c.rb result result

How it works • Master sends remaining test ﬁle name.
Master Worker Worker Test ﬁles test_c.rb result result test_c.rb

How it works • Worker process runs next test Master
Worker Worker Test ﬁles test_c.rb result result

How it works • If there are no remaining test files, send quit instruction to worker Master Worker Worker Test files result result result Quit

How it works • If there are no remaining test files, send quit instruction to worker Master Worker Worker Test files result result result Bye

How it works • If there are no remaining test files, send quit instruction to worker Master Test files result result result

How it works • If any result contains one or more failures, the master process retries it by default Master result result result I have a failure!

How it works • If any result contains one or more failures, the master process retries it by default Master result result test_b.rb Retry

How it works • If any result contains one or more failures, the master process retries it by default Master result result result There are no failures.

How it works • This is because some tests fail only because of parallelization. Master result result result There are no failures.
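
A rough sketch of the retry idea, not the real implementation: `run_with_retry` and the `run_file` callable are hypothetical names. Files that failed during the parallel run are executed once more, one after another, and the second result replaces the first.

```ruby
# parallel_results: { "test_a.rb" => true, "test_b.rb" => false, ... }
# run_file: any callable that runs one file and returns true on success.
def run_with_retry(parallel_results, run_file)
  failed = parallel_results.reject { |_file, passed| passed }.keys
  # Retry serially, so failures caused only by parallelism disappear.
  retried = failed.to_h { |file| [file, run_file.call(file)] }
  parallel_results.merge(retried)
end
```

A file that still fails on the serial retry is reported as a real failure.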

How it works • Then the master merges the results and shows them to the user via STDOUT Master result result result User

That is • the whole mechanism of this feature. • It works well, but...

In Ruby • Some tests weren’t written with parallelization in mind • That was predictable (I expected it) • The issues are: • Network tests use the same port • Tests use the same directory

Real World Issues • test/ruby/test_signal.rb • test/ruby/test_process.rb • test/net/test_{http,https}.rb •
test/csv/* • tests of rubygems

test_signal.rb • Tests Ruby’s signal handling. • This test calls Kernel#fork to check whether fork is implemented. • test/unit’s at_exit handler was called in every forked process. • I fixed test/unit so that this test can run. • Patch description: add a flag to test/unit and stop in at_exit if the flag is true.
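
The at_exit problem can be reproduced in a small sketch. The patch itself uses a flag inside test/unit; the guard below uses a PID check instead, which is a different but similar technique chosen only to keep the example short. `AT_EXIT_SCRIPT` and `hook_runs` are made-up names, and `fork` must be available.

```ruby
require "rbconfig"

# Script run in a fresh Ruby process. It registers an at_exit hook
# (standing in for "run the tests at exit") and then forks once.
AT_EXIT_SCRIPT = <<~'RUBY'
  $stdout.sync = true
  owner_pid = Process.pid
  guarded = ARGV.first == "guarded"
  at_exit do
    # With the guard, only the process that registered the hook runs it.
    next if guarded && Process.pid != owner_pid
    puts "running tests"
  end
  Process.wait(fork { }) # the forked child inherits the at_exit hook
RUBY

# Count how many times the hook fired across parent and child.
def hook_runs(mode)
  IO.popen([RbConfig.ruby, "-e", AT_EXIT_SCRIPT, mode], &:read).lines.size
end

if __FILE__ == $PROGRAM_NAME
  puts "unguarded: #{hook_runs('unguarded')} run(s)"
  puts "guarded:   #{hook_runs('guarded')} run(s)"
end
```

Without the guard the hook fires in both the parent and the forked child; with it, only in the parent.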

test_process.rb • This test modifies STDIN/OUT. • The parallel feature uses STDIN/OUT too, so I changed the worker process to duplicate STDIN/OUT before running tests.
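
A small sketch of why duplicating the stream helps. A pipe stands in for the worker’s stdout so the result can be inspected; `report_through_duplicate` is a hypothetical name, and the message format is invented.

```ruby
# Returns what the "master" end of the pipe receives.
def report_through_duplicate
  reader, channel = IO.pipe

  # Keep a private duplicate for talking to the master before any test runs.
  to_master = channel.dup

  # A "test" that redirects the original stream somewhere else.
  channel.reopen(File::NULL, "w")
  channel.puts "noise written by the test"

  # The duplicate still points at the original destination.
  to_master.puts "done test_process.rb"
  [to_master, channel].each(&:close)
  reader.read
ensure
  reader&.close
end

puts report_through_duplicate if __FILE__ == $PROGRAM_NAME
```

Only the line written through the duplicate reaches the reader; the output written after the redirect goes to the null device.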

test_{http,https}.rb • These tests use the same port number. • I fixed the port number, but Yui (nurse) committed a fix before I committed mine.
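
The slides don’t say how the port clash was fixed. One common way to avoid it, shown here only as an assumption and not as the committed fix, is to bind to port 0 so the OS picks a free port for each test server. `start_test_server` is a hypothetical helper.

```ruby
require "socket"

# Bind to port 0: the OS assigns an unused port, so two test files
# running at the same time never ask for the same one.
def start_test_server
  server = TCPServer.new("127.0.0.1", 0)
  [server, server.addr[1]] # addr[1] is the port actually assigned
end

if __FILE__ == $PROGRAM_NAME
  server, port = start_test_server
  puts "listening on 127.0.0.1:#{port}"
  server.close
end
```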

test/csv/* • Tests in this directory use the same temporary directory name • All test cases use the same directory name and delete the temporary directory in the `teardown` phase. • So...

test/csv/* Time test_csv_b.rb test_csv_a.rb Running Tests Test ﬁnished. Deleting the
dir. Still Running...

test/csv/* Time test_csv_b.rb test_csv_a.rb Running Tests Deleted! “No such ﬁles
or directory”

test/csv/* • Renamed the temporary directories so that each test in test/csv/* uses a unique name.
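
The actual fix renamed the directories by hand. A sketch of the same idea using `Dir.mktmpdir` from the standard library, which creates a uniquely named directory and removes it when the block ends; `with_private_tmpdir` is a hypothetical helper name.

```ruby
require "tmpdir"

# Each caller gets its own directory, so one test's cleanup can never
# delete a directory that another running test is still using.
def with_private_tmpdir(test_name)
  Dir.mktmpdir("#{test_name}-") do |dir|
    yield dir
  end
end

if __FILE__ == $PROGRAM_NAME
  with_private_tmpdir("test_csv_a") { |dir| puts dir }
end
```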

test/rubygems/* • Recently they forgot to write the dependencies of some tests. • That causes problems under parallel testing. • Because... OK, let me show an example. Note: a “dependency” is a file that should be required.

test/rubygems/* • Suppose there are test_a.rb, test_b.rb, and foo.rb. • Both test_a.rb and test_b.rb depend on foo.rb. • test_a.rb includes `require 'foo'`, • but test_b.rb doesn’t.

test/rubygems/* • When not using parallel testing: Time test_a.rb test_b.rb
Running Test foo.rb is required. no errors. require ‘foo’

test/rubygems/* • When parallel testing: Time test_a.rb test_b.rb Running Tests require ‘foo’ Ugh, foo.rb is not required... Exception occurred...
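
The situation can be reproduced with three throwaway files. This is a sketch with hypothetical helper names (`loads_cleanly?`, `missing_require_demo`); loading a file in a fresh Ruby process stands in for a worker running that test file.

```ruby
require "rbconfig"
require "tmpdir"

# Load the given files in one fresh Ruby process; true if it exits cleanly.
def loads_cleanly?(dir, *files)
  script = files.map { |f| "load #{File.join(dir, f).inspect}" }.join("; ")
  system(RbConfig.ruby, "-I", dir, "-e", script, err: File::NULL)
end

def missing_require_demo
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "foo.rb"), "module Foo; end\n")
    File.write(File.join(dir, "test_a.rb"), "require 'foo'\nFoo\n")
    File.write(File.join(dir, "test_b.rb"), "Foo\n") # forgot require 'foo'
    {
      # Serial run: test_a.rb already required foo.rb for everyone.
      same_process: loads_cleanly?(dir, "test_a.rb", "test_b.rb"),
      # Parallel run: test_b.rb lands in a process that never saw foo.rb.
      separate_process: loads_cleanly?(dir, "test_b.rb")
    }
  end
end

p missing_require_demo if __FILE__ == $PROGRAM_NAME
```

The fix is simply to add the missing `require` to every test file that needs it.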

test/rubygems/* • I fix these whenever I find them. • This can happen in any test suite, but I’ve found it only in the rubygems tests. • I don’t know why. :p

test/rubygems/* @rubygems_developers: please merge r33232 (in ruby-repo) https://github.com/rubygems/rubygems/issues/180 or http://bit.ly/rubyconf2011_rubygems_issue

Performance • Kenta Murata a.k.a. @mrkn measured the performance and created graphs for me. • He’ll talk in the last session in Room 2 about Ruby’s number system. • Thanks!

Performance • Measured in the following environment: • OS: Mac OS X 10.6.6 • CPU: Intel Core i7 2.66 GHz • (2 cores, 4 threads) • Memory: 8GB 1067 MHz DDR3 • Test files are from Ruby’s repository

Performance [Chart: elapsed time in seconds (axis from 0 to 130.0) by number of workers: single, 1, 2, 3, 5, 8, 13]

Performance • Without parallelization (no `-j`): 121.49 seconds • Parallelization enabled with 5 workers: 43.41 seconds • 2.79x faster!
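
The speedup figure is just the ratio of the two measured times. A tiny check, with `speedup` as a hypothetical helper name:

```ruby
# Speedup = time without parallelization / time with parallelization.
def speedup(before_seconds, after_seconds)
  before_seconds / after_seconds.to_f
end

if __FILE__ == $PROGRAM_NAME
  ratio = speedup(121.49, 43.41)
  # The slide's 2.79 is the ratio cut off after two decimals.
  puts format("%.4f", ratio)
end
```

The ratio is a little under 2.80, so the slide’s 2.79x is the value truncated rather than rounded.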

How to use • Split TestCases into multiple files. • Write a script that runs multiple test files using Test::Unit::AutoRunner. • A good example is test/runner.rb in Ruby’s repository. • Run the script with the argument `-j N`. • Running the script with `--help` shows more information.

The patch made me • A committer! • I was
very happy then.

That’s all. Thank you!

Announcement • Ruby 1.9.3 RC1 has been released! • You can use parallelization by installing it! • Try it and tell us if you find a bug. • More info: http://www.ruby-lang.org/

Any Questions? If I can’t answer your question because of my English skill, please tweet to @sora_h or mail me: [email protected].
