remore
December 02, 2016

# How I made a pure-Ruby word2vec program more than 3x faster

Slides for my talk at RubyConf Taiwan 2016

## Transcript

1. ### How I made a pure-Ruby word2vec program more than 3x faster

RubyConf Taiwan 2016 @remore
2. ### Who Am I

Kei Sawada (@remore). A Rubyist from Tokyo. A weekend contrabassist. Engineering Manager at Recruit Holdings Co.,Ltd., VP of Engineering at NIJIBOX Co.,Ltd.
3. ### Me And Taiwan

A Taiwanese coworker who is working at NIJIBOX. Many #rubyfriends in Taiwan: Eddie, Ryudo, Chao, Yu-Cheng, lulalala, Lin Yu Hsiang and many others. Super glad to be here today!
4. ### This Talk Is Mainly For The Rubyist

Who is interested in Ruby's performance: micro-benchmarking results, YARV, ISeq and profiling tools. Who may be interested in RPC (IPC) with Python and Julia from Ruby.

6. ### Chapter 1 A Reality

A reality of Ruby's performance for large-scale computation

8. ### > echo "x=2.5; 1.upto(10){|i| x=x+i}; p x" | time ruby

57.5
0.13 real 0.06 user 0.05 sys
[Chart: elapsed time by number of loops, up to 10; Ruby]
9. ### > echo "x=2.5; 1.upto(100){|i| x=x+i}; p x" | time ruby

5052.5
0.11 real 0.06 user 0.04 sys
[Chart: elapsed time by number of loops, up to 100; Ruby]
10. ### > echo "x=2.5; 1.upto(1000){|i| x=x+i}; p x" | time ruby

500502.5
0.15 real 0.07 user 0.05 sys
[Chart: elapsed time by number of loops, up to 1000; Ruby]
11. ### > echo "x=2.5; 1.upto(10000){|i| x=x+i}; p x" | time ruby

50005002.5
0.11 real 0.06 user 0.04 sys
[Chart: elapsed time by number of loops, up to 10000; Ruby]
12. ### > echo "x=2.5; 1.upto(1e5){|i| x=x+i}; p x" | time ruby

5000050002.5
0.14 real 0.08 user 0.05 sys
[Chart: elapsed time by number of loops, up to 1e5; Ruby]

14. ### > echo "x=2.5; 1.upto(1e6){|i| x=x+i}; p x" | time ruby

500000500002.5
0.25 real 0.20 user 0.04 sys
[Chart: elapsed time by number of loops, up to 1e6; Ruby]
15. ### > echo "x=2.5; 1.upto(1e7){|i| x=x+i}; p x" | time ruby

50000005000002.5
1.58 real 1.52 user 0.05 sys
[Chart: elapsed time by number of loops, up to 1e7; Ruby]
16. ### > echo "x=2.5; 1.upto(1e8){|i| x=x+i}; p x" | time ruby

5.000000050000003e+15
14.56 real 14.37 user 0.09 sys
[Chart: elapsed time by number of loops, up to 1e8; Ruby]
17. ### > echo "x=2.5; 1.upto(1e9){|i| x=x+i}; p x" | time ruby

5.00000000067109e+17
157.27 real 150.16 user 1.30 sys
[Chart: elapsed time by number of loops, up to 1e9; Ruby]
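The measurements above go through `time ruby`, so every figure includes interpreter start-up, which seems to account for most of the roughly 0.1 sec seen at small loop counts. A minimal sketch for timing only the loop itself with the standard `benchmark` library follows; the method name `sample_loop` is borrowed from the later slides, the chosen sizes are an assumption to keep the run short, and nothing here reproduces the numbers on the slides.

```ruby
require "benchmark"

# Same loop as on the slides: add 1..n to a Float accumulator.
def sample_loop(n, x = 2.5)
  1.upto(n) { |i| x += i }
  x
end

# Time the loop in-process so interpreter start-up is excluded.
# Sizes beyond 1e6 are left out here (an assumption) to keep the run short.
[10, 1_000, 100_000, 1_000_000].each do |n|
  result = nil
  seconds = Benchmark.realtime { result = sample_loop(n) }
  puts format("n=%-9d result=%-16s %.4f sec", n, result, seconds)
end
```

Larger sizes such as `10**8` can be appended to the array to approach the range where the slides show Ruby slowing down.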

loops
19. ### How About Python?

> PY=\$(cat << EOS "n=2.5 for i in range(1,int(\\$N)+1): n=i+n; print(n)" EOS )
> N=1e3 && eval echo "\$PY" | time python
500502.5
0.10 real 0.01 user 0.01 sys
20. ### > N=1e5 && eval echo "\$PY" | time python

5000050002.5
0.13 real 0.03 user 0.01 sys
[Chart: elapsed time by number of loops, up to 1e5; Ruby, Python]

of loops!
22. ### > N=1e6 && eval echo "\$PY" | time python

5.00000500002e+11
0.38 real 0.23 user 0.02 sys
[Chart: elapsed time by number of loops, up to 1e6; Ruby, Python]
23. ### > N=1e7 && eval echo "\$PY" | time python

5.0000005e+13
2.66 real 2.35 user 0.17 sys
[Chart: elapsed time by number of loops, up to 1e7; Ruby, Python]
24. ### > N=1e8 && eval echo "\$PY" | time python

5.00000005e+15
48.27 real 25.87 user 10.67 sys
[Chart: elapsed time by number of loops, up to 1e8; Ruby, Python]
25. ### > N=1e9 && eval echo "\$PY" | time python

5.00000005e+15
48.27 real 25.87 user 10.67 sys
[Chart: elapsed time by number of loops, up to 1e9; Ruby, Python]
26. ### > N=1e9 && eval echo "\$PY" | time python

5.00000005e+15
48.27 real 25.87 user 10.67 sys
[Chart: elapsed time by number of loops, up to 1e9; Ruby, Python]

Attention Please: this micro-benchmark was run on my MacBook Pro (2015) with Ruby 2.3.0 and Python 2.7. In my environment Python looks pretty slow, but this is by no means a fair comparison. Please do not take these measurements seriously; use them only to get a feel for the order of magnitude of each programming environment's speed!

of loops
28. ### What About C?

> SRC=\$(cat << EOS "#include \"stdio.h\" int main(){ double n=2.5; for(int i=1;i<=\\$N;i++){ n=i+n; } printf(\"%lf\", n); }" EOS )
29. ### > N=1e5 && eval echo "\$SRC" > main.c; gcc main.c; time ./a.out

5000050002.500000
real 0m0.006s user 0m0.001s sys 0m0.002s
[Chart: elapsed time by number of loops, up to 1e5; Ruby, Python, C]

31. ### > N=1e6 && eval echo "\$SRC" > main.c; gcc main.c; time ./a.out

500000500002.500000
real 0m0.009s user 0m0.004s sys 0m0.002s
[Chart: elapsed time by number of loops, up to 1e6; Ruby, Python, C]
32. ### > N=1e7 && eval echo "\$SRC" > main.c; gcc main.c; time ./a.out

50000005000002.500000
real 0m0.033s user 0m0.029s sys 0m0.002s
[Chart: elapsed time by number of loops, up to 1e7; Ruby, Python, C]
33. ### > N=1e8 && eval echo "\$SRC" > main.c; gcc main.c; time ./a.out

5000000050000003.000000
real 0m0.287s user 0m0.281s sys 0m0.003s
[Chart: elapsed time by number of loops, up to 1e8; Ruby, Python, C]
34. ### > N=1e9 && eval echo "\$SRC" > main.c; gcc main.c; time ./a.out

500000000067108992.000000
real 0m2.815s user 0m2.799s sys 0m0.008s
[Chart: elapsed time by number of loops, up to 1e9; Ruby, Python, C]

36. ### Introducing Julia

Julia is a dynamic programming language. 4 years old since it was open sourced in 2012. Designed for scientific computing. Fast.
37. ### How About Julia?

> JL=\$(cat << EOS "function sample_loop(n) for i in 1:\\$N n = i+n end n end println(sample_loop(2.5))" EOS )
38. ### > N=1e5 && eval echo "\$JL" | time julia

sample_loop (generic function with 1 method)
5.0000500025e9
0.91 real 0.48 user 0.14 sys
[Chart: elapsed time by number of loops, up to 1e5; Ruby, Python, C, Julia]

41. ### > N=1e6 && eval echo "\$JL" | time julia

sample_loop (generic function with 1 method)
5.000005000025e11
0.45 real 0.44 user 0.08 sys
[Chart: elapsed time by number of loops, up to 1e6; Ruby, Python, C, Julia]
42. ### > N=1e7 && eval echo "\$JL" | time julia

sample_loop (generic function with 1 method)
5.00000050000025e13
0.50 real 0.47 user 0.09 sys
[Chart: elapsed time by number of loops, up to 1e7; Ruby, Python, C, Julia]
43. ### > N=1e8 && eval echo "\$JL" | time julia

sample_loop (generic function with 1 method)
5.000000050000003e15
1.82 real 0.76 user 0.09 sys
[Chart: elapsed time by number of loops, up to 1e8; Ruby, Python, C, Julia]
44. ### > N=1e9 && eval echo "\$JL" | time julia

sample_loop (generic function with 1 method)
5.00000000067109e17
1.71 real 1.70 user 0.08 sys
[Chart: elapsed time by number of loops, up to 1e9; Ruby, Python, C, Julia]

loops
46. ### Findings

Ruby works reasonably fast for a small number of loops, but for a huge number of loops it is advisable to consider switching languages. The primary option would be C. Julia is also a dynamic language, but it can be FAST.
47. ### Chapter 2 3x Challenge

An experiment to work around this performance issue using the Julia programming language along with a ruby2julia transpiler
48. ### Given That Performance Issue, Which Option Is The Best Workaround?

Give up and use other languages anyway? Make Ruby itself faster? Make a gem to boost my Ruby program?
49. ### Idea: Transpiler

What if we could run arbitrary Ruby code on a Julia process? It may look something like `some_ruby_code.to_other_lang`

code?
51. ### Sometimes it's hopeful

Ruby: for i in 1..N; n = i+n; end
Julia: for i in 1:N; n = i+n; end
In both languages the range operator creates a range object.
52. ### But Sometimes it's NOT

Ruby: class Sample; def context; return self; end; end; p Sample.new.instance_eval{context}
Julia: too many gaps to be filled, such as OOP things, methods for reflection etc...

promising
54. ### A ruby2julia Transpiler Implementation: Julializer

github.com/remore/julializer. Very limited syntax is supported as of v0.1.2. TrueClass, FalseClass, Fixnum, Float, integer, Numeric and Random are supported; Array, Range and Hash are also partially supported (only very few methods as of now). TBH it still needs huge improvements, including developing an error checking tool and writing documentation. "But I'll do it anyway" (originally in Japanese: でもやるんだよ)
55. ### Examples

\$ echo "-1.6.to_i" | julializer
trunc(Int64,parse(string((-1.6))));
\$ cat sample.rb
for i in 0..list.size-1 list[i] = (i-list.size/2).abs end
\$ julializer sample.rb
for i::Int64 = 0:size(list)[1]-1;list[i+1]=abs((i-size(list)[1]/2));;end;;
56. ### \$ ruby -r julializer -e "p Julializer.ruby2julia(File.read('calc.rb'))" "const max_exp=6;;const exp_table_size=1000;;const

max_sentence_length=1000;;function init_unigram_table(table_size, vocab);train_words_pow=0.0;;power=0.75;;table=fill(0, table_size);;for a::Int64 = 0:size(vocab)[1]-1;train_words_pow+=vocab[a+1] [0+1]^power;;end;;i=0;;d1=(vocab[i+1][0+1]^power)/train_words_pow;;for a::Int64 = 0:table_size-1;table[a+1]=i;if a/float(table_size)>d1;i+=1;;d1+=(vocab[i+1] [0+1]^power)/train_words_pow;;;end;if i>=size(vocab)[1];i=size(vocab)[1]-1;;end;;end;;return table;;;end;;function addop(size, list, base, target);for i::Int64 = 0:size-1;list[i+base+1]+=target[i+1];;end;;list;;end;;function addop2(size, list, base, coefficient, target, base2);for i::Int64 = 0:size-1;list[i+base +1]+=coefficient*target[i+base2+1];;end;;list;;end;;function addop3(size, f, coefficient, target, base);for i::Int64 = 0:size-1;f+=coefficient[i+1]*target[i+base +1];;end;;f;;end;;function addop4(size, list, target, base);for i::Int64 = 0:size-1;list[i+1]+=target[i+base+1];;end;;list;;end;;myrandom=0;;function next_random();global myrandom;myrandom=abs((myrandom*25214903917+11));;return myrandom;;;end;;function exptable(num);num=exp((num/ float(exp_table_size)*2-1)*max_exp);;num/(num+1);;end;;function bsearch_index(list, target);a=0;;z=size(list)[1]-1;;while (true);current_entry=list[a+1:z+1] [floor(Int64,((z-a)/2))+1];if current_entry<target;next_entry=list[a+1:z+1][floor(Int64,((z-a)/2+1))+1];;if (next_entry>=target)||z-a<=1;return round(Int64,(a+(z- a)/2+1));;;else;a=round(Int64,(a+(z-a)/2));;;end;;;;else;if a>=target||z-a<=1;return a;;end;;z=round(Int64,(z-(z-a)/2));;;end;;;end;;;end;;function calc_vec(iter, original_text, sample, train_words, debug_mode, __vocab_index_hash, vocab, syn0, syn1neg, negative, alpha, __cum_table, table_size, layer1_size, window);sentence_position=0;;sentence_length=0;;word_count=0;;word_count_actual=0;;last_word_count=0;;sen=[];;local_iter=iter;;neu1=[];;neu1e=[];;backup=copy( original_text);;__denominator=trunc(Int64,parse(string((exp_table_size/max_exp/ 
2))));;__sample_train_words=sample*train_words;;table_size=trunc(Int64,parse(string(1e8)));;table=init_unigram_table(table_size,vocab);;starting_alpha=alpha;; while true;if sentence_position%500==0&&debug_mode>1;print(@sprintf(\"%d %d / \",word_count,last_word_count));;end;if word_count- last_word_count>10000;word_count_actual+=word_count-last_word_count;;last_word_count=word_count;;if debug_mode>1;print(string(\"\\r Alpha: \",@sprintf(\"%f\",alpha),\" Progress: \",@sprintf(\"%.2f\",(word_count_actual/float((iter*train_words+1))*100)),\"%\"));;end;;alpha=starting_alpha*(1- word_count_actual/float((iter*train_words+1)));;if alpha<starting_alpha*0.0001;alpha=starting_alpha*0.0001;;end;;;end;if sentence_length==0;skipped=0;;sen=[];;___state = start(original_text);while !done(original_text, ___state);___i, ___state = next(original_text, ___state);e = ___i;if haskey(__vocab_index_hash, e);word=__vocab_index_hash[string(e)];;;else;skipped+=1;;continue;;;end;;;word_count+=1;;if word==0;break;;end;;if sample>0;ran=(sqrt(vocab[word+1][0+1]/__sample_train_words)+1)*__sample_train_words/vocab[word+1][0+1];;if ran<(next_random()&(0xFFFF+0))/ 65536.0;continue;;end;;;end;;push!(sen, word);sentence_length+=1;;if sentence_length>=max_sentence_length;break;;end;;;end;;if max_sentence_length +skipped<=length(original_text)-1;splice!(original_text, 0+1:0+0+max_sentence_length+skipped+1);;else;original_text=[];;;end;;;sentence_position=0;;;end;if size(original_text)[1]==0||word_count>train_words;word_count_actual+=word_count-last_word_count;;local_iter-=1;;if debug_mode>1;print(local_iter);;end;;if local_iter==0;break;;end;;word_count=0;;last_word_count=0;;sentence_length=0;;original_text=copy(backup);;sen=[];;continue;;;end;if sentence_position>=size(sen) [1];continue;;end;word=sen[sentence_position+1];neu1=fill(0.0, layer1_size);neu1e=fill(0.0, layer1_size);b=next_random()%window;cw=0;for j::Int64 = b:window*2- b;if j!=window;k=sentence_position-window+j;;if 
k<0||k>=sentence_length;continue;;end;;if k>=size(sen)[1];continue;;end;;last_word=sen[k +1];;neu1=addop4(layer1_size,neu1,syn0,last_word*layer1_size);;cw+=1;;;end;;end;if cw!=0;for j::Int64 = 0:layer1_size-1;neu1[j+1]/=cw;;end;;if negative>0;for j::Int64 = 0:negative;if j==0;target=word;;label=1;;;else;nr=next_random();;target=table[(nr>>16)%table_size+1];;if target==0;target=nr%(size(vocab) [1]-1)+1;;end;;if target==word;continue;;end;;label=0;;;end;;l2=target*layer1_size;f=0.0;f=addop3(layer1_size,f,neu1,syn1neg,l2);if f>max_exp;g=(label-1)*alpha;;;elseif f<(-max_exp);g=label*alpha;;;else;g=(label-exptable(trunc(Int64,parse(string(((f +max_exp)*__denominator))))))*alpha;;;end;;;neu1e=addop2(layer1_size,neu1e,0,g,syn1neg,l2);syn1neg=addop2(layer1_size,syn1neg,l2,g,neu1,0);;end;;;end;;for j::Int64 = b:window*2-b;if j!=window;c=sentence_position-window+j;;if c<0||c>=sentence_length;continue;;end;;if c>=size(sen)[1];continue;;end;;last_word=sen[c +1];;syn0=addop(layer1_size,syn0,last_word*layer1_size,neu1e);;;end;;end;;;end;sentence_position+=1;if sentence_position>=sentence_length;sentence_length=0;;end;;end;;[syn0,syn1neg];;end;;" You can convert word2vec.rb
57. ### Next Problem: How To Run a Julia Program from Ruby

For example: run the external program like this? Process.spawn("echo 'p 123' | julializer | julia", :out=>"STDOUT") Obviously not a good solution (you need to marshal data manually, and the Julia language VM must be booted up at every single function call).
58. ### Idea: IPC With Julia

What if we could pass arbitrary Ruby values to a running Julia background process through a Module via IPC?
59. ### Introducing virtual_module

github.com/remore/virtual_module. An IPC module generator. Julia and Python are supported as a background process. Marshaling with msgpack.
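To make the idea of a long-lived background process more concrete, here is a minimal sketch of the pattern: boot a child process once, then send it serialized requests over a pipe and read serialized replies. This is not how virtual_module is implemented. As labeled assumptions, a Ruby child stands in for the Julia process so the sketch runs without Julia, JSON from the standard library stands in for msgpack, and the names `LoopWorker` and `WORKER_SOURCE` are hypothetical.

```ruby
require "json"
require "rbconfig"

# Hypothetical worker: reads one JSON request per line, replies with one JSON line.
# A Ruby child stands in for the Julia background process (assumption).
WORKER_SOURCE = <<~'RUBY'
  require "json"
  $stdout.sync = true
  while (line = $stdin.gets)
    req = JSON.parse(line)
    x = req["x"]
    1.upto(req["n"]) { |i| x += i }
    puts JSON.generate({ "result" => x })
  end
RUBY

class LoopWorker
  def initialize
    # Boot the child once; every later call reuses the same process.
    @io = IO.popen([RbConfig.ruby, "-e", WORKER_SOURCE], "r+")
  end

  # Marshal the arguments, send them, and unmarshal the reply.
  def sample_loop(n, x)
    @io.puts(JSON.generate({ "n" => n, "x" => x }))
    @io.flush
    JSON.parse(@io.gets).fetch("result")
  end

  def close
    @io.close
  end
end

worker = LoopWorker.new
p worker.sample_loop(10, 2.5)
p worker.sample_loop(1000, 2.5)
worker.close
```

The start-up cost is paid once in `LoopWorker.new`, which is the property the `Process.spawn` approach lacks.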
60. ### Sample Usage(1): Calling Julia from Ruby

jl = VirtualModule.new(:julia=>["Clustering"])
include jl
r = Clustering.kmeans(jl.rand(5, 1000), 20, maxiter:200, display: :iter)
p Clustering.assignments(r) # [3, 13, 2, 7, 15, 12, 10, ... 13, 1, 8]
61. ### Sample Usage(1): Calling Julia from Ruby

By calling VirtualModule.new, a Julia background process is booted up and starts to idle.
62. ### Sample Usage(1): Calling Julia from Ruby

VirtualModule.new gives you back an instance of the Module class.
63. ### Sample Usage(1): Calling Julia from Ruby

Since it's an instance of the Module class, you can #include it.
64. ### Sample Usage(1): Calling Julia from Ruby

Now is the time to call an arbitrary function in Julia. Every parameter passed from Ruby's world is converted to a Julia value via msgpack.
65. ### Sample Usage(1): Calling Julia from Ruby

msgpack converts only basic data types (such as Integer, String, Array etc.). In this case, since the kmeans function returns a value of type `Clustering.KmeansResult{Float64}`, r is still an instance of the Module class, which keeps a pointer to the `Clustering.KmeansResult{Float64}` value held in the background process.
66. ### Sample Usage(1): Calling Julia from Ruby

Since the Clustering.assignments function returns a basic data type which can be converted to a Ruby Array, we finally get the clustering result!
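The distinction between values that can be marshaled and values that stay behind in the background process can be illustrated with a small in-process sketch. Everything here is hypothetical and is not virtual_module's actual mechanism: `RemoteSide` simulates the background process inside the same Ruby process, `Handle` plays the role of the pointer that `r` holds, and `KmeansLikeResult` is a made-up stand-in for a non-basic result type.

```ruby
# A handle is all the caller receives for values that cannot be marshaled.
Handle = Struct.new(:id)

# In-process stand-in for the background process: it owns every "remote" value.
class RemoteSide
  BASIC = [Integer, Float, String, Array, Hash, NilClass, TrueClass, FalseClass].freeze

  def initialize
    @objects = {}
    @next_id = 0
  end

  # Run the block on the "remote" side, resolving any handles among the
  # arguments, then decide how the result travels back to the caller.
  def call(*args)
    resolved = args.map { |a| a.is_a?(Handle) ? @objects.fetch(a.id) : a }
    export(yield(*resolved))
  end

  private

  # Basic values are returned directly; anything else is kept here and
  # only a handle is returned.
  def export(value)
    return value if BASIC.any? { |klass| value.is_a?(klass) }
    @next_id += 1
    @objects[@next_id] = value
    Handle.new(@next_id)
  end
end

# Made-up non-basic result type, standing in for a Julia-side struct.
KmeansLikeResult = Struct.new(:assignments)

remote = RemoteSide.new
r = remote.call { KmeansLikeResult.new([3, 13, 2]) }
p r.class
p remote.call(r) { |res| res.assignments }
```

The first call yields only a handle, while the second passes that handle back and receives a plain Array, mirroring the `kmeans` then `assignments` sequence described above.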
67. ### Sample Usage(2): Calling Python(sklearn) from Ruby

skl = VirtualModule.new( :lang=>:python, :pkgs=>["sklearn"=>["datasets", "svm", "grid_search", "cross_validation"]] )
include skl
iris = datasets.load_iris(:_)
clf = grid_search.GridSearchCV( svm.LinearSVC(:_), {'C':[1, 3, 5],'loss':['hinge', 'squared_hinge']}, verbose:0 )
clf.fit(iris.data, iris.target)
p "Best Params: #{best_params = clf.best_params_}" #"Best Params: {\"loss\"=>\"squared_hinge\", \"C\"=>1}"
score = cross_validation.cross_val_score( svm.LinearSVC(loss:'squared_hinge', C:1), iris.data, iris.target, cv:5 )
p "Scores: #{[:mean,:min,:max,:std].map{|e| e.to_s + '=' + score.send(e, :_).to_s }.join(',')}" # "Scores: mean=0.9666666666666668,min=0.9,max=1.0,std=0.04216370213557838"
68. ### Sample Usage(3): Defining Custom Methods With Julializer

vm = VirtualModule.new(methods:<<EOS, :transpiler=>->(s) {Julializer.ruby2julia(s)})
def init_table(list)
for i in 0..list.size-1
list[i]+=Random.rand
end
list
end
EOS
p vm.init_table([1,20]) # [1.3066601775641218, 20.17001189249985]
69. ### More Details Can Be Found At

Sample Snippets: remore/virtual_module/blob/master/example/calc.rb, remore/virtual_module/blob/master/example/scipy.rb, remore/virtual_module/blob/master/example/word2vec.rb. Corresponding Blog: rimuru.lunanet.gr.jp/blog/calling-python-and-julia-libraries-from-ruby
70. ### Let's Run The Simple Huge Loop

\$ SRC=\$(cat << EOS "p VirtualModule.new(methods:<<METHOD).sample_loop(2.5) def sample_loop(n) for i in 1..\\$N n = i+n end n end METHOD" EOS )
71. ### > N=1e5 && eval echo "\$SRC" | time ruby -r virtual_module

5000050002.5
2.68 real 1.58 user 0.36 sys
[Chart: elapsed time by number of loops, up to 1e5; Ruby, Python, C, Julia, VirtualModule]

73. ### > N=1e6 && eval echo "\$SRC" | time ruby -r virtual_module

500000500002.5
2.20 real 1.46 user 0.25 sys
[Chart: elapsed time by number of loops, up to 1e6; Ruby, Python, C, Julia, VirtualModule]
74. ### > N=1e7 && eval echo "\$SRC" | time ruby -r virtual_module

50000005000002.5
1.68 real 1.51 user 0.21 sys
[Chart: elapsed time by number of loops, up to 1e7; Ruby, Python, C, Julia, VirtualModule]
75. ### > N=1e8 && eval echo "\$SRC" | time ruby -r virtual_module

5.000000050000003e+15
1.95 real 1.75 user 0.21 sys
[Chart: elapsed time by number of loops, up to 1e8; Ruby, Python, C, Julia, VirtualModule]
76. ### > N=1e9 && eval echo "\$SRC" | time ruby -r virtual_module

5.00000000067109e+17
4.50 real 4.29 user 0.21 sys
[Chart: elapsed time by number of loops, up to 1e9; Ruby, Python, C, Julia, VirtualModule]

78. ### Benchmarking Using Pure-Ruby word2vec implementation

\$ cd example
\$ ruby word2vec.rb --output /tmp/vectors.bin --train ../doc/benchmark_word2vec/training_data/10mb.txt --size 20 --window 10 --negative 5 --sample 1e-4 --binary 1 --iter 3 --debug 0 > /dev/null 2>&1
\$ python
Python 2.7.12 (default, Jul 1 2016, 15:12:24) [GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gensim
>>> model = gensim.models.Word2Vec.load_word2vec_format('/tmp/vectors.bin', binary=True)
>>> model.most_similar("japan")
[(u'netherlands', 0.9741939902305603), (u'china', 0.9712631702423096), (u'county', 0.9686408042907715), (u'spaniards', 0.9669440388679504), (u'vienna', 0.9614173769950867), (u'abu', 0.9587018489837646), (u'korea', 0.9565504789352417), (u'canberra', 0.954473614692688), (u'erupts', 0.9540712833404541), (u'prefecture', 0.9534248113632202)]

82. ### Yes done, in the sense of making a pure-Ruby word2vec program more than 3x faster

But…
83. ### But Still Problematic

If the size of the text gets seriously huge, there is still a big gap compared to C….
84. ### Chapter 3 Why Slow

An ongoing profiling attempt to find out which part of the program causes this performance issue

86. ### ruby-prof

\$ cat profile.rb
RubyProf.start
x=2.5
1.upto(1e4){|i| x=x+i}; p x
result = RubyProf.stop
RubyProf::FlatPrinter.new(result).print(STDOUT)
\$ ruby -r ruby-prof profile.rb
87. ### #upto is the slowest. #> is also slow.

\$ ruby -r ruby-prof profile.rb
50005002.5
Measure Mode: wall_time
Thread ID: 70204763724260
Fiber ID: 70204768000900
Total: 0.009597
Sort by: self_time

| %self | total | self | wait | child | calls | name |
|---|---|---|---|---|---|---|
| 52.19 | 0.010 | 0.005 | 0.000 | 0.005 | 1 | Integer#upto |
| 16.92 | 0.002 | 0.002 | 0.000 | 0.000 | 10001 | Fixnum#> |
| 15.79 | 0.002 | 0.002 | 0.000 | 0.000 | 10000 | Fixnum#+ |
| 14.60 | 0.001 | 0.001 | 0.000 | 0.000 | 10000 | Float#+ |
| 0.22 | 0.010 | 0.000 | 0.000 | 0.010 | 1 | Global#[No method] |
| 0.19 | 0.000 | 0.000 | 0.000 | 0.000 | 1 | Kernel#p |
| 0.09 | 0.000 | 0.000 | 0.000 | 0.000 | 1 | Float#inspect |
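One way to cross-check the observation about `#upto` without a profiler's instrumentation overhead is to time the block-based loop against a plain `while` loop that does the same additions. The sketch below uses only the standard `benchmark` library; the method names `with_upto` and `with_while` and the loop count are assumptions, and no particular outcome is claimed, since results depend on the Ruby version and machine.

```ruby
require "benchmark"

N = 1_000_000 # loop count chosen for a short run (assumption)

# Block-based iteration, as in the profiled script.
def with_upto(n)
  x = 2.5
  1.upto(n) { |i| x += i }
  x
end

# The same additions with an explicit counter and no block call per step.
def with_while(n)
  x = 2.5
  i = 1
  while i <= n
    x += i
    i += 1
  end
  x
end

Benchmark.bm(6) do |bm|
  bm.report("upto")  { with_upto(N) }
  bm.report("while") { with_while(N) }
end
```

Both methods return the same value, so any difference in the report comes from how the iteration is driven rather than from the arithmetic.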

89. ### InstructionSequence

\$ ruby -e "printf RubyVM::InstructionSequence.compile('x=2.5; 1.upto(1e4){ |i| x = x+i }').disasm"
90. ### \$ ruby -e "printf RubyVM::InstructionSequence.compile('x=2.5; 1.upto(1e4){ |i| x = x+i }').disasm"

== disasm: #<ISeq:<compiled>@<compiled>>================================
== catch table
| catch type: break st: 0006 ed: 0013 sp: 0000 cont: 0013
|------------------------------------------------------------------------
local table (size: 2, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] x
0000 trace 1 ( 1)
0002 putobject 2.5
0004 setlocal_OP__WC__0 2
0006 putobject_OP_INT2FIX_O_1_C_
0007 putobject 10000.0
0009 send <callinfo!mid:upto, argc:1>, <callcache>, block in <compiled>
0013 leave
== disasm: #<ISeq:block in <compiled>@<compiled>>=======================
== catch table
| catch type: redo st: 0002 ed: 0014 sp: 0000 cont: 0002
| catch type: next st: 0002 ed: 0014 sp: 0000 cont: 0014
|------------------------------------------------------------------------
local table (size: 2, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] i<Arg>
0000 trace 256 ( 1)
0002 trace 1
0004 getlocal_OP__WC__1 2
0006 getlocal_OP__WC__0 2
0008 opt_plus <callinfo!mid:+, argc:1, ARGS_SIMPLE>, <callcache>
0011 dup
0012 setlocal_OP__WC__1 2
0014 trace 512
0016 leave

This will help us understand what's happening internally.
91. ### \$ ruby -e "printf RubyVM::InstructionSequence.compile('x=2.5; 1.upto(1e4){ |i| x = x+i }').disasm"

But which instruction on earth is really slow?
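The disassembly shows which instructions exist, but not how often each one appears. As a small illustration, the sketch below tallies instruction names by parsing the text that `RubyVM::InstructionSequence#disasm` returns. It is MRI-specific, the exact instruction set varies between Ruby versions (for example, the `trace` instructions shown above are absent from newer releases), and a static count like this says nothing about time spent, which is the question a profiler has to answer.

```ruby
src = "x=2.5; 1.upto(1e4){ |i| x = x+i }"
iseq = RubyVM::InstructionSequence.compile(src)

# Each instruction line of the disassembly starts with a numeric offset
# followed by the instruction name; other lines (headers, tables) are skipped.
counts = Hash.new(0)
iseq.disasm.each_line do |line|
  name = line[/\A\d+\s+(\w+)/, 1]
  counts[name] += 1 if name
end

# Print the static instruction counts, most frequent first.
counts.sort_by { |name, count| [-count, name] }.each do |name, count|
  puts format("%-30s %d", name, count)
end
```

Because `disasm` includes the block's instruction sequence as well, the tally covers both the top-level code and the body passed to `upto`.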

93. ### Introducing yarv-prof

github.com/remore/yarv-prof. A tiny DTrace-based YARV profiler. Instrumented profiling with walltime or cputime. Only basic datasets are provided so far. Still under development.
94. ### yarv-prof

require 'yarv-prof'
YarvProf.start(clock: :cpu, out:'~/log/')
x=2.5
1.upto(N){|i| x=x+i }
p x
YarvProf.end

continued)
97. ### Chapter 4 Your Turn!

Why not make an attempt yourself towards Ruby 3x3? Or even “5xRuby”?