Using Ractor

Kanazawa.rb meetup #100 2020/12/19 LT

Kunihiko Ito

December 19, 2020

Transcript

  1. p self
    • Name: Kunihiko Ito
    • From: Toyama
    • Job: Rails application programmer
    • Community: Toyama.rb
    • Twitter: @kunitoo
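  2. Ractor (Ruby + Actor)
    • A new feature arriving in Ruby 3.0.0
      > Ractor is an actor-model-like concurrency and parallelism mechanism, designed as a feature for running code in parallel in Ruby without thread-safety concerns.
    • It is an experimental feature, but my understanding was that it lifts the GVL (Global VM Lock) restriction and lets Ruby make use of multi-core performance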
  3. Ractor sample
    https://www.ruby-lang.org/ja/news/2020/12/08/ruby-3-0-0-preview2-released/

        require 'prime'

        # n.prime? with sent integers in r1, r2 run in parallel
        r1, r2 = *(1..2).map do
          Ractor.new do
            n = Ractor.receive
            n.prime?
          end
        end

        # send parameters
        r1.send 2**61 - 1
        r2.send 2**61 + 15

        # wait for the results of expr1, expr2
        p r1.take #=> true
        p r2.take #=> true
  4. Trying it out

        # prime_benchmark.rb
        require 'benchmark'
        require 'prime'

        numbers = [2**61 - 1, 2**61 + 15]

        def normal_prime?(numbers)
          numbers.each {|number| number.prime? }
        end

        def ractor_prime?(numbers)
          ractors = numbers.size.times.map do
            Ractor.new do
              n = Ractor.receive
              n.prime?
            end
          end
          ractors.each.with_index do |ractor, i|
            ractor.send numbers[i]
          end
          ractors.map(&:take)
        end

        Benchmark.bm(8) do |x|
          x.report('normal') { normal_prime?(numbers) }
          x.report('Ractor') { ractor_prime?(numbers) }
        end

        $ grep processor /proc/cpuinfo | wc -l
        12
        $ ruby -v
        ruby 3.0.0preview2 (2020-12-08 master d7a16670c3) [x86_64-linux]
        $ ruby -W0 prime_benchmark.rb
                       user     system      total        real
        normal    10.827434   0.001394  10.828828 ( 10.828979)
        Ractor    11.186445   0.000000  11.186445 (  5.838678)
  5. Ractor results
    With Ractor, the real elapsed time drops to roughly half.

        $ ruby -W0 prime_benchmark.rb
                       user     system      total        real
        normal    10.827434   0.001394  10.828828 ( 10.828979)
        Ractor    11.186445   0.000000  11.186445 (  5.838678)
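The "roughly half" claim can be checked directly from the reported numbers. The following is a minimal sketch that only does arithmetic on the figures printed in the benchmark output; the variable names are illustrative and not part of the original slides. With two Ractors, the ideal speedup would be 2x.

```ruby
# Times in seconds, copied from the benchmark output in the slides.
normal = { user: 10.827434, real: 10.828979 }
ractor = { user: 11.186445, real: 5.838678 }

# Wall-clock speedup: how many times faster the Ractor version finished.
speedup = normal[:real] / ractor[:real]

# CPU-time ratio: how much more user CPU time the Ractor version consumed.
cpu_ratio = ractor[:user] / normal[:user]

puts format('wall-clock speedup: %.2fx', speedup)
puts format('user CPU time ratio: %.2f', cpu_ratio)
```

Total CPU time stays about the same while the wall-clock time falls, which is what one would expect if the two primality checks really ran on separate cores.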
  6. Program under test

        # tsp_benchmark.rb
        require 'benchmark'
        require 'concurrent-edge'
        load './common_tsp.rb'

        class Resolver < Concurrent::Actor::Context
          def initialize(distances)
            @distances = distances
          end

          def on_message(routes)
            routes.map {|route| [route, calc_cost(route, distances)] }
          end
        end

        def concurent_ruby_solve(routes, distances)
          split_num = 12
          resolver = Resolver.spawn(:resolver, distances)
          promises = routes.each_slice(routes.size / split_num).map {|chunk| resolver.ask(chunk) }
          promises.flat_map {|promise| promise.value }.sort_by {|r| r[1] }
        end

        def normal_solve(routes, distances)
          routes.map {|route| [route, calc_cost(route, distances)] }.sort_by {|r| r[1] }
        end

        def ractor_solve(routes, distances)
          split_num = 12
          ractors = routes.each_slice(routes.size / split_num).map {|chunk|
            Ractor.new(chunk, distances) do |rs, distances|
              rs.map {|route| [route, calc_cost(route, distances)] }
            end
          }
          ractors.flat_map {|ractor| ractor.take }.sort_by {|r| r[1] }
        end

        points = distances.keys
        routes = all_routes(points)

        Benchmark.bm(16) do |x|
          x.report('normal') { normal_solve(routes, distances) }
          x.report('concurent-ruby') { concurent_ruby_solve(routes, distances) }
          x.report('Ractor') { ractor_solve(routes, distances) }
        end
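The slides do not show common_tsp.rb, so the helpers calc_cost and all_routes are only known by name. The sketch below is a hypothetical stand-in, not the author's implementation. It assumes that distances is a nested hash of the form distances[from][to] and that every route starts and ends at the first city. It is included only to make the shape of the brute-force travelling-salesman computation concrete.

```ruby
# Hypothetical stand-ins for the helpers in common_tsp.rb, which the slides do not show.

# Assumed data shape: distances[from][to] => numeric distance between two cities.
def calc_cost(route, distances)
  # Sum the distance of every consecutive leg of the route.
  route.each_cons(2).sum { |from, to| distances[from][to] }
end

# Assumed behavior: fix the first city as start and end, then permute the others.
def all_routes(points)
  start, *others = points
  others.permutation.map { |order| [start, *order, start] }
end

# Tiny hand-made example with three cities (illustrative data only).
sample_distances = {
  a: { a: 0, b: 1, c: 2 },
  b: { a: 1, b: 0, c: 3 },
  c: { a: 2, b: 3, c: 0 }
}

routes = all_routes(sample_distances.keys)
costed = routes.map { |route| [route, calc_cost(route, sample_distances)] }
p costed.min_by { |r| r[1] }
```

Under these assumptions the number of routes grows factorially with the number of cities, which is consistent with the run time in the benchmarks growing by roughly an order of magnitude per added city.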
  7. Results (9 and 10 cities)

        $ ruby -W0 tsp_benchmark.rb 9
                               user     system      total        real
        normal             0.066303   0.002983   0.069286 (  0.069286)
        concurent-ruby     0.073570   0.001926   0.075496 (  0.075137)
        Ractor             0.159783   0.055173   0.214956 (  0.094968)

        $ ruby -W0 tsp_benchmark.rb 10
                               user     system      total        real
        normal             0.759226   0.009983   0.769209 (  0.769203)
        concurent-ruby     0.705808   0.010045   0.715853 (  0.715268)
        Ractor             1.902469   0.468097   2.370566 (  1.266980)
  8. Results (11 cities)

        $ ruby -W0 tsp_benchmark.rb 11
                               user     system      total        real
        normal             8.248986   0.059027   8.308013 (  8.307973)
        concurent-ruby     9.591315   0.058012   9.649327 (  9.648718)
        Ractor            31.274292   4.293176  35.567468 ( 24.576010)

    Contrary to expectations, Ractor is the slowest!!!
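To see how the gap changes with problem size, the real times reported for 9, 10, and 11 cities can be compared as ratios. This is a small sketch that only reuses the numbers from the slides; the hash layout and variable names are illustrative.

```ruby
# Real (wall-clock) times in seconds, copied from the benchmark output in the slides.
real_times = {
  9  => { normal: 0.069286, ractor: 0.094968 },
  10 => { normal: 0.769203, ractor: 1.266980 },
  11 => { normal: 8.307973, ractor: 24.576010 }
}

# Ratio of Ractor time to single-threaded time for each problem size.
slowdowns = real_times.transform_values { |t| t[:ractor] / t[:normal] }

slowdowns.each do |cities, ratio|
  puts format('%2d cities: Ractor took %.2fx the normal time', cities, ratio)
end
```

In these measurements the ratio grows as the number of cities increases, so the Ractor version falls further behind on larger inputs rather than catching up.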
  9. Why was Ractor slow?
    Each Ractor has 1 or more Threads.
    • Threads in a Ractor shares a Ractor-wide global lock like GIL (GVL in MRI terminology), so they can't run in parallel (without releasing GVL explicitly in C-level).
    • The overhead of creating a Ractor is similar to overhead of one Thread creation.
    From the Ruby documentation: https://github.com/ruby/ruby/blob/master/doc/ractor.md

    • Was it slow because creating Ractors is expensive?
    • Is the computation itself not the kind that fully loads the CPU, so the Ractors are not fully utilized?
    • Is it slow because it cannot actually run in parallel?
    • Is the GVL restriction perhaps not actually lifted?
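One of the hypotheses above, the cost of creating Ractors, can be probed in isolation. The following is a minimal sketch, assuming Ruby 3.0 or later, that times the creation of trivial Threads and trivial Ractors. COUNT and the helper ractor_result are hypothetical names introduced here. The helper exists because newer Ruby versions are understood to replace Ractor#take with Ractor#value, so it uses whichever method is available. The sketch measures creation overhead only and says nothing about the other hypotheses.

```ruby
require 'benchmark'

# Silence the "Ractor is experimental" warning so the output stays readable.
Warning[:experimental] = false

# Hypothetical helper: fetch a Ractor's result with whichever API this Ruby provides.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

COUNT = 100 # arbitrary sample size chosen for illustration

# Time creating COUNT trivial threads and waiting for each to finish.
thread_time = Benchmark.realtime do
  COUNT.times.map { Thread.new { 1 } }.each(&:value)
end

# Time creating COUNT trivial Ractors and collecting each result.
ractor_time = Benchmark.realtime do
  COUNT.times.map { Ractor.new { 1 } }.each { |r| ractor_result(r) }
end

puts format('Thread: %.6f s, Ractor: %.6f s', thread_time, ractor_time)
```

Since tsp_benchmark.rb creates only about a dozen Ractors, a creation cost close to that of a Thread would suggest that creation overhead alone is unlikely to explain a gap of many seconds.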
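  10. Summary
    • Ruby 3.0.0 experimentally introduces Ractor, a new concurrency and parallelism mechanism
    • For some programs, using Ractor can be expected to improve performance
    • For some programs, single-threaded execution can be faster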