
ConcurrencyInRuby3-JosepEgea.pdf


Ruby 3 introduced two interesting novelties: the Fiber Scheduler and Ractors.

In this talk we'll review all the concurrency options available in Ruby, both before and after Ruby 3, including Threads, Fibers and Ractors, and we'll try to visualize how each of them can be used with live coding examples.

Join us and discover why one execution flow per program is not enough!!

Presented by Josep Egea at Madrid.rb in May 2021

https://www.madridrb.com/topics/concurrency-in-ruby-3-933
https://josepegea.com

Josep Egea

May 31, 2021

Transcript

  1. THANKS FOR COMING!!!
    Thanks to members of Madrid.rb
    Thanks to people from outside of Madrid!
    Thanks to our Sponsors!
    We ❤ you!
  2. ABOUT PLATFORM161 / VERVE GROUP
    Working with Platform161.com / Verve Group
    And, yes! We're hiring!
    https://platform161.com
    https://verve.com
  3. THIS IS NOT
    Expert advice on Ruby concurrency
    In fact, I learned most of this while preparing this talk, so take it with as much salt as your nutritionist would allow
    Seriously, I'd like you to try this in production, but if you break something, you take responsibility!!
    Most of what we'll see are just ad hoc playground tests
  4. EXPECTED TAKEAWAYS
    Review concurrency options in Ruby
    Test the new concurrency features in Ruby 3
    Provide some pragmatic reasons for upgrading
    Examine some use cases
    Get you closer to Ruby concurrent code
  5. PREEMPTIVE VS COOPERATIVE MULTITASKING
    In Cooperative systems, each task gives back control
    In Preemptive systems, the OS takes away control from the task
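The difference can be sketched with Ruby's own primitives. This is only a minimal illustration, not code from the talk: the fiber decides where it hands control back, while the two threads are switched by the runtime and simply report their results when done. The variable names are made up for the example.

```ruby
# Cooperative: a Fiber runs until it explicitly hands control back.
log = []

fiber = Fiber.new do
  log << "fiber: step 1"
  Fiber.yield              # give control back to whoever resumed us
  log << "fiber: step 2"
end

fiber.resume               # runs the fiber up to Fiber.yield
log << "main: between steps"
fiber.resume               # runs the rest of the fiber

puts log

# Preemptive: the tasks never say when to switch; the runtime decides.
# Each thread only builds its own array, so no synchronization is needed.
threads = %w[a b].map do |name|
  Thread.new { 3.times.map { |i| "#{name}#{i}" } }
end

# Thread#value waits for the thread and returns the block's result.
thread_results = threads.map(&:value)
p thread_results
```

The fiber half is fully deterministic because every switch is written in the code; with threads, only the final results are predictable, not the order in which the work was interleaved.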
  6. TOOLS AT OUR DISPOSAL
                         Multitasking   Data Sharing        Problems
    Several Computers    Yes            Only Network        Complex
    OS Processes         Yes            IPC/Disk            Heavy on memory usage
    Threads              Yes            Everything shared   Race conditions
    Fibers               No             Everything shared   Manual Scheduling
    Event Loops          No             Everything shared   Callback hell
  7. SHOULDN'T PREEMPTIVE ALWAYS BE BETTER?
    More resource usage
    Explicit coordination can be more efficient than forced takeovers
    Synchronization is hard!!
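To make "synchronization is hard" a bit more concrete, here is a small sketch (not from the talk) of the usual fix for a shared counter: a Mutex around the read-modify-write. The names `counter` and `lock` are only illustrative.

```ruby
counter = 0
lock = Mutex.new

threads = 4.times.map do
  Thread.new do
    1_000.times do
      # `counter += 1` is a read followed by a write. Without the lock,
      # a thread switch between the two could lose an update.
      lock.synchronize { counter += 1 }
    end
  end
end

threads.each(&:join)
puts counter
```

The lock makes the result predictable, at the price of every thread having to queue for it, which is one of the costs the slide alludes to.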
  8. BUT COOPERATIVE IS HARD, TOO
    Event loops turn your program upside down (callback hell)
    Fibers are more natural, but still, deciding when to yield control is not easy
  9. COMMON SCENARIOS
    Server dealing with several clients
      Puma, Falcon, Thin
    A batch job master that manages several workers
      Sidekiq
    Simple tasks that require some concurrency
      Querying several external services from a Rails request
  10. FIBER SCHEDULER IN A SENTENCE
    Concurrent IO with almost no changes
    Instead of having to orchestrate the Fibers yourself, the Scheduler does that for you, automatically, when they get blocked by IO ops.
    You need an external scheduler (Ruby 3 only provides the interface for it).
    Thanks to Async, now there's one.
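The "Ruby 3 only provides the interface" point can be checked from plain Ruby. The sketch below assumes a stock Ruby 3.x process in which no gem has installed a scheduler for the current thread; in that situation `Fiber.scheduler` returns nil and `Fiber.schedule` refuses to run.

```ruby
# With no scheduler installed, there is nobody to hand blocked fibers to.
scheduler_before = Fiber.scheduler   # nil unless one was set on this thread

error = begin
  Fiber.schedule { sleep 0.1 }       # needs a scheduler to create a non-blocking fiber
  nil
rescue RuntimeError => e
  e
end

puts "scheduler installed? #{!scheduler_before.nil?}"
puts "Fiber.schedule raised: #{error.class}" if error
```

A scheduler object is installed with `Fiber.set_scheduler`; writing such an object is the part Ruby leaves to external libraries.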
  11. RACTORS IN A SENTENCE
    Threads not blocked by the GIL
    Ractors within the same process can run in truly parallel fashion, but in exchange they can share data only in certain ways.
    They're also an experimental feature, as of today (May '21)
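What "share data only in certain ways" means can be probed without starting any Ractor. This is a small sketch, assuming Ruby 3.0 or later: immutable values are shareable, mutable ones are not until they are deep-frozen. The variable names are invented for the example.

```ruby
mutable = "hello".dup          # unfrozen String: not shareable
frozen  = "hello".freeze       # frozen String: shareable
nested  = ["a".dup, "b".dup]   # mutable Array holding mutable Strings

checks = {
  mutable:       Ractor.shareable?(mutable),
  frozen:        Ractor.shareable?(frozen),
  integer:       Ractor.shareable?(42),
  nested_before: Ractor.shareable?(nested)
}

# Deep-freezes the array and everything it references.
Ractor.make_shareable(nested)
checks[:nested_after] = Ractor.shareable?(nested)

p checks
```

Anything that is not shareable gets copied (or moved) when it is passed to another Ractor, which is why the slides freeze the input text before handing it over.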
  12. A SAMPLE JOB
    Get 2 pieces of data from external APIs
    Process the combination
    Save the results to disk
    Send the results to an external API
  13. SEQUENTIAL IMPLEMENTATION
    $data = read_text
    $replacement = get_replacement
    $results = process_data($data, $replacement)
    save_results($results)
    upload_results($results)
  14. MULTITHREADED IMPLEMENTATION
    threads = []
    threads << Thread.new { $data = read_text }
    threads << Thread.new { $replacement = get_replacement }
    threads.map(&:join)
    $results = process_data($data, $replacement)
    threads = []
    threads << Thread.new { save_results($results) }
    threads << Thread.new { upload_results($results) }
    threads.map(&:join)
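A variation on the same idea, as a sketch rather than the talk's actual code: `Thread#value` joins a thread and returns its block's result, which avoids passing data through globals. The helper bodies below are hypothetical stand-ins (the real ones call external APIs); `sleep` simulates network latency.

```ruby
# Hypothetical stand-ins for the helpers named on the slide.
def read_text
  sleep 0.2
  "hello world"
end

def get_replacement
  sleep 0.2
  "ruby"
end

def process_data(text, replacement)
  text.sub("world", replacement)
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# Both fetches start immediately and wait on their "IO" at the same time.
text_thread        = Thread.new { read_text }
replacement_thread = Thread.new { get_replacement }

# Thread#value waits for each thread and hands back what its block returned.
result = process_data(text_thread.value, replacement_thread.value)

elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts result
puts format("elapsed: %.2fs", elapsed)
```

Because both threads sleep concurrently, the elapsed time should be close to one fetch rather than the sum of the two.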
  15. FIBER SCHEDULER IMPLEMENTATION
    Async do
      Async { $data = read_text }
      Async { $replacement = get_replacement }
    end
    $results = process_data($data, $replacement)
    Async do
      Async { save_results($results) }
      Async { upload_results($results) }
    end
  16. SO …
    Fibers are almost as concurrent as threads
    Fibers are less resource intensive than threads
    The Fiber code is notably simple!
    By definition, Fibers are thread safe
    But less concurrent because some calls block:
      DNS lookups
      File operations
  17. WHAT ABOUT CPU BOUND TASKS?
    Threads are truly preemptive, so they should be able to do concurrent CPU bound tasks
    But the GIL allows only one thread to run Ruby code at a time
    Ractors are here to solve just this
    Let's try them out!
  18. FIBER IMPLEMENTATION
    $data = read_text
    Async do
      5.times do |idx|
        Async do
          process_data($data.dup, 'test')
        end
      end
    end
  19. THREADS IMPLEMENTATION
    $data = read_text
    threads = []
    5.times do |idx|
      threads << Thread.new do
        process_data($data.dup, 'test')
      end
    end
    threads.map(&:join)
  20. RACTORS IMPLEMENTATION
    $data = read_text.freeze
    ractors = []
    5.times do |idx|
      ractors << Ractor.new(idx, $data) do |i, data|
        process_data(data, 'test')
      end
    end
    ractors.map(&:take)
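A self-contained sketch of the same pattern, with an invented CPU-bound task (a sum of squares) in place of `process_data`. Two assumptions to flag: Ruby 3.x prints an "experimental" warning when the first Ractor starts, and, as far as I know, recent Ruby versions replace `Ractor#take` with `Ractor#value`, so the sketch uses whichever method is available.

```ruby
limits = [10_000, 20_000, 30_000, 40_000]

ractors = limits.map do |limit|
  # Arguments are passed into the Ractor; the block itself must not
  # reference local variables from the surrounding scope.
  Ractor.new(limit) do |n|
    (1..n).reduce(0) { |sum, i| sum + i * i }   # CPU-bound work
  end
end

# Collect each Ractor's result. The slide uses #take (Ruby 3.0 era);
# fall back to it only when #value is not defined.
results = ractors.map { |r| r.respond_to?(:value) ? r.value : r.take }

p results
```

On a multicore machine the four computations can run at the same time, which is the parallelism that threads under the GIL cannot provide for pure Ruby code.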
  21. SO …
    As expected, Fibers are not useful here
    Even Threads, despite preemptiveness, are far from ideal parallelism, because of the GIL
    Ractors do deliver the promised parallelism (in a multicore setup)
    But (there's always a but…)
      Ractors are still experimental
      Data sharing among Ractors is more constrained
  22. THIRD EXAMPLE: PAGING THROUGH AN API
    Get data from an API in pages
    Process each page
    Upload each of the results
  23. SEQUENTIAL IMPLEMENTATION
    begin
      page_data = get_from_api(page_index: $current_page, page_size: …
      $total_pages ||= page_data['total_pages']
      new_data = process_api_data(page_data['data'])
      upload_results(new_data)
      $current_page += 1
    end while $current_page <= $total_pages
  24. THREADED IMPLEMENTATION
    $down_queue = Queue.new
    $up_queue = Queue.new
    $threads = []
    $threads << Thread.new do
      begin
        page_data = get_from_api(page_index: page_index, page_size: p…
        $down_queue << page_data['data']
        total_pages ||= page_data['total_pages']
        current_page += 1
      end while current_page <= total_pages
      $down_queue.close
    …
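Since the slide is cut off, here is a complete sketch of the queue-based pipeline it appears to describe: one thread downloads pages, one processes them, one uploads. Everything beyond the queue pattern is an assumption: the helper bodies are in-memory stand-ins, `uploaded` replaces the real upload target, and the talk's actual code (linked in the references) may be organized differently.

```ruby
# Hypothetical in-memory stand-ins for the API helpers named on the slides.
def get_from_api(page_index:)
  pages = { 1 => [1, 2], 2 => [3, 4], 3 => [5] }
  { 'data' => pages.fetch(page_index), 'total_pages' => pages.size }
end

def process_api_data(items)
  items.map { |n| n * 10 }
end

down_queue = Queue.new   # downloaded pages waiting to be processed
up_queue   = Queue.new   # processed pages waiting to be uploaded
uploaded   = []          # stands in for the remote upload target

downloader = Thread.new do
  current_page = 1
  total_pages = nil
  loop do
    page = get_from_api(page_index: current_page)
    total_pages ||= page['total_pages']
    down_queue << page['data']
    current_page += 1
    break if current_page > total_pages
  end
  down_queue.close       # signals that no more pages are coming
end

processor = Thread.new do
  # Queue#pop blocks while empty and returns nil once closed and drained.
  while (items = down_queue.pop)
    up_queue << process_api_data(items)
  end
  up_queue.close
end

uploader = Thread.new do
  while (batch = up_queue.pop)
    uploaded << batch    # only this thread writes to `uploaded`
  end
end

[downloader, processor, uploader].each(&:join)
p uploaded
```

Closing each queue is what lets the downstream thread finish: without it, the final `pop` would block forever.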
  25. FIBER IMPLEMENTATION
    $down_queue = Queue.new
    $up_queue = Queue.new
    Async do
      Async do
        begin
          page_data = get_from_api(page_index: $page_index, page_size…
          $down_queue << page_data['data']
          $total_pages ||= page_data['total_pages']
          $current_page += 1
        end while $current_page <= $total_pages
        $down_queue.close
    …
  26. SO…
    Again, Fibers are almost as concurrent as Threads
    We cannot use Ractors because of data sharing issues with the URI gem
  27. WHAT WE LEARNED…
    Ruby has lots of ways to do concurrency!!
    The Fiber Scheduler, on its own, makes upgrading to Ruby 3 worthwhile
    For general concurrency problems, though, Threads are still the best option
    Ractors show some real promise!!
    Ruby will never be a language for CPU intensive tasks, but, once they mature, Ractors should make them easier
  28. DID WE GET WHAT WE WANTED???
    Review concurrency options in Ruby
    Test the new concurrency features in Ruby 3
    Provide some pragmatic reasons for upgrading
    Examine some use cases
    Get you closer to Ruby concurrent code
    I hope you did, and had some fun, too!!
  29. REFERENCES
    Ruby Documentation
      https://www.ruby-lang.org/en/news/2020/12/25/ruby-3-0-0-released
      https://github.com/ruby/ruby/blob/master/doc/fiber.md
      https://ruby-doc.org/core-3.0.1/Fiber.html
      https://ruby-doc.org/core-3.0.1/Thread.html
      https://github.com/ruby/ruby/blob/master/doc/ractor.md
      https://ruby-doc.org/core-3.0.1/Ractor.html
    Additional Gems
      https://github.com/socketry/async
    Great explanations
      Don't Wait For Me! by Samuel Williams
      Ruby 3 and the new Fiber Scheduler Interface, by Wander Hillen
    Code used in this talk
      https://github.com/josepegea/async_test