
ConcurrencyInRuby3-JosepEgea.pdf


Ruby 3 introduced two interesting novelties: the Fiber Scheduler and Ractors.

In this talk we'll review all the concurrency options available in Ruby, both before and after Ruby 3, including Threads, Fibers and Ractors, and we'll try to visualize how each of them can be used with live coding examples.

Join us and discover why one execution flow per program is not enough!!

Presented by Josep Egea at Madrid.rb in May 2021

https://www.madridrb.com/topics/concurrency-in-ruby-3-933
https://josepegea.com

Josep Egea

May 31, 2021



Transcript

  1. CONCURRENCY IN RUBY 3
    JOSEP EGEA
    [email protected]


  2. KICK OFF!!


  3. THANKS FOR COMING!!!
    Thanks to members of Madrid.rb
    Thanks to people from outside of Madrid!
    Thanks to our Sponsors! We ❤ you!


  4. WHO'S JOSEP
    Software developer & Entrepreneur
    [email protected]
    https://www.josepegea.com
    https://github.com/josepegea


  5. ABOUT PLATFORM161 / VERVE GROUP
    Working with Platform161.com / Verve Group
    And, yes! We're hiring!
    https://platform161.com
    https://verve.com


  6. ABOUT TODAY


  7. THIS IS NOT
    NOT expert advice on Ruby concurrency
    In fact, I learned most of this while preparing this talk, so take it with as much salt as your nutritionist would allow
    Seriously, I'd like you to try this in production, but if you break something, you take responsibility!!
    Most of what we'll see are just ad hoc playground tests


  8. EXPECTED TAKEAWAYS
    Review concurrency options in Ruby
    Test the new concurrency features in Ruby 3
    Provide some pragmatic reasons for upgrading
    Examine some use cases
    Get you closer to concurrent Ruby code


  9. GENERAL TOPICS


  10. CONCURRENCY VS PARALLELISM
    Credits: Stack Overflow - https://stackoverflow.com/questions/1050222/


  11. PREEMPTIVE VS COOPERATIVE MULTITASKING
    In Cooperative systems, each task gives back control
    In Preemptive systems, the OS takes away control from the task
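    A minimal illustration (mine, not from the slides) of the cooperative model, using plain Ruby Fibers: nothing preempts the fiber, it runs until it explicitly gives control back.

    # Cooperative multitasking in miniature: the fiber decides when to pause.
    fiber = Fiber.new do
      puts "step 1"
      Fiber.yield          # hand control back to the caller
      puts "step 2"
    end

    fiber.resume           # runs until the yield, prints "step 1"
    puts "caller's turn"   # the fiber stays paused until we resume it again
    fiber.resume           # prints "step 2"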


  12. TOOLS AT OUR DISPOSAL
    Tool              | Multitasking | Data Sharing      | Problems
    Several Computers | Yes          | Only Network      | Complex
    OS Processes      | Yes          | IPC/Disk          | Heavy on memory usage
    Threads           | Yes          | Everything shared | Race conditions
    Fibers            | No           | Everything shared | Manual Scheduling
    Event Loops       | No           | Everything shared | Callback hell


  13. WHAT IS BETTER???


  14. SHOULDN'T PREEMPTIVE ALWAYS BE BETTER?
    More resource usage
    Explicit coordination can be more efficient than forced takeovers
    Synchronization is hard!!
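    To make "Synchronization is hard" concrete, here is a small sketch (not from the talk) of the classic lost-update race, plus the Mutex that fixes it:

    # Unsynchronized read-modify-write: increments can be lost when threads
    # interleave (intermittently on CRuby, reliably on truly parallel runtimes).
    counter = 0
    10.times.map { Thread.new { 100_000.times { counter += 1 } } }.each(&:join)
    puts "without mutex: #{counter}"   # may be less than 1_000_000

    # Explicit coordination restores correctness, at a cost in code and contention.
    counter = 0
    mutex = Mutex.new
    10.times.map { Thread.new { 100_000.times { mutex.synchronize { counter += 1 } } } }.each(&:join)
    puts "with mutex: #{counter}"      # always 1_000_000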


  15. BUT COOPERATIVE IS HARD, TOO
    Event loops turn your program upside down (callback hell)
    Fibers are more natural, but still, deciding when to yield control is not easy


  16. COMMON SCENARIOS
    Server dealing with several clients
    Puma, Falcon, Thin
    A batch job master that manages several workers
    Sidekiq
    Simple tasks that require some concurrency
    Querying several external services from a Rails request


  17. CONCURRENCY IN RUBY


  18. BEFORE RUBY 3
    Processes
    Threads
    Event loops (through external libs)
    Fibers
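    OS processes don't appear again in the talk's examples; a minimal sketch (mine, not the speaker's) of what that option looks like:

    # Each fork gets its own copy of the parent's memory: no shared-state races,
    # but also no cheap data sharing (results travel via pipes, files, sockets...).
    pids = 3.times.map do |i|
      fork { puts "worker #{i} in process #{Process.pid}" }
    end
    pids.each { |pid| Process.wait(pid) }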


  19. NEW ON RUBY 3
    Fiber Scheduler
    Ractors


  20. FIBER SCHEDULER IN A SENTENCE
    Concurrent IO with almost no changes
    Instead of having to orchestrate the Fibers yourself, the Scheduler does that for you, automatically, when they get blocked by IO ops.
    You need an external scheduler (Ruby 3 only provides the interface for it). Thanks to Async, now there's one.
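    Underneath the Async gem sits the interface the slide mentions; a rough sketch of using it directly (the scheduler class name here is an assumption, any object implementing the Fiber Scheduler hooks works, and the async gem ships one):

    require 'async'
    require 'net/http'
    require 'uri'

    Fiber.set_scheduler(Async::Scheduler.new)   # core Ruby 3 API: install a scheduler for this thread

    # Each Fiber.schedule block becomes a non-blocking fiber: when it hits
    # blocking IO, the scheduler parks it and runs the other fibers.
    Fiber.schedule { puts Net::HTTP.get(URI('https://example.com')).bytesize }
    Fiber.schedule { sleep 1; puts "slept without blocking the other fiber" }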


  21. RACTORS IN A SENTENCE
    Threads not blocked by the GIL
    Unlike Threads, Ractors can run in a truly parallel fashion, but in exchange they can share data only in certain ways.
    They're also an experimental feature, as of today (May '21)
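    The smallest possible round trip, as an illustration (not a slide from the talk):

    # Each Ractor gets its own GIL-free execution context; data goes in as
    # arguments and comes back out with #take. Ruby 3.0 prints an
    # "experimental feature" warning the first time one is created.
    r = Ractor.new(21) do |n|
      n * 2
    end
    puts r.take   # => 42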


  22. PLAY TIME!!


  23. A SAMPLE JOB
    Get 2 pieces of data from external APIs
    Process the combination
    Save the results to disk
    Send the results to an external API


  24. SEQUENTIAL IMPLEMENTATION
    $data = read_text
    $replacement = get_replacement
    $results = process_data($data, $replacement)
    save_results($results)
    upload_results($results)
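    The helpers used across these examples live in the talk's playground repo (https://github.com/josepegea/async_test); the stand-ins below are only an assumption of their shape, so the snippets can be read on their own. The URLs are placeholders.

    require 'net/http'
    require 'uri'

    def read_text
      Net::HTTP.get(URI('https://example.com/text'))          # IO-bound: fetch input data
    end

    def get_replacement
      Net::HTTP.get(URI('https://example.com/replacement'))   # IO-bound: fetch a second piece of data
    end

    def process_data(data, replacement)
      data.gsub('lorem', replacement)                         # CPU-bound: combine/transform
    end

    def save_results(results)
      File.write('results.txt', results)                      # IO-bound: write to disk
    end

    def upload_results(results)
      Net::HTTP.post(URI('https://example.com/upload'), results) # IO-bound: push to an external API
    end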


  25. MULTITHREADED IMPLEMENTATION
    threads = []
    threads << Thread.new { $data = read_text }
    threads << Thread.new { $replacement = get_replacement }
    threads.map(&:join)
    $results = process_data($data, $replacement)
    threads = []
    threads << Thread.new { save_results($results) }
    threads << Thread.new { upload_results($results) }
    threads.map(&:join)


  26. FIBER SCHEDULER IMPLEMENTATION
    require 'async'

    Async do
      Async { $data = read_text }
      Async { $replacement = get_replacement }
    end

    $results = process_data($data, $replacement)

    Async do
      Async { save_results($results) }
      Async { upload_results($results) }
    end


  27. SO …
    Fibers are almost as concurrent as Threads
    Fibers are less resource intensive than Threads
    The Fiber code is remarkably simple!
    By definition, Fibers are thread safe
    But they're less concurrent, because some calls still block:
    DNS lookups
    File operations


  28. WHAT ABOUT CPU BOUND TASKS
    Threads are truly preemptive, so they should be able to do concurrent CPU-bound tasks
    But the GIL allows only one thread to run Ruby code at a time
    Ractors are here to solve just this
    Let's try them out!
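    Before jumping to Ractors, a quick way (my sketch, not the speaker's) to see the GIL's effect on plain Threads: on CRuby the threaded version takes roughly as long as the sequential one for CPU-bound work.

    require 'benchmark'

    work = -> { 3_000_000.times { Math.sqrt(rand) } }   # pure CPU, no IO

    puts Benchmark.realtime { 4.times { work.call } }                          # sequential
    puts Benchmark.realtime { 4.times.map { Thread.new(&work) }.each(&:join) } # threaded, but GIL-bound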


  29. SIMPLE EXAMPLE
    Get data
    Run the processing task 5 times
    The tasks share no data


  30. SEQUENTIAL IMPLEMENTATION
    $data = read_text

    5.times do |idx|
      process_data($data.dup, 'test')
    end


  31. FIBER IMPLEMENTATION
    require 'async'

    $data = read_text

    Async do
      5.times do |idx|
        Async do
          process_data($data.dup, 'test')
        end
      end
    end


  32. THREADS IMPLEMENTATION
    $data = read_text

    threads = []
    5.times do |idx|
      threads << Thread.new do
        process_data($data.dup, 'test')
      end
    end
    threads.map(&:join)


  33. RACTORS IMPLEMENTATION
    $data = read_text.freeze

    ractors = []
    5.times do |idx|
      ractors << Ractor.new(idx, $data) do |i, data|
        process_data(data, 'test')
      end
    end
    ractors.map(&:take)


  34. SO …
    As expected, Fibers are not useful here
    Even Threads, despite preemptiveness, are far from ideal parallelism, because of the GIL
    Ractors do deliver the promised parallelism (in a multicore setup)
    But (there's always a but…)
    Ractors are still experimental
    Data sharing among Ractors is more constrained
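    A concrete illustration (mine) of why data sharing is more constrained, and why slide 33 freezes $data and passes it in as an argument:

    # A non-main Ractor may not read the main Ractor's globals or captured
    # locals ($data.upcase inside the block would raise Ractor::IsolationError).
    # Frozen objects are shareable, so they cross the boundary without copying.
    $data = "some text".freeze

    r = Ractor.new($data) do |data|   # data arrives explicitly, as an argument
      data.upcase
    end
    puts r.take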


  35. THIRD EXAMPLE: PAGING THROUGH AN API
    Get data from an API in pages
    Process each page
    Upload each of the results


  36. SEQUENTIAL IMPLEMENTATION
    $current_page = 1

    begin
      page_data = get_from_api(page_index: $current_page, page_size: $page_size) # page_size value is cut off on the original slide
      $total_pages ||= page_data['total_pages']
      new_data = process_api_data(page_data['data'])
      upload_results(new_data)
      $current_page += 1
    end while $current_page <= $total_pages


  37. THREADED IMPLEMENTATION
    $down_queue = Queue.new
    $up_queue = Queue.new
    $threads = []

    $threads << Thread.new do
      current_page = 1
      begin
        page_data = get_from_api(page_index: current_page, page_size: $page_size) # page_size value is cut off on the original slide
        $down_queue << page_data['data']
        total_pages ||= page_data['total_pages']
        current_page += 1
      end while current_page <= total_pages
      $down_queue.close
    end
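    The slide is cut off after the producer thread; a hypothetical sketch of how the consumer side of such a pipeline could continue (my reconstruction, not the original code):

    # Workers drain the download queue and feed the upload queue; a final
    # thread drains that one. Queue#pop returns nil once a queue is closed
    # and empty, which ends each loop cleanly.
    workers = 3.times.map do
      Thread.new do
        while (page = $down_queue.pop)
          $up_queue << process_api_data(page)
        end
      end
    end

    uploader = Thread.new do
      while (results = $up_queue.pop)
        upload_results(results)
      end
    end

    $threads.each(&:join)   # wait for the producer above
    workers.each(&:join)    # then for the page processors
    $up_queue.close         # signal the uploader that no more results are coming
    uploader.join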


  38. FIBER IMPLEMENTATION
    require 'async'

    $down_queue = Queue.new
    $up_queue = Queue.new
    $current_page = 1

    Async do
      Async do
        begin
          page_data = get_from_api(page_index: $current_page, page_size: $page_size) # page_size value is cut off on the original slide
          $down_queue << page_data['data']
          $total_pages ||= page_data['total_pages']
          $current_page += 1
        end while $current_page <= $total_pages
        $down_queue.close
      end
    end


  39. SO…
    Again, Fibers are almost as concurrent as Threads
    We cannot use Ractors because of data sharing issues with the URI gem


  40. RECAP


  41. WHAT WE LEARNED…
    Ruby has lots of ways to do concurrency!!
    The Fiber Scheduler, on its own, makes it worth upgrading to Ruby 3
    For general concurrency problems, though, Threads are still the best option
    Ractors show some real promise!! Ruby will never be a language for CPU-intensive tasks, but, once they mature, Ractors should make them easier


  42. DID WE GET WHAT WE WANTED???
    Review concurrency options in Ruby
    Test the new concurrency features in Ruby 3
    Provide some pragmatic reasons for upgrading
    Examine some use cases
    Get you closer to concurrent Ruby code
    I hope you did, and had some fun, too!!


  43. REFERENCES
    Ruby Documentation
    https://www.ruby-lang.org/en/news/2020/12/25/ruby-3-0-0-released
    https://github.com/ruby/ruby/blob/master/doc/fiber.md
    https://ruby-doc.org/core-3.0.1/Fiber.html
    https://ruby-doc.org/core-3.0.1/Thread.html
    https://github.com/ruby/ruby/blob/master/doc/ractor.md
    https://ruby-doc.org/core-3.0.1/Ractor.html
    Additional Gems
    https://github.com/socketry/async
    Great explanations
    Don't Wait For Me! by Samuel Williams
    Ruby 3 and the new Fiber Scheduler Interface, by Wander Hillen
    Code used in this talk
    https://github.com/josepegea/async_test


  44. BIG THANKS!!
    Hope you had some fun!!
    [email protected] | www.josepegea.com
