Slide 1

Slide 1 text

CONCURRENCY IN RUBY 3 JOSEP EGEA [email protected]

Slide 2

Slide 2 text

KICK OFF!!

Slide 3

Slide 3 text

THANKS FOR COMING!!!
Thanks to members of Madrid.rb
Thanks to people from outside of Madrid!
Thanks to our Sponsors!
We ❤ you!

Slide 4

Slide 4 text

WHO'S JOSEP Software developer & Entrepreneur [email protected] https://www.josepegea.com https://github.com/josepegea

Slide 5

Slide 5 text

ABOUT PLATFORM161 / VERVE GROUP Working with Platform161.com / Verve Group And, yes! We're hiring! https://platform161.com https://verve.com

Slide 6

Slide 6 text

ABOUT TODAY

Slide 7

Slide 7 text

THIS IS NOT
Expert advice on Ruby concurrency
In fact, I learned most of this while preparing this talk, so take it with as much salt as your nutritionist would allow
Seriously, I'd like you to try this in production, but if you break something, you take responsibility!!
Most of what we'll see are just ad hoc playground tests

Slide 8

Slide 8 text

EXPECTED TAKEAWAYS
Review concurrency options in Ruby
Test the new concurrency features in Ruby 3
Provide some pragmatic reasons for upgrading
Examine some use cases
Get you closer to Ruby concurrent code

Slide 9

Slide 9 text

GENERAL TOPICS

Slide 10

Slide 10 text

CONCURRENCY VS PARALLELISM
Credits: Stack Overflow - https://stackoverflow.com/questions/1050222/

Slide 11

Slide 11 text

PREEMPTIVE VS COOPERATIVE MULTITASKING
In Cooperative systems, each task gives back control
In Preemptive systems, the OS takes away control from the task
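A minimal sketch of the cooperative side, using only Ruby's core Fiber class. The two workers and the tiny round-robin loop are made up for illustration; the point is that each task hands control back by itself with Fiber.yield, and nothing takes it away.

```ruby
# Two cooperative tasks: each one decides when to hand control back.
log = []

worker_a = Fiber.new do
  log << "A: step 1"
  Fiber.yield          # voluntarily give control back to whoever resumed us
  log << "A: step 2"
  :done                # value returned by the final resume
end

worker_b = Fiber.new do
  log << "B: step 1"
  Fiber.yield
  log << "B: step 2"
  :done
end

# A hand-written round-robin loop: resume every fiber in turn and
# drop the ones whose block has finished.
fibers = [worker_a, worker_b]
until fibers.empty?
  fibers.reject! { |fiber| fiber.resume == :done }
end

puts log
```

If a worker never called Fiber.yield, the other one would not run until it finished. That is the trade-off of cooperative scheduling.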

Slide 12

Slide 12 text

TOOLS AT OUR DISPOSAL
Tool              | Multitasking | Data Sharing      | Problems
Several Computers | Yes          | Only Network      | Complex
OS Processes      | Yes          | IPC/Disk          | Heavy on memory usage
Threads           | Yes          | Everything shared | Race conditions
Fibers            | No           | Everything shared | Manual Scheduling
Event Loops       | No           | Everything shared | Callback hell

Slide 13

Slide 13 text

WHAT IS BETTER???

Slide 14

Slide 14 text

SHOULDN'T PREEMPTIVE BE ALWAYS BETTER?
More resource usage
Explicit coordination can be more efficient than forced takeovers
Synchronization is hard!!
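A small sketch of the kind of synchronization preemptive code needs, using the standard Mutex class. The counter and the thread count are invented for the example. Since a thread can be interrupted between reading and writing the shared value, the update is wrapped in a lock.

```ruby
counter = 0
lock = Mutex.new

threads = 4.times.map do
  Thread.new do
    1_000.times do
      # Only one thread at a time may run this block, so the
      # read-increment-write sequence cannot be interleaved.
      lock.synchronize { counter += 1 }
    end
  end
end

threads.each(&:join)
puts counter
```

Forgetting the lock in just one place is enough to bring the race condition back, which is a big part of why synchronization is hard.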

Slide 15

Slide 15 text

BUT COOPERATIVE IS HARD, TOO
Event loops turn your program upside down (callback hell)
Fibers are more natural, but still, deciding when to yield control is not easy

Slide 16

Slide 16 text

COMMON SCENARIOS
Server dealing with several clients
  Puma, Falcon, Thin
A batch job master that manages several workers
  Sidekiq
Simple tasks that require some concurrency
  Querying several external services from a Rails request

Slide 17

Slide 17 text

CONCURRENCY IN RUBY

Slide 18

Slide 18 text

BEFORE RUBY 3
Processes
Threads
Event loops (through external libs)
Fibers

Slide 19

Slide 19 text

NEW ON RUBY 3
Fiber Scheduler
Ractors

Slide 20

Slide 20 text

FIBER SCHEDULER IN A SENTENCE
Concurrent IO with almost no changes
Instead of having to orchestrate the Fibers yourself, the Scheduler does that for you, automatically, when they get blocked by IO ops.
You need an external scheduler (Ruby 3 only provides the interface for it).
Thanks to Async, now there's one.

Slide 21

Slide 21 text

RACTORS IN A SENTENCE
Threads not blocked by the GIL
Ractors inside a Thread can run in truly parallel fashion, but in exchange they can share data only in certain ways.
They're also an experimental feature, as of today (May'21)
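A short sketch of what "share data only in certain ways" looks like in practice, using Ractor.shareable? and Ractor.make_shareable from core Ruby 3. The variable names are made up. Ractor#take is the Ruby 3.0 way to collect a result; newer Ruby versions replace it with Ractor#value, so the sketch checks which one is available.

```ruby
# Ractor is experimental in Ruby 3.0 and prints a warning when first used.
frozen_text = "hello".freeze
list = [+"a", +"b"]                  # unfrozen array of unfrozen strings

puts Ractor.shareable?(frozen_text)  # deeply frozen objects can be shared
puts Ractor.shareable?(list)         # mutable objects cannot

Ractor.make_shareable(list)          # deep-freezes the array and its elements
puts Ractor.shareable?(list)

# The block cannot see outer local variables; data comes in as arguments.
ractor = Ractor.new(frozen_text) { |text| text.upcase }

# Ruby 3.0 collects the result with #take; later versions use #value.
result = ractor.respond_to?(:value) ? ractor.value : ractor.take
puts result
```

Objects that are not shareable are copied (or moved) when passed into a Ractor, so the two sides never hold the same mutable object.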

Slide 22

Slide 22 text

PLAY TIME!!

Slide 23

Slide 23 text

A SAMPLE JOB
Get 2 pieces of data from external API's
Process the combination
Save the results to disk
Send the results to an external API

Slide 24

Slide 24 text

SEQUENTIAL IMPLEMENTATION
$data = read_text
$replacement = get_replacement
$results = process_data($data, $replacement)
save_results($results)
upload_results($results)

Slide 25

Slide 25 text

MULTITHREADED IMPLEMENTATION
threads = []
threads << Thread.new { $data = read_text }
threads << Thread.new { $replacement = get_replacement }
threads.map(&:join)
$results = process_data($data, $replacement)
threads = []
threads << Thread.new { save_results($results) }
threads << Thread.new { upload_results($results) }
threads.map(&:join)
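A self-contained sketch of the first half of this job, for trying the idea without the real APIs. The bodies of read_text, get_replacement and process_data are hypothetical stand-ins (the talk's real helpers live in its repository), with sleep simulating the network wait. It also uses Thread#value instead of global variables, which is a variation on the slide and not the talk's code.

```ruby
# Hypothetical stand-ins for the talk's helpers; sleep simulates IO waits.
def read_text
  sleep 0.2
  "hello world"
end

def get_replacement
  sleep 0.2
  "ruby"
end

def process_data(data, replacement)
  data.sub("world", replacement)
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# Thread#value joins the thread and returns the block's result,
# so no global variables are needed to collect the data.
data_thread = Thread.new { read_text }
replacement_thread = Thread.new { get_replacement }
results = process_data(data_thread.value, replacement_thread.value)

elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts results
puts format("%.2fs elapsed", elapsed)
```

Because both waits overlap, the elapsed time should typically be close to one sleep and not the sum of the two.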

Slide 26

Slide 26 text

FIBER SCHEDULER IMPLEMENTATION
Async do
  Async { $data = read_text }
  Async { $replacement = get_replacement }
end
$results = process_data($data, $replacement)
Async do
  Async { save_results($results) }
  Async { upload_results($results) }
end

Slide 27

Slide 27 text

SO …
Fiber is almost as concurrent as threads
Fiber is less resource intensive than threads
The Fiber code is remarkably simple!
By definition, Fibers are thread safe
But less concurrent because some calls block:
  DNS lookups
  File operations

Slide 28

Slide 28 text

WHAT ABOUT CPU BOUND TASKS
Threads are truly preemptive, so they should be able to do concurrent CPU bound tasks
But the GIL allows only one thread to run Ruby code at a time
Ractors are here to solve just this
Let's try them out!

Slide 29

Slide 29 text

SIMPLE EXAMPLE
Get data
Run the processing task 5 times
The tasks share no data

Slide 30

Slide 30 text

SEQUENTIAL IMPLEMENTATION
$data = read_text
5.times do |idx|
  process_data($data.dup, 'test')
end

Slide 31

Slide 31 text

FIBER IMPLEMENTATION
$data = read_text
Async do
  5.times do |idx|
    Async do
      process_data($data.dup, 'test')
    end
  end
end

Slide 32

Slide 32 text

THREADS IMPLEMENTATION
$data = read_text
threads = []
5.times do |idx|
  threads << Thread.new do
    process_data($data.dup, 'test')
  end
end
threads.map(&:join)

Slide 33

Slide 33 text

RACTORS IMPLEMENTATION
$data = read_text.freeze
ractors = []
5.times do |idx|
  ractors << Ractor.new(idx, $data) do |i, data|
    process_data(data, 'test')
  end
end
ractors.map(&:take)
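A runnable sketch of the same pattern with a made-up CPU-bound task. Here count_words is a hypothetical stand-in for the talk's process_data, and the sample text is invented. The frozen string is shareable, so every Ractor reads the same object without copying. As the slide does, results are collected with Ractor#take, falling back to Ractor#value on newer Ruby versions where take is no longer available.

```ruby
# Hypothetical CPU-bound stand-in for the talk's process_data.
def count_words(text, word)
  total = 0
  2_000.times { total = text.scan(word).size }  # repeat to burn some CPU
  total
end

# Frozen strings are shareable, so no copy is made per Ractor.
data = ("ruby is fun " * 50).freeze

ractors = 4.times.map do |idx|
  # Everything the block needs is passed in as arguments.
  Ractor.new(idx, data) do |i, text|
    [i, count_words(text, "ruby")]
  end
end

# Ruby 3.0 uses #take to wait for each result; later versions use #value.
results = ractors.map { |r| r.respond_to?(:value) ? r.value : r.take }
p results
```

On a multicore machine the four tasks can run at the same time; with threads, the GIL would make them take turns.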

Slide 34

Slide 34 text

SO …
As expected, Fibers are not useful here
Even Threads, despite preemptiveness, are far from ideal parallelism, because of the GIL
Ractors do deliver the promised parallelism (in a multicore setup)
But (there's always a but…)
  Ractors are still experimental
  Data sharing among Ractors is more constrained

Slide 35

Slide 35 text

THIRD EXAMPLE: PAGING THROUGH AN API
Get data from an API in pages
Process each page
Upload each of the results

Slide 36

Slide 36 text

SEQUENTIAL IMPLEMENTATION
begin
  page_data = get_from_api(page_index: $current_page, page_size:
  $total_pages ||= page_data['total_pages']
  new_data = process_api_data(page_data['data'])
  upload_results(new_data)
  $current_page += 1
end while $current_page <= $total_pages

Slide 37

Slide 37 text

THREADED IMPLEMENTATION
$down_queue = Queue.new
$up_queue = Queue.new
$threads = []
$threads << Thread.new do
  begin
    page_data = get_from_api(page_index: page_index, page_size: p
    $down_queue << page_data['data']
    total_pages ||= page_data['total_pages']
    current_page += 1
  end while current_page <= total_pages
  $down_queue.close
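The slide only shows the start of the download thread. As a rough, self-contained sketch of how such a three-stage pipeline can fit together with two queues: the PAGES hash, the simplified get_from_api signature and the "multiply by 10" processing step are all invented for this example and are not the talk's code.

```ruby
# Hypothetical stand-in for the paged API: 3 pages of numbers.
PAGES = { 1 => [1, 2], 2 => [3, 4], 3 => [5, 6] }.freeze

def get_from_api(page_index:)
  sleep 0.05 # simulate network latency
  { "data" => PAGES.fetch(page_index), "total_pages" => PAGES.size }
end

down_queue = Queue.new
up_queue = Queue.new
uploaded = []

# Stage 1: download pages and push them to the first queue.
downloader = Thread.new do
  current_page = 1
  total_pages = nil
  loop do
    page = get_from_api(page_index: current_page)
    total_pages ||= page["total_pages"]
    down_queue << page["data"]
    current_page += 1
    break if current_page > total_pages
  end
  down_queue.close # pop returns nil once the queue is closed and empty
end

# Stage 2: process each page as soon as it arrives.
processor = Thread.new do
  while (data = down_queue.pop)
    up_queue << data.map { |n| n * 10 }
  end
  up_queue.close
end

# Stage 3: "upload" each result (here we just collect it).
uploader = Thread.new do
  while (result = up_queue.pop)
    uploaded << result
  end
end

[downloader, processor, uploader].each(&:join)
p uploaded
```

Closing each queue is what lets the next stage finish its loop, so no separate "done" signal is needed.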

Slide 38

Slide 38 text

FIBER IMPLEMENTATION
$down_queue = Queue.new
$up_queue = Queue.new
Async do
  Async do
    begin
      page_data = get_from_api(page_index: $page_index, page_size
      $down_queue << page_data['data']
      $total_pages ||= page_data['total_pages']
      $current_page += 1
    end while $current_page <= $total_pages
    $down_queue.close

Slide 39

Slide 39 text

SO…
Again, Fibers are almost as concurrent as Threads
We cannot use Ractors because of data sharing issues with the URI gem

Slide 40

Slide 40 text

RECAP

Slide 41

Slide 41 text

WHAT WE LEARNED…
Ruby has lots of ways to do concurrency!!
The Fiber Scheduler, on its own, makes it worth upgrading to Ruby 3
For general concurrency problems, though, Threads are still the best option
Ractors show some real promise!!
Ruby will never be a language for CPU intensive tasks, but, once they mature, Ractors should make them easier

Slide 42

Slide 42 text

DID WE GET WHAT WE WANTED???
Review concurrency options in Ruby
Test the new concurrency features in Ruby 3
Provide some pragmatic reasons for upgrading
Examine some use cases
Get you closer to Ruby concurrent code
I hope you did, and had some fun, too!!

Slide 43

Slide 43 text

REFERENCES
Ruby Documentation
  https://www.ruby-lang.org/en/news/2020/12/25/ruby-3-0-0-released
  https://github.com/ruby/ruby/blob/master/doc/fiber.md
  https://ruby-doc.org/core-3.0.1/Fiber.html
  https://ruby-doc.org/core-3.0.1/Thread.html
  https://github.com/ruby/ruby/blob/master/doc/ractor.md
  https://ruby-doc.org/core-3.0.1/Ractor.html
Additional Gems
  https://github.com/socketry/async
Great explanations
  Don't Wait For Me! by Samuel Williams
  Ruby 3 and the new Fiber Scheduler Interface, by Wander Hillen
Code used in this talk
  https://github.com/josepegea/async_test

Slide 44

Slide 44 text

BIG THANKS!! Hope you had some fun!! | [email protected] www.josepegea.com