Slide 1

Handling requests in parallel in Rails applications: an introduction
Florent Guilleux
Ruby Peru meeting, 2013-08-01

Slide 2

How requests are handled on a multi-core server with a non-concurrent Ruby web server (Thin in its default configuration, for example):

[diagram: core #1 processes request #1, responds, then processes request #2 and responds; cores #2, #3 and #4 stay idle]

Slide 3

Only 1 request is processed at a time, while 3 out of 4 cores sit unused!

Slide 4

With a concurrent Ruby web server (Unicorn, Puma...):

[diagram: request #1 is processed on core #1 while request #2 is processed on core #2 at the same time; the other cores are available for further requests]

Slide 5

With a concurrent Ruby web server, all the cores of your server can be used in parallel. Thus your application can process several requests in parallel on a single server.

Slide 6

endpoint bound by    CPU                 IO
Thin                 ~ 1 request / sec   ~ 1 request / sec
Puma                 ~ 1 request / sec   ~ 4 requests / sec

For the CPU-bound endpoint, Puma does not leverage the 4 cores of the server. Absolute values have no importance here; we're only interested in relative differences. See https://github.com/Florent2/rails-parallelism-demo for the benchmark code.

Slide 7

CPU bound: code execution is limited by the CPU capabilities of the server (computing digits of pi, video processing...)
IO bound: code execution is limited by the IO capabilities of the server (reading data from disk, requesting over a network...)
Also: memory bound, cache bound
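The two kinds of work can be sketched in plain Ruby. The method names and parameters below are illustrative, not taken from the benchmark code:

```ruby
require "digest"

# CPU-bound work: the limiting factor is raw computation.
# Repeated SHA-256 hashing keeps one core fully busy.
def cpu_bound_work(rounds = 50_000)
  digest = "seed"
  rounds.times { digest = Digest::SHA256.hexdigest(digest) }
  digest
end

# IO-bound work: the limiting factor is waiting on IO.
# sleep stands in for a disk read or a network call; the CPU is idle meanwhile.
def io_bound_work(seconds = 0.1)
  sleep(seconds)
  :done
end
```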

Slide 8

Why was Puma not handling requests in parallel for the CPU-bound endpoint? We were using MRI (the default Ruby implementation). MRI has the GIL (Global Interpreter Lock): an MRI process cannot run threads in parallel, except when they are blocked on IO.
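What "except when blocked on IO" means can be sketched with a small timing experiment (illustrative, not part of the talk's benchmark): a thread blocked on IO releases the GIL, so IO waits overlap even though CPU work cannot.

```ruby
# Four threads each sleep 0.2s. Sleeping threads release the GIL, so the
# waits overlap: total elapsed time is ~0.2s, not 4 x 0.2 = 0.8s.
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
threads = 4.times.map { Thread.new { sleep 0.2 } }
threads.each(&:join)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
```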

Slide 9

Unicorn can process requests in parallel because it runs several Ruby processes, instead of using threads like Puma.

endpoint bound by        CPU                 IO
Thin                     ~ 1 request / sec   ~ 1 request / sec
Unicorn with 4 workers   ~ 4 requests / sec  ~ 4 requests / sec
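The prefork model Unicorn uses can be sketched in a toy form (this is not Unicorn's actual code, and it assumes a Unix-like system where fork is available): the master opens a listening socket, then forks workers that all accept on it, so the kernel distributes incoming connections across processes.

```ruby
require "socket"

# Master process opens the shared listening socket (port 0 = any free port).
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

# Fork two workers; each inherits the socket and accepts connections on it.
workers = 2.times.map do
  fork do
    loop do
      client = server.accept
      client.puts "handled by worker #{Process.pid}"
      client.close
    end
  end
end

# A client request is served by whichever worker accepts it first.
reply = TCPSocket.open("127.0.0.1", port) { |socket| socket.read }

# Shut the toy workers down.
workers.each { |pid| Process.kill("TERM", pid) }
Process.waitall
```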

Slide 10

What is the difference between processes and threads?

[diagram: Process #1, Process #2 and Process #3 each have their own copy of the application compiled source code and their own memory; Thread #1, Thread #2 and Thread #3 live inside a single process and share its compiled source code and memory]

Slide 11

Threads are cheaper than processes:
* less overhead to spawn
* memory is shared between threads
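The memory-sharing difference can be demonstrated directly (this sketch assumes a Unix-like system where Kernel#fork is available):

```ruby
value = "original"

# A forked child process gets its own copy of memory:
# its write is invisible to the parent.
pid = fork do
  value = "changed in child"
  exit!(0)  # skip at_exit handlers in the child
end
Process.wait(pid)
after_fork = value    # still "original"

# A thread shares the parent's memory: its write is visible.
Thread.new { value = "changed in thread" }.join
after_thread = value  # "changed in thread"
```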

Slide 12

To run Ruby threads in parallel, you need a Ruby implementation without a global lock: Rubinius, JRuby.

endpoint bound by (with Puma)   CPU                 IO
MRI                             ~ 1 request / sec   ~ 4 requests / sec
Rubinius                        ~ 4 requests / sec  ~ 4 requests / sec

Slide 13

But as threads share memory, two threads can operate on the same shared state at the same time, sometimes producing incorrect data.
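A minimal sketch of such a lost update (the sleep between the read and the write forces the unlucky interleaving that would otherwise be rare):

```ruby
counter = 0
threads = 5.times.map do
  Thread.new do
    current = counter      # 1. read the shared state
    sleep 0.05             # 2. yield, so the other threads also read the old value
    counter = current + 1  # 3. write back a now-stale result
  end
end
threads.each(&:join)
racy_result = counter      # less than 5: some increments were lost

# Wrapping the read-modify-write in a Mutex makes it atomic again.
counter = 0
lock = Mutex.new
threads = 5.times.map do
  Thread.new do
    lock.synchronize do
      current = counter
      sleep 0.01
      counter = current + 1
    end
  end
end
threads.each(&:join)
safe_result = counter      # exactly 5
```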

Slide 14

[image: race condition example from Wikipedia]

Slide 16

JRuby and Rubinius protect their internals from race conditions without a global lock, by using fine-grained locks. They can run threads in parallel. But they do not protect you from thread-safety issues in your own code, or in the gems you use (Rails itself is thread-safe by default since version 4).

Slide 17

In conclusion, to handle requests in parallel with Rails, you need:

1. A server spawning processes, like Unicorn or Passenger
   -> more memory consumed, but no thread-safety concerns

2. Or a server using threads (Puma, Passenger 4...)
   ● with a Ruby implementation without a global lock (Rubinius, JRuby)
   ● or MRI, which can be sufficient if your application is very IO bound
   -> better performance, but your application (own code, gems...) needs to be written thread safe
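As an illustration, a Puma setup combining both approaches (worker processes plus threads) might look like this in config/puma.rb; the numbers are placeholders to tune for your own application, not recommendations:

```ruby
# config/puma.rb -- a hedged sketch, not a production-ready configuration.

# One worker process per core: parallelism even under MRI's GIL.
workers 4

# Threads within each worker: cheap concurrency for IO-bound requests.
# Your code and gems must be thread safe for this to be correct.
threads 1, 16

# Load the application before forking so workers share memory via copy-on-write.
preload_app!
```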

Slide 18

Highly recommended book about threads in Ruby: Working With Ruby Threads