Handling requests in parallel in Rails applications, an introduction

Florent Guilleux
Ruby Peru, 2013-08-01 meeting

How requests are handled on a multi-core server with a non-concurrent Ruby web server (Thin by default, for example)

[Diagram: core #1 receives request #1, processes it and sends the response, then does the same for request #2. Cores #2, #3 and #4 stay idle.]

Only 1 request processed at a time, while 3 out
of 4 cores are unused!

With a concurrent Ruby web server (Unicorn, Puma...)

[Diagram: cores #1 to #4, with request #1 and request #2 each being received, processed and answered.]

With a concurrent Ruby web server, all the cores of
your server can be used in parallel. Thus your application can process several requests in parallel on a single server.

Endpoint bound by    CPU                  IO
Thin                 ~ 1 request / sec    ~ 1 request / sec
Puma                 ~ 1 request / sec    ~ 4 requests / sec

For the CPU bound endpoint, Puma does not leverage the 4 cores of the server.
Absolute values do not matter here; we are only interested in the relative differences.
See https://github.com/Florent2/rails-parallelism-demo for the benchmark code.

CPU bound: code execution is bound by the CPU capabilities of the server (computing digits of pi, video processing...)
IO bound: code execution is bound by the IO capabilities of the server (processing data from disk, requesting over a network...)
Also: memory bound, cache bound

Why was Puma not handling requests in parallel for the CPU bound endpoint?
Because we were using MRI (the default Ruby implementation). MRI has the GIL (Global Interpreter Lock): an MRI process cannot run threads in parallel, except while they are blocked on IO.
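The "except on blocking IO" part can be observed with a small sketch. It assumes that `sleep` is an acceptable stand-in for a blocking IO call (MRI releases the lock while a thread sleeps, as it does while a thread waits on a socket or a disk). The helper name `timed` and the durations are illustrative only and are not taken from the benchmark code.

```ruby
# Helper (hypothetical name): return the wall-clock seconds taken by the block.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

WAIT = 0.2 # seconds each simulated IO call blocks

# Four simulated IO calls, one after the other: about 4 * WAIT.
sequential = timed { 4.times { sleep WAIT } }

# The same four calls in four threads: about 1 * WAIT, even on MRI,
# because a thread blocked on IO (here: sleep) releases the global lock.
threaded = timed do
  4.times.map { Thread.new { sleep WAIT } }.each(&:join)
end

puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
```

Replacing `sleep WAIT` with a pure Ruby computation would, on MRI, make the threaded version take about as long as the sequential one, which matches the CPU bound column of the table.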

Unicorn can process requests in parallel because it runs several Ruby processes, instead of using threads like Puma.

Endpoint bound by        CPU                   IO
Thin                     ~ 1 request / sec     ~ 1 request / sec
Unicorn with 4 workers   ~ 4 requests / sec    ~ 4 requests / sec
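The process model can be sketched with plain `fork`. This is not Unicorn's actual implementation, only an illustration of the idea: a parent starts several worker processes, each one a separate Ruby interpreter with its own lock, so CPU bound work can run on several cores at once. It assumes a Unix-like system (`fork` is not available on Windows), and `cpu_work` is a hypothetical stand-in for a CPU bound request.

```ruby
# Hypothetical CPU bound "request": sum of squares computed in pure Ruby.
def cpu_work(n)
  (1..n).reduce(0) { |sum, i| sum + i * i }
end

WORKERS = 4

# Start one child process per worker; each sends its result back over a pipe.
children = WORKERS.times.map do |index|
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(cpu_work(200_000 + index).to_s)
    writer.close
  end
  writer.close # the parent only reads from this pipe
  [pid, reader]
end

# Collect the results and wait for every child to exit.
results = children.map do |pid, reader|
  value = reader.read.to_i
  reader.close
  Process.wait(pid)
  value
end

puts results.inspect
```

Each child holds its own copy of the program in memory, which is the memory cost of this approach.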

What is the difference between processes and threads?

[Diagram: processes #1, #2 and #3 each hold their own copy of the application's compiled source code and their own memory; threads #1, #2 and #3 share a single copy of the compiled source code and of the memory.]

Threads are cheaper than processes:
* less overhead to spawn
* memory is shared between threads
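The memory sharing point can be checked with a minimal sketch, assuming a Unix-like system where `fork` is available. The variable names are illustrative.

```ruby
counter = 0

# A thread runs inside the same process and writes to the same variable.
Thread.new { counter += 1 }.join
after_thread = counter

# A forked process works on its own copy; the parent's variable is untouched.
pid = fork { counter += 100 }
Process.wait(pid)
after_fork = counter

puts "after thread: #{after_thread}, after fork: #{after_fork}"
# prints "after thread: 1, after fork: 1"
```

The thread's write is visible to the main thread, while the child's write is lost when the child exits.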

To run any Ruby threads in parallel, you need to use a Ruby implementation without a global lock: Rubinius, JRuby

Endpoint bound by (with Puma)   CPU                   IO
MRI                             ~ 1 request / sec     ~ 4 requests / sec
Rubinius                        ~ 4 requests / sec    ~ 4 requests / sec

But as threads share memory, two threads can operate on the same shared state at the same time, which sometimes leads to incorrect data (a race condition).

[Figure: race condition example from Wikipedia]
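A race condition of this kind can be reproduced with a minimal sketch. This is not the Wikipedia example itself. The `sleep` between the read and the write is an artificial pause added here to make the bad interleaving very likely, and the method names are hypothetical.

```ruby
# Unsafe: read, pause, write. Both threads usually read the same starting value.
def unsafe_increments
  balance = 0
  threads = 2.times.map do
    Thread.new do
      current = balance      # read the shared state
      sleep 0.05             # artificial pause so the other thread reads too
      balance = current + 1  # write back a value based on a stale read
    end
  end
  threads.each(&:join)
  balance
end

# Safe: the mutex turns the read and the write into one indivisible step.
def safe_increments
  balance = 0
  lock = Mutex.new
  threads = 2.times.map do
    Thread.new do
      lock.synchronize do
        current = balance
        sleep 0.05
        balance = current + 1
      end
    end
  end
  threads.each(&:join)
  balance
end

puts "without lock: #{unsafe_increments}, with lock: #{safe_increments}"
```

Without the lock one increment is usually lost, so the result is typically 1 instead of 2. With the lock the result is always 2, at the cost of the two threads no longer overlapping in that section.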

JRuby and Rubinius protect their internals from race conditions with fine-grained locks, without a global lock. They can run threads in parallel.
But they do not protect you from thread-safety issues in your own code or in the gems you use (Rails itself is thread safe by default since version 4).

In conclusion, to handle requests in parallel with Rails, you need:

1. A server spawning processes, like Unicorn or Passenger
   -> more memory consumed, but no thread safety concerns
2. Or a server using threads (Puma, Passenger 4...)
   • with a Ruby implementation without a global lock (Rubinius, JRuby)
   • or MRI, which can be sufficient if your application is very IO bound
   -> better performance, but your application (own code, gems...) needs to be thread safe

Highly recommended book about threads in Ruby: Working With Ruby Threads
