Slide 1

Slide 1 text

Truly Madly Deeply Parallel Ruby (Web)* Applications @harikrishnan83 * Thanks @headius

Slide 2

Slide 2 text

How many of you are building web applications?

Slide 3

Slide 3 text

How many of you have more than 1 user using it at a time?

Slide 4

Slide 4 text

How many of you use scaling out as the way to hit scale?

Slide 5

Slide 5 text

How many of you are scaling up to hit scale?

Slide 6

Slide 6 text

What is a parallel environment?

Slide 7

Slide 7 text

Parallelism

Slide 8

Slide 8 text

Concurrency
● Thread A - load
● Thread B - load
● Thread A - increment
● Thread B - increment
● Thread A - save
● Thread B - save
i = 0, i = 0, i = 1, i = 1
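The load / increment / save sequence on this slide can be replayed in a few lines of Ruby. This is only a sketch: the method names are hypothetical, and the first method replays the schedule by hand (no real threads) so that the lost update shows up every time. The second method assumes a plain `Mutex` is an acceptable way to serialise the three steps.

```ruby
# Replays the slide's schedule by hand so the outcome is deterministic.
def lost_update_by_interleaving
  i = 0    # shared value
  a = i    # Thread A - load
  b = i    # Thread B - load
  a += 1   # Thread A - increment
  b += 1   # Thread B - increment
  i = a    # Thread A - save
  i = b    # Thread B - save (overwrites A's result)
  i
end

# Same work with real threads; the Mutex makes load/increment/save one step.
def increment_with_mutex(thread_count: 2, per_thread: 1_000)
  i = 0
  lock = Mutex.new
  threads = thread_count.times.map do
    Thread.new do
      per_thread.times { lock.synchronize { i += 1 } }
    end
  end
  threads.each(&:join)
  i
end
```

Two increments ran in the first method, yet only one of them survives; the locked version keeps every increment.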

Slide 9

Slide 9 text

Ruby web application deployment

Slide 10

Slide 10 text

Process Parallelism: Reverse Proxy → Unicorn Processes → Database Layer

Slide 11

Slide 11 text

On my current project

Slide 12

Slide 12 text

We only use EC2 small instances

Slide 13

Slide 13 text

Because it is very hard to utilize a high-spec machine. Process context switches are expensive.

Slide 14

Slide 14 text

Today...
● We have 4 small EC2 instances
● 2 Puma processes run on each
● Handles about 100,000 requests per hour
● And this is our private alpha

Slide 15

Slide 15 text

We need to...
● Handle about 1 million requests per hour
● Which means about 40-45 EC2 small instances
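The 40-45 figure follows from the numbers on the previous slide. A rough sketch of the arithmetic, assuming throughput scales linearly with instance count; the method name and the `headroom_percent` parameter are hypothetical, added only to show how a range above 40 might arise.

```ruby
# Estimate how many small instances a target load needs, assuming
# throughput scales linearly with the number of instances.
# Defaults come from the slides: 4 instances handle 100,000 requests/hour.
def small_instances_needed(target_rph, current_instances: 4,
                           current_rph: 100_000, headroom_percent: 0)
  (target_rph * current_instances * (100 + headroom_percent))
    .fdiv(current_rph * 100)
    .ceil
end

exact         = small_instances_needed(1_000_000)                        # 40
with_headroom = small_instances_needed(1_000_000, headroom_percent: 10)  # 44
puts "#{exact} to #{with_headroom} small instances"
```

Each small instance handles about 25,000 requests per hour today, so 1 million requests per hour needs 40 of them before any spare capacity is added.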

Slide 16

Slide 16 text

This is not trivial
● Costs a lot of money
● A lot of time is required to maintain these boxes
● Being elastic will become very important
● Cost also in terms of more DevOps time

Slide 17

Slide 17 text

In general

Slide 18

Slide 18 text

It is easier to babysit a few boxes

Slide 19

Slide 19 text

Than a lot!

Slide 20

Slide 20 text

Ideally, we would like to both scale up and scale out

Slide 21

Slide 21 text

i.e. we want to achieve the same throughput with, say, just 5 large instances

Slide 22

Slide 22 text

Enter thread based parallelism

Slide 23

Slide 23 text

Why were we not doing this till now?

Slide 24

Slide 24 text

Threads are hard*
They share memory and mutate things
* Supposedly

Slide 25

Slide 25 text

And there is the ubiquitous issue ‘Thread Safety’

Slide 26

Slide 26 text

Before we go there, first let's look at some code

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

The real question is...

Slide 29

Slide 29 text

Are you “Safe for Parallelization”?

Slide 30

Slide 30 text

Understanding this will take you a long way in “getting parallel”

Slide 31

Slide 31 text

Things to remember while moving to threaded parallelism

Slide 32

Slide 32 text

#1 - Always identify the shared resources

Slide 33

Slide 33 text

Shared Resource
● Objects
● DB rows
● Caches
● Log files!
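As one illustration of a shared resource from this list, here is a minimal sketch of an in-process cache that several threads touch. `SharedCache` and `warm_cache` are hypothetical names, not part of any library; the sketch assumes a single `Mutex` around the check-and-fill step is good enough.

```ruby
# A cache shared by every thread in the process. The lock covers both the
# lookup and the fill, so the expensive block runs once per key.
class SharedCache
  attr_reader :computations

  def initialize
    @store = {}
    @lock = Mutex.new
    @computations = 0
  end

  def fetch(key)
    @lock.synchronize do
      unless @store.key?(key)
        @computations += 1
        @store[key] = yield
      end
      @store[key]
    end
  end
end

# Several threads ask for the same key at once.
def warm_cache(thread_count: 8)
  cache = SharedCache.new
  threads = thread_count.times.map do
    Thread.new { cache.fetch(:config) { "expensive value" } }
  end
  values = threads.map(&:value) # Thread#value joins and returns the result
  [values.uniq, cache.computations]
end
```

Without the lock, two threads could both miss the key and both run the expensive block; holding one lock across the whole fetch trades some throughput for that guarantee.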

Slide 34

Slide 34 text

#2 - Bank on thread safe libraries

Slide 35

Slide 35 text

Libraries
● Data structures
● JSON, XML parsing, HTTP clients etc.
● Generally, auditing all the gems you use for thread safety is a good idea

Slide 36

Slide 36 text

If you only use thread safe libraries, are you ‘safe for parallelism’?

Slide 37

Slide 37 text

Rails is thread safe, right? Why is everyone concerned about thread safety in the first place?

Slide 38

Slide 38 text

#3 - If two libraries are thread safe, code that uses both of them need not be
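A small sketch of point #3. `SafeStock` is a hypothetical class standing in for one thread-safe library, and Ruby's built-in `Queue` stands in for the other. Each call is safe on its own, but the check-then-act sequence across them is not. The first method replays a bad schedule by hand so the result is deterministic; the second assumes one outer `Mutex` around the whole sequence is an acceptable fix.

```ruby
# Hypothetical thread-safe "library": every method takes the lock.
class SafeStock
  def initialize(count)
    @count = count
    @lock = Mutex.new
  end

  def available?
    @lock.synchronize { @count > 0 }
  end

  def take
    @lock.synchronize { @count -= 1 }
  end

  def count
    @lock.synchronize { @count }
  end
end

# Two requests interleave between "check" and "act" (replayed by hand).
def oversell_by_interleaving
  stock = SafeStock.new(1)
  orders = Queue.new            # thread-safe queue from Ruby core
  a_ok = stock.available?       # request A checks
  b_ok = stock.available?       # request B checks before A acts
  if a_ok
    stock.take
    orders << :a
  end
  if b_ok
    stock.take
    orders << :b
  end
  [stock.count, orders.size]
end

# Same sequence with real threads, made atomic by one outer lock.
def sell_with_outer_lock(buyers)
  stock = SafeStock.new(1)
  orders = Queue.new
  checkout = Mutex.new
  threads = buyers.map do |buyer|
    Thread.new do
      checkout.synchronize do
        if stock.available?
          stock.take
          orders << buyer
        end
      end
    end
  end
  threads.each(&:join)
  [stock.count, orders.size]
end
```

In the first method one item ends up sold twice and the stock goes negative, even though no individual call was unsafe.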

Slide 39

Slide 39 text

Rails thread safety model
● Instantiate everything for every request
● No shared state (global objects)
● Different from, say, Java (single servlet object per container, IoC with singletons etc.)
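The difference between shared state and per-request instantiation can be sketched in plain Ruby. Both handler classes below are hypothetical and are not Rails APIs; the interleaving is replayed by hand so the leak is reproducible.

```ruby
# Anti-pattern: request data kept in class-level (shared) state.
class GlobalStateHandler
  class << self
    attr_accessor :current_user
  end

  def start(user)
    self.class.current_user = user
  end

  def finish
    "Hello, #{self.class.current_user}"
  end
end

# Per-request model: a fresh object per request, state in instance variables.
class PerRequestHandler
  def initialize(user)
    @user = user
  end

  def finish
    "Hello, #{@user}"
  end
end

# Request B starts before request A finishes (replayed by hand).
def shared_state_leak
  a = GlobalStateHandler.new
  b = GlobalStateHandler.new
  a.start("alice")
  b.start("bob")
  a.finish # alice's request now sees bob
end

def per_request_isolation
  a = PerRequestHandler.new("alice")
  PerRequestHandler.new("bob") # a second request changes nothing for a
  a.finish
end
```

With class-level state, one request can see another request's data; with one object per request there is nothing to share, which is the model the slide describes.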

Slide 40

Slide 40 text

#4 - Try and stick to Rails’ way of handling requests

Slide 41

Slide 41 text

Are you ‘Safe for parallelism’ if you follow these steps?

Slide 42

Slide 42 text

Well, it depends...

Slide 43

Slide 43 text

Validating, say, through a green bar is very hard.

Slide 44

Slide 44 text

Always give yourself some time to stabilize. The move is definitely not overnight!

Slide 45

Slide 45 text

Speaking of the move, move where?

Slide 46

Slide 46 text

Since Rubinius is mostly MRI-like, it's simpler

Slide 47

Slide 47 text

I personally love JRuby more because of my JVM background

Slide 48

Slide 48 text

Lots of good things have been said about JRuby

Slide 49

Slide 49 text

Some gotchas based on my experience

Slide 50

Slide 50 text

JRuby impacts Developers
● The JRuby startup time (mostly because of the JVM startup time) can sometimes kill red-green cycle time
● Sometimes, you should be OK with stooping down to Java code to figure out why something is not working

Slide 51

Slide 51 text

JRuby impacts Ops
● You no longer have a Ruby app in prod, it's a Java app
● GC tuning, process monitoring, profiling etc. are very different on a JVM

Slide 52

Slide 52 text

Thread Parallelism: Reverse Proxy → Puma Instance (Threads) → Database Layer
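The shape of this deployment, one process with many threads pulling requests, can be sketched with a plain Ruby thread pool. This is not how Puma is implemented; `serve_with_threads` is a hypothetical name, and the sketch assumes Ruby 2.3 or later for `Queue#close`.

```ruby
# One process, several worker threads sharing a single request queue.
def serve_with_threads(requests, thread_count: 4)
  inbox = Queue.new
  requests.each { |request| inbox << request }
  inbox.close # once drained, pop returns nil and the workers stop

  responses = Queue.new
  workers = thread_count.times.map do
    Thread.new do
      while (request = inbox.pop)
        responses << "handled #{request}"
      end
    end
  end
  workers.each(&:join)

  # Completion order depends on scheduling, so sort for a stable result.
  Array.new(responses.size) { responses.pop }.sort
end

puts serve_with_threads([1, 2, 3])
```

On MRI the global lock keeps these threads from running Ruby code at the same instant; on JRuby or Rubinius the same code can use several cores at once.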

Slide 53

Slide 53 text

Thank you! @harikrishnan83