Erlang and Akka Actors: A story of tradeoffs
Pranav Rao
Nov 17th 2017
LinkedIn/Twitter: @pnpranarao

Relevant Experience
* Led a team of 8 engineers working on a bunch of communication-oriented Elixir/Erlang applications.
* I now write Scala/Spark jobs to make sense of the Amazon Selection and Catalog.

Structure
* Quick summary of the Actor model
* How Erlang and Akka actors differ in:
  * Messaging and Mailboxes
  * Scheduling
  * Behavior
  * Garbage Collection

Part 1: Quick Summary of Actor Model

Motivation
* Moore's law no longer holds
* We want to exploit concurrency and parallelism: multicore x multinode
* "The world is simply not sequential, and we want to model it."

What do we have so far?
* The OS exposes primitives: processes and threads.
* Languages add conveniences on top of these, like thread pools.
* To ensure data integrity, we have locks and CAS (compare-and-swap) instructions.
* Are these good enough for modern programming?

If not, what do we look for?
* Throughput
* Latency
* Data integrity
* Explainability

Actor Model - Some theory
What's an actor?
* Can only communicate by message passing
* Can have private state
* Can spawn other actors

Actor Model - Intuition
Dining Philosophers:
* Thinking is a concurrent/parallel activity
* Contention has been isolated to actor mailboxes

Part 2: How Erlang and Akka actors differ

Messaging and Mailboxes

Messages
Copy or send by reference?

Messages: Send by Copy
Pros:
* Simpler process isolation
* Required for per-process GC
* Allocation is a solved problem
* Multi-node synchronisation
Cons:
* Memory consumption
* Time cost of allocation

Messages: Send by Reference
Pros:
* Far lower memory consumption
* Faster
Cons:
* Garbage collection will have to look at all actors
* Multi-node synchronisation

Messages on Erlang
* Copied and sent
* Everything is immutable and can be pattern matched, so this is straightforward

Messages on Akka
* Object references are passed by value, so the receiver sees the same object as the sender. Primitive types are copied.
* Messages can be anything
* No enforcement of immutability
* Convention is to send immutable case class objects, which lend themselves to pattern matching.
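To see why the immutability convention matters when messages are shared by reference, here is a minimal sketch in plain Scala. It does not use Akka: the "mailbox" is just a buffer that stores whatever reference it is handed, and `MutableOrder`, `ImmutableOrder` and `deliver` are hypothetical names made up for this illustration.

```scala
import scala.collection.mutable.ArrayBuffer

object MessageSharing {
  // Hypothetical message types, for illustration only.
  final case class ImmutableOrder(items: List[String])
  final class MutableOrder(val items: ArrayBuffer[String])

  // Stand-in for a mailbox: it stores the reference it is given, no copy is made.
  def deliver[A](mailbox: ArrayBuffer[A], message: A): Unit = mailbox += message

  // Returns the item counts seen by the "receiver" of each kind of message.
  def demo(): (Int, Int) = {
    val mutableBox = ArrayBuffer.empty[MutableOrder]
    val order = new MutableOrder(ArrayBuffer("fork"))
    deliver(mutableBox, order)
    order.items += "spoon" // the sender mutates the message after "sending" it

    val immutableBox = ArrayBuffer.empty[ImmutableOrder]
    val safe = ImmutableOrder(List("fork"))
    deliver(immutableBox, safe)
    val changed = safe.copy(items = safe.items :+ "spoon") // a new object; the queued one is untouched

    (mutableBox.head.items.size, immutableBox.head.items.size)
  }
}
```

The queued mutable message changes under the receiver's feet, while the case class version cannot.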

Mailboxes
* Mostly one per actor
* Point of contention: enqueue and dequeue need to be efficient
* Can be thought of as an n producers : 1 consumer problem
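The n producers : 1 consumer shape can be sketched with a standard concurrent queue. This is a plain-Scala illustration, not Akka's mailbox code; `MailboxSketch` and `run` are hypothetical names, and the consumer here only starts draining after the producers finish, to keep the example deterministic.

```scala
import java.util.concurrent.ConcurrentLinkedQueue

object MailboxSketch {
  // n producer threads enqueue concurrently; a single consumer drains the queue.
  def run(producers: Int, messagesEach: Int): Int = {
    val mailbox = new ConcurrentLinkedQueue[String]()
    val threads = (1 to producers).map { p =>
      new Thread(() => (1 to messagesEach).foreach(i => mailbox.offer(s"p$p-m$i")))
    }
    threads.foreach(_.start())
    threads.foreach(_.join())

    // The single consumer: poll() returns null once the queue is empty.
    var received = 0
    while (mailbox.poll() != null) received += 1
    received
  }
}
```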

Erlang Mailboxes
* Are unbounded, so can cause OOM kills unless the producers are actively throttled.
* Support selective receive idiomatically.
* While selective receive is very expressive for a lot of problems, dequeues become expensive as the number of unhandled messages increases.
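The cost of selective receive can be sketched as a scan over the mailbox in arrival order. This is a simplified model written in Scala for consistency with the Akka examples, not how the BEAM is implemented; `SelectiveReceiveSketch` and `receive` are hypothetical names.

```scala
import scala.collection.mutable.ArrayBuffer

object SelectiveReceiveSketch {
  // Scans the mailbox in arrival order and removes the first matching message.
  // Returns the match (if any) and how many messages had to be inspected.
  def receive[A](mailbox: ArrayBuffer[A])(matches: A => Boolean): (Option[A], Int) = {
    val idx = mailbox.indexWhere(matches)
    if (idx < 0) (None, mailbox.size)
    else (Some(mailbox.remove(idx)), idx + 1)
  }
}
```

With 1000 unhandled messages queued ahead of the one we want, a single receive has to inspect 1001 entries, and the unmatched ones stay in the mailbox to be scanned again next time.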

Akka mailboxes
* We know the characteristics expected of a mailbox, so Akka picked the best tool fitting the requirements in Java.
* Mailboxes are simple queues backed by a couple of standard ones defined in java.util.concurrent.*
* There are settings on top of these queues that can be set separately from the code.

Configurations:
* Bounded, Unbounded, Priority
* Dead-letter box
* Capacity, Timeout
Queues:
* java.util.concurrent.ConcurrentLinkedQueue
* java.util.concurrent.LinkedBlockingQueue

    bounded-mailbox {
      mailbox-type = "akka.dispatch.BoundedMailbox"
      mailbox-capacity = 1000
      mailbox-push-timeout-time = 10s
    }
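What capacity and push timeout mean can be sketched directly on `java.util.concurrent.LinkedBlockingQueue`, one of the queues listed above. This is an illustration of the semantics, not Akka's implementation; `BoundedMailboxSketch` and `push` are hypothetical names, and the capacity of 2 and the 10 ms timeout are chosen only to keep the demo small.

```scala
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}

object BoundedMailboxSketch {
  // Returns true if the message was accepted, false if the mailbox
  // stayed full for the whole push timeout.
  def push[A](mailbox: LinkedBlockingQueue[A], msg: A, timeoutMs: Long): Boolean =
    mailbox.offer(msg, timeoutMs, TimeUnit.MILLISECONDS)

  def demo(): (Boolean, Boolean, Boolean) = {
    val mailbox = new LinkedBlockingQueue[String](2) // capacity 2 (assumed for the demo)
    val first = push(mailbox, "a", 10)
    val second = push(mailbox, "b", 10)
    val third = push(mailbox, "c", 10) // mailbox is full and nobody consumes: times out
    (first, second, third)
  }
}
```

In a real system a rejected message has to go somewhere, which is where the dead-letter box from the previous slide comes in.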

Is selective receive possible in Akka?
* Yes, by manually managing the mailbox or by a combination of stash and become.

    // The actor must mix in akka.actor.Stash for stash() and unstashAll().
    def receive = {
      case Status(statusCode) =>
        context.become(loop(statusCode))
        unstashAll()
      case Message(message) =>
        stash()
    }

    // statusCode's type is assumed to be Int; the slide does not show Status's definition.
    def loop(statusCode: Int): Receive = {
      case Status(newStatusCode) =>
        context.become(loop(newStatusCode))
      case Message(message) =>
        sender ! (message + statusCode)
    }

Actors

Before we get into scheduling, let's quickly see how actors are actually defined in both systems.
* In Akka, a class needs to inherit from the Actor trait and define some properties (at minimum, receive) to be able to be instantiated as an actor.
* In Erlang (Elixir), any function closure can be spawned as an actor. But I'll use a GenServer here for comparison.

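Akka philosopher

    import akka.actor.{Actor, ActorRef}

    // eat() and think() are assumed to be defined elsewhere in the class.
    class Philosopher(arbitrator: ActorRef, name: String) extends Actor {
      override def receive: Receive = {
        case Forks =>
          eat()
          sender ! DoneEating
          think()
        case NoForks =>
          think()
      }
    }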

Elixir philosopher

    defmodule Philosopher do
      use GenServer

      # handle_cast/2 is not given the sender, so the caller's pid travels in the message.
      def handle_cast({:forks, from}, state) do
        eat()
        send(from, :done_eating)
        think()
        {:noreply, state}
      end

      def handle_cast(:no_forks, state) do
        think()
        {:noreply, state}
      end
    end

Scheduling

So we covered messages and mailboxes. Scheduling is the natural problem to be solved next. Why?
* Messages define control flow.
* Messages can be thought of as stimuli.
* When there's a message for an actor, the actor needs to be scheduled ASAP.

Implications:
* Probably the strongest differentiator in terms of performance
* By clever scheduling, we are trying to optimize for both:
  * Throughput
  * Latency
* Both systems have to build on the primitives provided by the OS.

Scheduling in Akka
Innovative use of Java primitives. Let's see how.
Dispatcher:
* This is responsible for scheduling actors. It basically decides which actor (along with its message queue) will get CPU time.
* Configured at the ActorSystem level.
* The Dispatcher can be overridden at the actor level.

Dispatcher
What does Akka have to work with?
* java.util.concurrent.* functionality
What is needed by the Java concurrent utilities?
* A set of tasks (Runnables) to run
* A pool of threads to run them on (ThreadPool)
* Efficient load balancing of these tasks across threads
So Akka has to convert the Actor paradigm that it exposes into tasks and thread pools.

Dispatcher: How does it work?
* Because messages are the drivers of program state, Akka models mailboxes as Runnables.
* An actor, at its core, encapsulates behavior and state.
* So Akka brings the actor definition to the mailbox and runs receive by applying the current state and dequeuing one message at a time.

So sending a message to an actor involves:
* appending to its mailbox
* putting the mailbox onto the task queue
And the thread pool (ExecutorService) should take care of the rest. Right?
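The two steps above (append to the mailbox, put the mailbox on the task queue) can be sketched with nothing but `java.util.concurrent`. This is a simplified model and not Akka's Dispatcher code: `Mailbox`, `send` and `demo` are hypothetical names, and the `throughput` parameter here only imitates the idea of capping how many messages one run may process before the thread is given back.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, ExecutorService, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

object DispatcherSketch {
  // A mailbox that is also the unit of scheduling: the pool runs the mailbox, not the actor.
  final class Mailbox[A](handler: A => Unit, pool: ExecutorService, throughput: Int) extends Runnable {
    private val queue = new ConcurrentLinkedQueue[A]()
    private val scheduled = new AtomicBoolean(false)

    def send(msg: A): Unit = {
      queue.offer(msg) // step 1: append to the mailbox
      schedule()       // step 2: put the mailbox onto the task queue
    }

    // The flag guarantees at most one run() is in flight, so the handler is single-threaded.
    private def schedule(): Unit =
      if (scheduled.compareAndSet(false, true)) pool.execute(this)

    override def run(): Unit = {
      var processed = 0
      while (processed < throughput && !queue.isEmpty) {
        handler(queue.poll())
        processed += 1
      }
      scheduled.set(false)
      if (!queue.isEmpty) schedule() // leftovers: go back to the end of the task queue
    }
  }

  def demo(messages: Int): Int = {
    val pool = Executors.newFixedThreadPool(4)
    val done = new CountDownLatch(messages)
    var count = 0 // "actor state": only ever touched inside the handler
    val actor = new Mailbox[Int](n => { count += n; done.countDown() }, pool, throughput = 5)
    (1 to messages).foreach(_ => actor.send(1))
    done.await(5, TimeUnit.SECONDS)
    pool.shutdown()
    count
  }
}
```

Note that the handler mutates `count` without locks: the scheduling flag, not the state, is what is synchronized.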

Let's go back to the Akka Philosopher

    class Philosopher(arbitrator: ActorRef, name: String) extends Actor {
      override def receive: Receive = {
        case Forks =>
          eat()
          sender ! DoneEating
          think() // FileIO + CPU hogging + NetworkIO
        case NoForks =>
          think() // FileIO + CPU hogging + NetworkIO
      }
    }

Blocking calls make this model almost impossible to scale.

Throw more threads?
* Will block after a while.
* Other actors are starved. No message is being processed.
* Even otherwise, latencies go for a toss. One occasional blocking call can take down a thread in the thread pool, which hits tail latencies hard.
* Context switching costs
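The starvation effect is easy to reproduce on a plain fixed thread pool. The sketch below is not Akka code: it parks both threads of a 2-thread pool on a latch (standing in for blocking calls) and then checks whether a trivial task gets to run. `StarvationSketch` is a hypothetical name and the 200 ms wait is an arbitrary choice for the demo.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

object StarvationSketch {
  // Returns (did the quick task run while the pool was blocked?, did it run after release?)
  def demo(): (Boolean, Boolean) = {
    val pool = Executors.newFixedThreadPool(2)
    val started = new CountDownLatch(2)
    val release = new CountDownLatch(1)

    // Two tasks make a "blocking call" and occupy both pool threads.
    (1 to 2).foreach { _ =>
      pool.execute(() => { started.countDown(); release.await() })
    }
    started.await() // both threads are now parked

    // A quick task that needs almost no CPU still cannot get a thread.
    val quickDone = new CountDownLatch(1)
    pool.execute(() => quickDone.countDown())
    val ranWhileBlocked = quickDone.await(200, TimeUnit.MILLISECONDS)

    release.countDown() // the blocking calls return
    val ranAfterRelease = quickDone.await(5, TimeUnit.SECONDS)
    pool.shutdown()
    (ranWhileBlocked, ranAfterRelease)
  }
}
```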

Solution 1
* Segregate actors into blocking and non-blocking. Allot them separate Dispatchers.
* Requires a lot of tuning
* Doesn't tackle the common case of occasional, quick blocking calls in actors.

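Solution 2
Futures

    import scala.concurrent.Future
    import scala.util.{Failure, Success}
    import context.dispatcher // the actor's Dispatcher serves as the ExecutionContext

    def think() = {
      val res: Future[String] = for {
        page     <- Future { readFile() }
        insights <- Future { processString(page) }
        result   <- sendMail(insights) // assumed to return a Future[String]
      } yield result

      res.onComplete {
        case Success(result)  => arbitrator ! Hungry
        case Failure(failure) => // handle this
      }
    }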

Futures
* Futures can be composed: async computations, expressed sequentially.
* A Future always completes with a result: the computation succeeds, fails, or times out (where a timeout applies).
* These are executed in the Dispatcher's ExecutionContext.
* Now that our receive task is being split into subtasks, how should our thread pool behave optimally?

If the thread pool was just
1. a task queue
2. threads
then tasks that generated subtasks could get queued up on the same thread, leaving other threads idle.

ForkJoinPool
* The main difference from a plain thread pool is that its threads are capable of work-stealing.
* Perfect for short-running tasks: all threads (and hence all cores) should be kept busy.
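A task that splits itself into subtasks is exactly what `java.util.concurrent.ForkJoinPool` is built for. The sketch below uses the standard `RecursiveTask` API to sum an array; `ForkJoinSketch` and `SumTask` are hypothetical names, and the threshold of 1000 and parallelism of 4 are arbitrary choices for the illustration.

```scala
import java.util.concurrent.{ForkJoinPool, RecursiveTask}

object ForkJoinSketch {
  // Sums data(from) until data(until - 1), splitting large ranges into two subtasks.
  final class SumTask(data: Array[Long], from: Int, until: Int) extends RecursiveTask[Long] {
    override def compute(): Long =
      if (until - from <= 1000) {
        var i = from
        var acc = 0L
        while (i < until) { acc += data(i); i += 1 }
        acc
      } else {
        val mid = from + (until - from) / 2
        val left = new SumTask(data, from, mid)
        val right = new SumTask(data, mid, until)
        left.fork()                   // queued on this worker; idle workers may steal it
        right.compute() + left.join() // keep working on the other half meanwhile
      }
  }

  def sum(data: Array[Long]): Long = {
    val pool = new ForkJoinPool(4)
    try pool.invoke(new SumTask(data, 0, data.length))
    finally pool.shutdown()
  }
}
```

Forked subtasks land on the forking worker's own queue, and stealing is what spreads them to the other threads.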

Dispatchers are configurable
* Thread pool to use: fork-join-executor, thread-pool-executor, affinity-pool-executor
* Pinned Dispatcher
  * Creates 1 thread per actor
  * Driven by thread-pool-executor

    configuration {
      default-dispatcher {
        type = Dispatcher
        executor = "thread-pool-executor"
        thread-pool-executor {
          core-pool-size-min = 3
          core-pool-size-factor = 2.0
          core-pool-size-max = 6
        }
        mailbox-type = ""
        mailbox-capacity = -1
        throughput = 100 # tunable
      }
    }

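Futures gotchas
* sender()
* Mutable state
Both are unsafe to touch from inside a Future callback, which runs on another thread after receive may have moved on.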
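The gotcha can be reproduced without Akka. In the sketch below a mutable field stands in for `sender()`; `FutureGotchaSketch`, `currentSender` and the names "alice" and "bob" are all hypothetical. One Future reads the field when it runs, the other captures its value in a `val` before the Future is created.

```scala
import java.util.concurrent.CountDownLatch
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureGotchaSketch {
  // Stand-in for sender(): mutable state that changes with every message.
  @volatile private var currentSender: String = "alice"

  // Returns (what the unsafe Future saw, what the safe Future saw).
  def demo(): (String, String) = {
    currentSender = "alice"
    val gate = new CountDownLatch(1)

    val unsafe = Future { gate.await(); currentSender } // reads the field when it runs
    val captured = currentSender                        // copy into a val first
    val safe = Future { gate.await(); captured }        // closes over the immutable copy

    currentSender = "bob" // the "actor" moves on to the next message
    gate.countDown()
    (Await.result(unsafe, 5.seconds), Await.result(safe, 5.seconds))
  }
}
```

The same pattern applies inside an actor: bind `sender()` to a local `val` before starting the Future.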

Scheduling in Erlang

Quite similar to the design we just discussed!
* A scheduler thread hogs each core and owns a task queue of actor processes.
* The scheduler steals work from other schedulers.

What's different?
* Reductions
* IO pool
Basically, every piece of code is accounted for by how much CPU time and contention it is going to cost.
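The effect of a reduction budget can be sketched as a round-robin run queue in which every process is pre-empted once it has used up its budget. This is a toy model written in Scala for consistency with the other examples, not the BEAM scheduler; `ReductionSketch`, `Proc` and the budget of 2000 in the usage note are assumptions for the illustration.

```scala
import scala.collection.mutable

object ReductionSketch {
  // A process with some amount of work left, measured in reductions.
  final case class Proc(name: String, var remaining: Int)

  // Each process runs for at most `budget` reductions, then goes to the back of the queue.
  // Returns the order in which processes finish.
  def run(procs: Seq[Proc], budget: Int): List[String] = {
    val runQueue = mutable.Queue(procs: _*)
    val finished = mutable.ListBuffer.empty[String]
    while (runQueue.nonEmpty) {
      val p = runQueue.dequeue()
      p.remaining -= math.min(budget, p.remaining)
      if (p.remaining == 0) finished += p.name
      else runQueue.enqueue(p) // pre-empted: budget exhausted
    }
    finished.toList
  }
}
```

With a budget of 2000, a process needing 50 reductions queued behind one needing 10000 still finishes first, because the CPU hog is pre-empted after its first slice.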

Behavior

* Primitives to accomplish common tasks
* Ensure reasonability/readability

Erlang Behaviors
OTP makes real-world applications a breeze to write. The GenServer abstraction covers:
* Sync calls
* Async calls
* State maintenance
* Init, Cleanup and Supervision

Akka Behaviors
Strong conventions, lower level of abstraction compared to OTP.

Frequent tasks: Async calls
* actor.tell(message) from inside an actor.
* Because this is the Java/Scala world, an actor could get a message from a non-actor thread. In that case there is no sender, so replies are sent to the dead-letters mailbox.

Frequent tasks: Sync calls
* ask(actorRef, message)
* Returns a Future[Any], which needs to be mapTo[T] as required.
* Can thus be used by the calling thread to block and get sync behavior.
* Recommended way to interact with an actor system from a non-actor thread.
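The shape of ask (an untyped Future that the caller narrows with `mapTo` and may block on) can be sketched with a `Promise` from the Scala standard library. This is not Akka's ask implementation: the `ask` function below and the `lengthActor` handler are hypothetical stand-ins, with a plain function playing the role of the actor.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object AskSketch {
  // Hypothetical stand-in for ask: run the handler asynchronously and
  // complete a promise with its reply (or with the failure, if it throws).
  def ask(handler: Any => Any, message: Any): Future[Any] = {
    val reply = Promise[Any]()
    Future { reply.complete(Try(handler(message))) }
    reply.future
  }

  def demo(): Int = {
    val lengthActor: Any => Any = { case s: String => s.length }
    // The reply is untyped, so the caller narrows it with mapTo[T].
    val typed: Future[Int] = ask(lengthActor, "forks").mapTo[Int]
    // A non-actor thread can block here to get sync behavior.
    Await.result(typed, 5.seconds)
  }
}
```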

Observation: An Akka actor can't really differentiate between a client that's waiting in sync and an async client. Whereas in a GenServer, with handle_call versus handle_cast (or handle_info), the actor knows this.

Frequent tasks: State Management
* become and unbecome
* preStart, preRestart lifecycle hooks.

Garbage Collection

We've restricted the paradigms of programming possible with the Actor model. Can this be exploited to make GC efficient?
* Erlang - Yes
* Akka - Not particularly

Akka - GC
* Common heap for all actors' state
* Common heap for all mailboxes
* Messages, though immutable (not enforced), are passed by reference.
* All threads have access to the full memory space.
* GC contends with worker threads.
* No actor-specific triggers to run GC

Issues
* p95 latencies could take a hit when stop-the-world GC runs.
* Requires tuning to hit the right balance of throughput, latency and memory usage. Black magic :)

On the brighter side: generational GC is cheap, and most actors generate short-lived garbage that is easily picked up.

Erlang - GC
* Aggressively exploits properties that are only true in actor systems to make GC efficient and simpler.
* GC is triggered per process, and only if required.
* Thus GC pauses are barely noticeable across the system.

Made possible by:
* Each process maintains state in its own heap.
* Messages to a process are all copied to its mailbox. No references (except for large binaries).
* Even per-process garbage collection is pre-empted after the reduction budget!

Conclusion

Akka
* Choose Akka if throughput is important in your domain and tail latencies are not deal breakers.
* The actor model in Akka can be leaky due to mutable references, blocking calls and their workarounds. Needs programmer attention.

Erlang
* Choose Erlang if fairness is important and you want to be able to reason about performance easily.
* Throughput could take a hit due to conscious design decisions, and the lack of a JIT makes it even more apparent for certain tasks.

Questions?