
Erlang and Akka Actors: A story of tradeoffs

Pranav Rao
November 17, 2017


The Actor model has long been known to be great at modeling concurrent and parallel problems in a declarative, safe way. The challenge has always been the cost of this abstraction - implementing cheap message passing over a shared-nothing memory architecture while ensuring fairness is a tough ask.

In this talk I shall go over how BEAM, with its three decades of esoteric telecom engineering, differs from Akka actors implemented on the JVM, arguably the most heavily invested general-purpose VM in existence today.

Along the way, we shall discover how design decisions affect performance at each step of a program's execution, from the choice of a global versus per-process heap affecting GC latency and throughput, to preemptive scheduling improving long-tail latencies.



Transcript

  1. Erlang and Akka Actors: A story of tradeoffs. Pranav Rao, Nov 17th 2017. LinkedIn/Twitter: @pnpranarao
  2. Relevant Experience. Housing.com: led a team of 8 engineers working on communication-oriented Elixir/Erlang applications. Amazon.com: I now write Scala/Spark jobs to make sense of the Amazon Selection and Catalog.
  3. Structure: a quick summary of the Actor model, then how Erlang and Akka actors differ in: Messaging and Mailboxes, Scheduling, Behavior, Garbage Collection.
  4. Motivation: Moore's law is no longer true. We want to exploit concurrency and parallelism: Multicore x Multinode. "The world is simply not sequential, and we want to model it."
  5. What do we have so far? The OS exposes primitives: Processes and Threads. Languages add conveniences on top of these, like ThreadPools. To ensure data integrity, we have locks and CAS instruction sets. Are these good enough for modern programming?
  6. (image slide)

  7. Actor Model - Some theory. What's an actor? It can only communicate by message passing, can have private state, and can spawn other actors.
  8. Messages: Send by Copy. Pros: simpler process isolation; a must for per-process GC; allocation is a solved problem; multi-node synchronisation. Cons: memory consumption; time cost of allocation.
  9. Messages: Send by Reference. Pros: far less memory consumption; faster. Cons: garbage collection has to look at all actors; multi-node synchronisation.
  10. Messages on Erlang: copied and sent. Everything is immutable and can be pattern matched, so this is straightforward.
  11. Messages on Akka: passed by reference; primitive types are copied. Messages can be anything; there is no enforcement of immutability. The convention is to send immutable case class objects, which lend themselves to pattern matching.
  12. Mailboxes: mostly one per actor, and a point of contention, so enqueue and dequeue need to be efficient. Can be thought of as an n-producers, 1-consumer problem.
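The n-producers, 1-consumer shape of a mailbox can be sketched with a JVM lock-free queue (plain Java rather than Akka internals; the class and message names are made up for illustration):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class MailboxDemo {
    public static void main(String[] args) throws InterruptedException {
        // Many producer threads enqueue concurrently; a single consumer dequeues.
        Queue<String> mailbox = new ConcurrentLinkedQueue<>();
        Thread[] producers = new Thread[4];
        for (int i = 0; i < producers.length; i++) {
            final int id = i;
            producers[i] = new Thread(() -> {
                for (int m = 0; m < 1000; m++) mailbox.offer("msg-" + id + "-" + m);
            });
            producers[i].start();
        }
        for (Thread p : producers) p.join();
        // The single consumer drains the queue after all producers finish.
        int consumed = 0;
        while (mailbox.poll() != null) consumed++;
        System.out.println(consumed); // 4000
    }
}
```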
  13. Erlang mailboxes are unbounded, so they can cause OOM kills unless the producers are actively throttled. They support selective receive idiomatically. While that is very expressive for a lot of problems, dequeues become expensive if the number of unhandled messages increases.
  14. Akka mailboxes: we know the characteristics expected of a mailbox, so Akka picked the best tool fitting the requirements in Java. Mailboxes are simple queues backed by a couple of standard ones defined in util.concurrent.*. There are settings on top of these queues that can be set separately from the code.
  15. Configurations: Bounded, Unbounded, Priority; Dead-letter box; Capacity, Timeout. Queues: java.util.concurrent.ConcurrentLinkedQueue, java.util.concurrent.LinkedBlockingQueue
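The capacity and timeout behavior of a bounded mailbox can be sketched directly on the second of those queues (plain Java, not Akka's actual mailbox classes):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BoundedMailboxDemo {
    public static void main(String[] args) throws InterruptedException {
        // Capacity-2 "mailbox": a full queue rejects (or times out) new sends.
        LinkedBlockingQueue<String> mailbox = new LinkedBlockingQueue<>(2);
        System.out.println(mailbox.offer("a"));                            // accepted
        System.out.println(mailbox.offer("b"));                            // accepted
        // Queue is full: an offer with a timeout gives up after 10 ms.
        System.out.println(mailbox.offer("c", 10, TimeUnit.MILLISECONDS)); // rejected
        mailbox.take();                                                    // consumer frees a slot
        System.out.println(mailbox.offer("c"));                            // accepted again
    }
}
```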
  16. Is selective receive possible in Akka? Yes, by manually managing the mailbox, or by a combination of stash and become:

    def receive = {
      case Status(statusCode) =>
        context.become(loop(statusCode))
        unstashAll()
      case Message(message) => stash()
    }

    def loop(statusCode: Int): Receive = {
      case Status(newStatusCode) => context.become(loop(newStatusCode))
      case Message(message) => sender ! (message + statusCode)
    }
  17. Before we get into scheduling, let's quickly see how actors are actually defined in both systems. In Akka, a class needs to inherit from the Actor trait and define some properties to be able to be instantiated as an Actor. In Erlang (Elixir), any function closure can be spawned as an actor, but I'll use a GenServer here for comparison.
  18. Akka philosopher:

    class Philosopher(arbitrator: ActorRef, name: String) extends Actor {
      override def receive: Receive = {
        case Forks =>
          eat()
          sender ! DoneEating
          think()
        case NoForks => think()
      }
    }
  19. Elixir philosopher (the sender is packed into the cast message, since handle_cast/2 takes only the message and the state):

    defmodule Philosopher do
      use GenServer

      def handle_cast({:forks, from}, state) do
        eat()
        send(from, :done_eating)
        think()
        {:noreply, state}
      end

      def handle_cast({:no_forks, _from}, state) do
        think()
        {:noreply, state}
      end
    end
  20. So we covered Messages and Mailboxes. Scheduling is the natural problem to solve next. Why? Messages define control flow; messages can be thought of as stimulus. When there's a message for an actor, it needs to be scheduled ASAP.
  21. Implications: probably the strongest differentiator in terms of performance. By clever scheduling, we are trying to optimize for both throughput and latency. Both systems have to build on the primitives provided by the OS.
  22. Scheduling in Akka: innovative use of Java primitives. Let's see how. The Dispatcher is responsible for scheduling actors; it basically decides which actor (along with its message queue) will get CPU time. It is configured at the ActorSystem level, and can be overridden at the actor level.
  23. Dispatcher: what does Akka have to work with? java.util.concurrent.* functionality. What do the Java concurrency utilities need? A set of tasks (Runnables) to run, a pool of threads to run them on (a ThreadPool), and load balancing of these tasks across threads efficiently. So Akka has to convert the Actor paradigm it exposes into tasks and thread pools.
  24. Dispatcher: how does it work? Because messages are the drivers of program state, Akka models mailboxes as Runnables. An actor, at its core, encapsulates behavior and state. So Akka brings the actor definition to the mailbox and runs receive by applying the current state and dequeuing one message at a time.
  25. So sending a message to an actor involves: appending to its mailbox, then putting the mailbox onto the task queue. And the thread pool (ExecutorService) should take care of the rest. Right?
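The two steps above can be sketched as a toy dispatcher in plain Java (purely illustrative, not Akka's implementation; all names are made up): the mailbox itself is the Runnable handed to an ExecutorService.

```java
import java.util.Queue;
import java.util.concurrent.*;
import java.util.function.Consumer;

public class ToyDispatcher {
    // A mailbox is a message queue plus the actor's behavior; running it
    // dequeues and processes messages one at a time.
    static class Mailbox implements Runnable {
        final Queue<String> messages = new ConcurrentLinkedQueue<>();
        final Consumer<String> behavior;
        Mailbox(Consumer<String> behavior) { this.behavior = behavior; }
        public void run() {
            String msg;
            while ((msg = messages.poll()) != null) behavior.accept(msg);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        StringBuilder log = new StringBuilder();
        Mailbox mailbox = new Mailbox(log::append);
        // Sending = append to the mailbox, then put the mailbox on the task queue.
        mailbox.messages.offer("hello ");
        mailbox.messages.offer("world");
        pool.submit(mailbox).get();   // wait for this mailbox run to finish
        pool.shutdown();
        System.out.println(log);      // hello world
    }
}
```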
  26. Let's go back to the Akka Philosopher:

    class Philosopher(arbitrator: ActorRef, name: String) extends Actor {
      override def receive: Receive = {
        case Forks =>
          eat()
          sender ! DoneEating
          think() // FileIO + CPU hogging + NetworkIO
        case NoForks => think() // FileIO + CPU hogging + NetworkIO
      }
    }
  27. Throw more threads at it? They will block after a while. Other actors are starved; no message is being processed. Even otherwise, latencies go for a toss: one occasional blocking call can take down a thread in the thread pool, which hits tail latencies hard. And there are context-switching costs.
  28. Solution 1: segregate actors into blocking and non-blocking, and allot them separate Dispatchers. This requires a lot of tuning, and doesn't tackle the common case of occasional, quick blocking calls in actors.
  29. Solution 2: Futures.

    def think() = {
      val res: Future[String] = for {
        page <- Future { readFile() }
        insights <- Future { processString(page) }
        result <- sendMail(insights)
      } yield result
      res.onComplete {
        case Success(result) => (arbitrator ! Hungry)
        case Failure(failure) => // handle this
      }
    }
  30. Futures can be composed: async computations, expressed sequentially. Futures are guaranteed to give a result: either they time out, or the computation succeeds or fails. They are executed on the Dispatcher's ExecutionContext. Considering our receive task is now being split into subtasks, how should our thread pool behave optimally?
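The same sequential composition of async stages can be sketched with the JVM's own CompletableFuture (a rough plain-Java analogue of the Scala for-comprehension; the stage names are made up):

```java
import java.util.concurrent.CompletableFuture;

public class FutureComposeDemo {
    public static void main(String[] args) {
        // Each stage runs asynchronously; thenApply/thenCompose chain them
        // sequentially, like the for-comprehension in the Scala example.
        CompletableFuture<String> result = CompletableFuture
            .supplyAsync(() -> "page")                                        // readFile()
            .thenApply(page -> page + "-insights")                            // processString(page)
            .thenCompose(insights ->
                CompletableFuture.supplyAsync(() -> insights + "-sent"));     // sendMail(insights)
        System.out.println(result.join()); // page-insights-sent
    }
}
```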
  31. If the thread pool were just 1. a TaskQueue and 2. Threads, tasks that generated subtasks could get queued up on the same thread, leaving other threads idle.
  32. ForkJoinPool: the main difference from a *ThreadPool is that its threads are capable of work-stealing. Perfect for short-lived tasks: all threads (and hence all cores) are kept busy.
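Work-stealing can be seen in a minimal ForkJoinPool sketch: forked subtasks land on the current worker's deque, where idle workers steal them.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class WorkStealingDemo {
    // Recursively split a range sum; fork() queues the subtask on the
    // current worker's deque, where idle workers can steal it.
    static class RangeSum extends RecursiveTask<Long> {
        final long lo, hi;
        RangeSum(long lo, long hi) { this.lo = lo; this.hi = hi; }
        protected Long compute() {
            if (hi - lo <= 1_000) {
                long sum = 0;
                for (long i = lo; i < hi; i++) sum += i;
                return sum;
            }
            long mid = (lo + hi) / 2;
            RangeSum left = new RangeSum(lo, mid);
            left.fork();                              // subtask available for stealing
            long right = new RangeSum(mid, hi).compute();
            return right + left.join();
        }
    }

    public static void main(String[] args) {
        long sum = ForkJoinPool.commonPool().invoke(new RangeSum(0, 1_000_000));
        System.out.println(sum); // sum of 0..999999
    }
}
```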
  33. Dispatchers are configurable. Thread pool to use: fork-join-executor, thread-pool-executor, affinity-pool-executor. Pinned Dispatcher: creates 1 thread per actor, driven by a thread-pool-executor.
  34. Example configuration:

    configuration {
      default-dispatcher {
        type = Dispatcher
        executor = "thread-pool-executor"
        thread-pool-executor {
          core-pool-size-min = 3
          core-pool-size-factor = 2.0
          core-pool-size-max = 6
        }
        mailbox-type = ""
        mailbox-capacity = -1
        throughput = 100   # <---- Tunable
      }
    }
  35. Erlang's scheduling is quite similar to the design we just discussed! A scheduler thread hogs each core and owns a task queue of actor processes. Schedulers steal work from each other.
  36. What's different? Reductions, and an IO pool. Basically, every piece of code is budgeted by how much CPU time/contention it is going to cost.
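The reduction idea, preempting a process once it has spent a fixed work budget, can be sketched as a toy round-robin loop (nothing like BEAM's real implementation; purely illustrative, all names made up):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ReductionDemo {
    // Each "process" is just a name and a count of remaining work units.
    static class Proc {
        final String name;
        int remaining;
        Proc(String name, int remaining) { this.name = name; this.remaining = remaining; }
    }

    public static void main(String[] args) {
        final int BUDGET = 4; // "reductions" granted per scheduling slot
        Deque<Proc> runQueue = new ArrayDeque<>();
        runQueue.add(new Proc("a", 6)); // a long task
        runQueue.add(new Proc("b", 3)); // a short task
        StringBuilder trace = new StringBuilder();
        // Run each process for at most BUDGET units, then preempt and
        // requeue it, so the long task cannot starve the short one.
        while (!runQueue.isEmpty()) {
            Proc p = runQueue.poll();
            int slice = Math.min(BUDGET, p.remaining);
            p.remaining -= slice;
            trace.append(p.name).append(slice).append(' ');
            if (p.remaining > 0) runQueue.add(p);
        }
        System.out.println(trace.toString().trim()); // a4 b3 a2
    }
}
```

Note how "b" finishes after one slot even though "a" arrived first, which is exactly the fairness property preemption buys.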
  37. Erlang Behaviors: OTP makes real-world applications a breeze to write. The GenServer abstraction covers: sync calls, async calls, state maintenance, init, cleanup and supervision.
  38. Frequent tasks, async calls: actor.tell(message) from inside an actor. Because this is the Java/Scala world, an actor could get a message from a non-actor thread, so replies may end up in the dead-letters mailbox.
  39. Frequent tasks, sync calls: ask(actorRef, message) returns a Future[Any], which needs to be mapTo[T] as required. It can thus be used for blocking and getting sync behavior on the calling thread. This is the recommended way to interact with an actor system from a non-actor thread.
  40. Observation: an Akka actor can't really differentiate between a client that's waiting synchronously and an async client, whereas with handle_info and handle_call the Erlang actor knows this.
  41. We've restricted the paradigms of programming possible with the Actor model. Can this be exploited to make GC efficient? Erlang: yes. Akka: not particularly.
  42. Akka - GC: a common heap for all actor processes' state, and a common heap for all mailboxes. Messages, though immutable (not enforced), are passed by reference. All threads have access to the full memory space, so GC contends with worker threads. There are no Actor-world specific triggers to run GC.
  43. Issues: p95 latencies can take a hit when stop-the-world GC runs. It requires tuning to hit the right balance of throughput, latency and memory usage. Black magic :)
  44. On the brighter side: generational GC is cheap, and the garbage most common actor processes generate is easily picked up.
  45. Erlang - GC aggressively exploits properties that are only true in actor systems to make GC efficient and simpler. GC is triggered per-process, and only if required. Thus GC pauses are barely noticeable across the system.
  46. Made possible by: each process maintains state in its own heap; messages to a process are all copied into its mailbox, with no references (except for large binaries). Even per-process garbage collection is preempted after the reduction budget!
  47. Akka: choose Akka if throughput is important in your domain and tail latencies are not deal breakers. The actor model in Akka can be leaky due to mutable references, blocking calls and their workarounds. It needs programmer attention.
  48. Erlang: choose Erlang if fairness is important and you want performance that is easy to reason about. Throughput could take a hit due to conscious design decisions, and the lack of a JIT makes this even more apparent for certain tasks.