Sane Sharding with Akka Cluster


Writing distributed applications is very hard, especially when you start by developing them as single-node ones. Programmers tend to focus on functionality first, leaving the scalability issues for later. Fortunately, Akka gives us many tools for scaling out, and we can use them very early in the development process. I want to show you how to take advantage of these features.

You will learn how to transform a single-node app into a scalable one. During a live-coding session I will create both versions from scratch and guide you through the most important architectural decisions. If you are interested in scalability and know the basics of message-based concurrency, this talk is for you. First, you will see how to create a web service as a single-node Akka app. Then we will talk about the scalability and availability problems with this approach and introduce sharding as a potential solution. We will use this knowledge and the Akka Cluster module to make our app more scalable.


Michał Płachta

December 01, 2015

Transcript

  1. 2.

    @miciek What’s inside? • Creating a web service using actor

    model • ...analysing its performance • ...making it scalable
  2. 3.

    @miciek Akka Tutorial • actor ~= lightweight thread • actorRef.tell

    • actorRef.ask • actors create children • actors have mailbox ActorRef Actor 1 Actor 2 ask tell enqueue Mailbox Actor 3 dequeue
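The tell/mailbox mechanics above can be illustrated with a toy sketch in plain Scala (not the real Akka API — `ToyActor` and its methods are made up for illustration): `tell` is fire-and-forget and only enqueues, while the actor dequeues and handles one message at a time, in FIFO order.

```scala
import scala.collection.mutable

// Toy illustration of actor mailbox semantics (NOT the real Akka API):
// "tell" enqueues a message; the actor dequeues and processes messages
// one at a time, in the order they arrived.
class ToyActor(name: String) {
  private val mailbox = mutable.Queue.empty[String]

  // fire-and-forget, like actorRef.tell
  def tell(msg: String): Unit = mailbox.enqueue(msg)

  // dequeue and handle a single message, if one is waiting
  def processOne(): Option[String] =
    if (mailbox.isEmpty) None
    else Some(s"$name handled: ${mailbox.dequeue()}")
}

val actor = new ToyActor("Actor1")
actor.tell("hello")
actor.tell("world")
val first = actor.processOne() // handled in FIFO order
```

Real Akka actors add much more (supervision, location transparency, `ask` returning a `Future`), but the single-threaded, one-message-at-a-time processing loop is the core idea.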
  3. 4.

    @miciek Scala Tutorial

    SCALA:
    case class Junction(id: Int)

    msg match {
      case Junction(id) => {
        // this will execute when
        // msg is instanceOf Junction
      }
      case SomeOtherType => {}
    }

    JAVA:
    public class Junction {
      private final int id;
      public Junction(int id) { this.id = id; }
      public int getId() { return id; }
      // hashCode
      // equals
      // copy
    }
  4. 5.

    @miciek Our example: Sorter scan <containerId> HTTP push right or

    not See also: http://i.imgur.com/mctb4HC.gifv
  5. 6.

    @miciek Sorter Web Service http://localhost:8080/junctions/<junctionId>/decisionForContainer/<containerId> returns JSON { "targetConveyor": <conveyorId>

    } Assumptions: • business logic already defined - focus on performance • the business logic function takes 5-10 ms to make a decision
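The slides assume the business logic is already defined and takes 5-10 ms per decision. A hypothetical stand-in (the function name and conveyor labels are made up here, not taken from the talk) could simulate that latency like this:

```scala
// Hypothetical stand-in for the talk's business logic: given a junction
// and a container, pick a target conveyor. The sleep simulates the
// 5-10 ms decision time stated in the slides.
case class Junction(id: Int)
case class Container(id: Int)

def whereShouldContainerGo(junction: Junction, container: Container): String = {
  Thread.sleep(5 + scala.util.Random.nextInt(6)) // simulate 5-10 ms of work
  val conveyors = List("A", "B", "C")
  conveyors((junction.id + container.id) % conveyors.size)
}

val target = whereShouldContainerGo(Junction(1), Container(4))
```

The point of making it slow is that any blocking in this function directly caps the throughput of whoever calls it — which is exactly what the next steps measure.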
  6. 8.

    @miciek Step 1: Just REST... RestInterface HTTP Requests HTTP Responses

    • One Actor = One Thread • Blocking inside receive method • Low throughput...
  7. 9.

    @miciek Throughput testing /junctions/1/decisionForContainer/1 /junctions/2/decisionForContainer/4 /junctions/3/decisionForContainer/5 /junctions/4/decisionForContainer/2 /junctions/5/decisionForContainer/7 2000 requests

    2000 requests 2000 requests 2000 requests 2000 requests in parallel cat URLs.txt | parallel -j 5 'ab -ql -n 2000 -c 1 -k {}' GNU Parallel ApacheBench
  8. 11.

    @miciek Step 1: Just REST... RestInterface HTTP Requests HTTP Responses

    ± % cat URLs.txt | parallel -j 5 'ab -ql -n 2000 -c 1 -k {}' | grep 'Requests per second' Requests per second: 34.78 [#/sec] (mean) Requests per second: 34.22 [#/sec] (mean) Requests per second: 33.77 [#/sec] (mean) Requests per second: 33.82 [#/sec] (mean) Requests per second: 33.98 [#/sec] (mean)
  9. 14.

    @miciek Step 3: One actor per junction RestInterface HTTP Requests

    HTTP Responses DecidersGuardian SortingDecider SortingDecider SortingDecider <junctionId>=1 ... <junctionId>=5
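The guardian-with-one-child-per-junction idea can be sketched in plain Scala (not the real Akka API — in Akka the guardian would use `context.actorOf` and `context.child` instead of a map): each junction gets its own decider, created lazily on first use, so requests for different junctions no longer serialize behind one actor.

```scala
import scala.collection.mutable

// Plain-Scala sketch of the DecidersGuardian idea (NOT the real Akka API):
// one SortingDecider per junction, created lazily and then reused.
class SortingDecider(val junctionId: Int) {
  def decide(containerId: Int): String =
    s"junction $junctionId -> conveyor ${(junctionId + containerId) % 3}"
}

class DecidersGuardian {
  private val deciders = mutable.Map.empty[Int, SortingDecider]

  // analogous to: context.child(name) getOrElse context.actorOf(props, name)
  def forJunction(junctionId: Int): SortingDecider =
    deciders.getOrElseUpdate(junctionId, new SortingDecider(junctionId))
}

val guardian = new DecidersGuardian
val d1 = guardian.forJunction(1)
val again = guardian.forJunction(1) // the same decider is reused
```

This is why throughput roughly doubles in the slide that follows: five junctions means five deciders working concurrently instead of one blocked actor.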
  10. 15.

    @miciek Step 3: One actor per junction ± % cat

    URLs.txt | parallel -j 5 'ab -ql -n 2000 -c 1 -k {}' | grep 'Requests per second' Requests per second: 67.36 [#/sec] (mean) Requests per second: 69.03 [#/sec] (mean) Requests per second: 67.75 [#/sec] (mean) Requests per second: 66.88 [#/sec] (mean) Requests per second: 66.28 [#/sec] (mean)
  11. 16.

    @miciek Now what? • non-blocking • concurrent • scaling up

    works • scaling out? RestInterface HTTP Requests HTTP Responses DecidersGuardian SortingDecider SortingDecider SortingDecider <junctionId>=1 ... <junctionId>=5
  12. 17.

    @miciek Manual scaling out RestInterface HTTP Requests HTTP Responses DecidersGuardian

    SortingDecider SortingDecider SortingDecider <junctionId>=1 ... <junctionId>=3 RestInterface HTTP Requests HTTP Responses DecidersGuardian SortingDecider SortingDecider SortingDecider <junctionId>=4 ... <junctionId>=6
  13. 18.

    @miciek Enter Sharding RestInterface HTTP Requests HTTP Responses ShardRegion SortingDecider

    SortingDecider SortingDecider <junctionId>=h(m) ... <junctionId>=h(m) RestInterface HTTP Requests HTTP Responses ShardRegion SortingDecider SortingDecider SortingDecider <junctionId>=h(m) ... <junctionId>=h(m) ...
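The `h(m)` in the diagram is the shard-resolution function: the ShardRegion hashes the entity id into a fixed number of shards, so every node computes the same junction-to-shard mapping without knowing where entities actually live. A minimal plain-Scala sketch of that idea (the shard count is an assumption; in Akka this logic lives in the `extractShardId` function you pass to Cluster Sharding):

```scala
// Sketch of the shard-resolution idea behind Akka's extractShardId
// (plain Scala, not the real API): hash the junction id into a fixed
// number of shards. Deterministic, so all nodes agree on the mapping.
val numberOfShards = 2 // assumed value for illustration

def extractShardId(junctionId: Int): String =
  (math.abs(junctionId.hashCode) % numberOfShards).toString

val shardOfJunction1 = extractShardId(1)
val shardAgain = extractShardId(1) // same junction, same shard, every time
```

Because the mapping is pure and deterministic, a request for junction 1 arriving at any node's ShardRegion is forwarded to the same shard — and hence the same `SortingDecider` — wherever the cluster currently hosts it.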
  14. 20.

    @miciek Step 4: Sharded web service ± % cat URLs.txt

    | parallel -j 5 'ab -ql -n 2000 -c 1 -k {}' | grep 'Requests per second' Requests per second: 106.80 [#/sec] (mean) Requests per second: 108.15 [#/sec] (mean) Requests per second: 100.60 [#/sec] (mean) Requests per second: 99.92 [#/sec] (mean) Requests per second: 100.07 [#/sec] (mean)
  15. 21.

    @miciek Sharding • automatic distribution • no need to know

    who is where • no need to know how many nodes there are More: • The project: github.com/miciek/akka-sharding-example • Step by step tutorial: michalplachta.com/2016/01/23/scalability-using-sharding-from-akka-cluster/
  16. 22.

    @miciek Thank you! Michał Płachta @miciek Sane Sharding with Akka

    Cluster Live coding & performance analysis