Investigating the Scalability Limits of Distributed Erlang

Amir Ghaffari

September 05, 2014

Transcript

  1. Investigating the Scalability Limits of Distributed Erlang
     Amir Ghaffari
     Thirteenth ACM SIGPLAN Erlang Workshop, Göteborg, September 05, 2014
  2. Outline
     • RELEASE Project
     • The Design and Implementation of DE-Bench
     • Benchmark Results
       o Global Commands
       o Data Size
       o Data Size & Computation Time
       o Server Processes
     • Conclusion & Future Work
  3. RELEASE Project
     RELEASE is a European project aiming to scale Erlang onto commodity architectures with 100,000 cores.
  4. Why Erlang?
     Commercial applications:
     • Facebook chat backend
     • T-Mobile advanced call control services
     • Ericsson AXD301 ATM switch
     • Riak and CouchDB NoSQL DBMSs
     This popularity is due to:
     • share-nothing concurrency
     • asynchronous message passing based on the actor model
     • process location transparency
     • fault tolerance
     • …
  5. RELEASE Project
     The RELEASE consortium works at the following levels:
     • Virtual machine
     • Language
     • Scalable computation model
     • Scalable in-memory data structures
     • Scalable persistent data structures
     • Infrastructure
     • Profiling and refactoring tools
  6. Requirement
     To scale distributed Erlang we needed to identify its scalability bottlenecks:
     • thus, the need for benchmarking was obvious
     • but there was no such tool, and nobody had done it before
     • we needed a scalable benchmarking tool for large-scale architectures (hundreds of nodes and thousands of cores)
  7. DE-Bench
     So, we have developed our own tool!
     • DE-Bench stands for “Distributed Erlang Benchmark”.
     • DE-Bench is based on Basho Bench, an open source benchmarking tool for Riak.
  8. DE-Bench
     For the sake of scalability, DE-Bench has a P2P design:
     • No central coordination or synchronisation
     • No single point of failure
     • All nodes perform the same role independently
  9. How DE-Bench Works
     • An Erlang supervision tree provides reliability and fault tolerance (a minimal sketch follows this slide).
     • Each node has 8 cores, and accordingly 40 worker processes run on each node.
     • CSV result files are stored on the local disk of each node to avoid disk access contention.
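
     For illustration, a minimal sketch, not DE-Bench's actual source, of a one_for_one supervision tree that keeps a fixed number of worker processes alive on a node; the module bench_sup and its placeholder worker loop are hypothetical:

        -module(bench_sup).
        -behaviour(supervisor).
        -export([start_link/1, init/1, start_worker/1]).

        %% Start a supervision tree with NumWorkers workers (e.g. 40 per node).
        start_link(NumWorkers) ->
            supervisor:start_link({local, ?MODULE}, ?MODULE, NumWorkers).

        init(NumWorkers) ->
            %% one_for_one: a crashed worker is restarted without touching the others.
            Children = [{N, {?MODULE, start_worker, [N]},
                         permanent, 5000, worker, [?MODULE]}
                        || N <- lists:seq(1, NumWorkers)],
            {ok, {{one_for_one, 10, 60}, Children}}.

        %% Hypothetical worker: a real benchmark worker would issue one random
        %% command per iteration and log its latency to a CSV file.
        start_worker(Id) ->
            {ok, spawn_link(fun() -> worker_loop(Id) end)}.

        worker_loop(Id) ->
            timer:sleep(1000),   %% placeholder for issuing one benchmark command
            worker_loop(Id).
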
  10. Distributed Erlang Commands
      P2P commands: a function with a tunable argument size and computation time is run on a remote node (see the sketch below).
      • A function is called with an argument of X bytes.
      • A non-tail-recursive function is then run on the target node for Y microseconds.
      • Finally, the argument is returned to the source node as the result.
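
     A minimal sketch of this P2P command pattern, assuming a modern OTP release (erlang:monotonic_time/1 did not exist in the R16B release used in the talk); the module name p2p_cmd and the 5-second receive timeout are made up for illustration:

        -module(p2p_cmd).
        -export([run/3, remote/3]).

        %% Ship ArgSizeBytes of data to Node, let it compute for roughly
        %% MicroSecs microseconds, then receive the argument back as the result.
        run(Node, ArgSizeBytes, MicroSecs) ->
            Arg = binary:copy(<<0>>, ArgSizeBytes),
            Self = self(),
            spawn(Node, ?MODULE, remote, [Self, Arg, MicroSecs]),
            receive
                {result, Arg} -> ok
            after 5000 ->
                {error, timeout}
            end.

        %% Runs on the target node: busy-work, then return the argument.
        remote(From, Arg, MicroSecs) ->
            Deadline = erlang:monotonic_time(microsecond) + MicroSecs,
            busy(Deadline),
            From ! {result, Arg}.

        %% Non-tail-recursive loop: each call stays on the stack until the deadline.
        busy(Deadline) ->
            case erlang:monotonic_time(microsecond) < Deadline of
                true  -> busy(Deadline), ok;
                false -> ok
            end.
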
  11. Distributed Erlang Commands
      Local commands:
      • Only the local node is involved; there is no communication with other nodes.
      Global commands:
      • All nodes in the cluster are involved.
      • The result is ready once the command has run successfully on all nodes.
  12. Distributed Erlang Commands
      P2P commands (sketched below):
      • spawn(Node, Fun): asynchronously calls a function on a remote node.
      • RPC(Node, Fun): synchronously calls a function on a remote node.
      • Server process call: a synchronous call to a generic server process (gen_server) or a finite state machine process (gen_fsm).
      With spawn, a new process is created on the target node; with RPC and server process calls, the process already exists on the target node.
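
     A hedged sketch of how the three flavours map onto standard Erlang/OTP calls; the node argument and the registered name my_server are hypothetical, and my_server is assumed to be a gen_server already running on the target node:

        -module(p2p_flavours).
        -export([demo/1]).

        %% Node is the name of a connected remote node, e.g. 'bench@host2'.
        demo(Node) ->
            %% spawn: asynchronous; a new process is created on the target node
            %% and the caller continues immediately.
            spawn(Node, fun() -> lists:seq(1, 10) end),

            %% rpc: synchronous; the caller blocks until the remote call returns.
            Seq = rpc:call(Node, lists, seq, [1, 10]),

            %% server process call: synchronous request to an existing registered
            %% gen_server on the target node (no new process is created).
            Reply = gen_server:call({my_server, Node}, ping),

            {Seq, Reply}.
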
  13. Distributed Erlang Commands
      Local commands (sketched below):
      • register_name(Name, Pid): associates a name with a process identifier (pid).
      • unregister_name(Name): removes a registered name associated with a pid.
      • whereis(Name): returns the pid registered under a specific name.
      • global:whereis_name(Name): returns the pid associated with a globally registered name; the lookup is still local because the global name table is replicated on every node.
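
     A shell-style sketch of the local commands; the process and the name my_proc are made up, and none of these calls generates network traffic:

        %% Local name registration: only this node's name table is touched.
        Pid = spawn(fun() -> receive stop -> ok end end),
        true = register(my_proc, Pid),        %% associate a name with a pid
        Pid  = whereis(my_proc),              %% look the pid up again
        true = unregister(my_proc),           %% remove the association
        %% A global lookup is also a local read of the replicated name table;
        %% here it returns undefined because the name was never globally registered.
        undefined = global:whereis_name(my_proc).
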
  14. Distributed Erlang Commands
      Global commands (sketched below):
      • global:register_name(Name, Pid): globally associates a name with a pid.
      • global:unregister_name(Name): removes a globally registered name from all nodes in the cluster.
      [Diagram: register and unregister operations update the global name table that is replicated in every Erlang VM.]
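
     A shell-style sketch of the global commands; the name my_global_proc is made up. Unlike the local variants, registering and unregistering block until every connected node has updated its copy of the global name table:

        %% Global name registration: every node in the cluster takes part.
        Pid = spawn(fun() -> receive stop -> ok end end),
        yes = global:register_name(my_global_proc, Pid),   %% waits for all nodes
        Pid = global:whereis_name(my_global_proc),          %% local read of the table
        _   = global:unregister_name(my_global_proc).       %% again involves every node
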
  15. Platform & Settings
      • The benchmark was carried out on the Kalkyl cluster at UPPMAX: 348 nodes with 2784 64-bit processor cores (8 cores per node), 24 GB of RAM and a 250 GB hard disk per node.
      • Red Hat Enterprise Linux 6.0
      • Erlang version R16B was used in all our experiments.
  16. Benchmarking Results
      • The benchmark is conducted on 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100-node clusters.
      • All experiments run for 5 minutes.
      • There is one Erlang VM on each host and, as always, one DE-Bench instance on each VM.
      • CSV files from all participating nodes are aggregated to determine the total throughput and the number of failures.
  17. Global Commands
      Commands used in this benchmark:
      • spawn and RPC with a 10-byte argument and 10 microseconds of computation time
      • Global commands: global:register_name and global:unregister_name
      • Local command: global:whereis_name(Name)
  18. Data Size
      spawn and RPC with:
      • argument sizes of 10, 100, 1000, and 10,000 bytes
      • 10 microseconds of computation time
  19. Data Size & Computation Time
      spawn and RPC with:
      • an argument size of 1000 bytes
      • 1000 microseconds of computation time
      Scales linearly up to 150 nodes and 1200 cores!
  20. Latency of P2P Commands
      RPC latency rises as the cluster size grows.
      Why? How does RPC work? What is the difference between RPC and spawn?
  21. How does RPC work?
      • There is a gen_server process (called rex) on each Erlang VM.
      • In addition to user applications, RPC is also used by many built-in OTP modules.
      • So, as a shared service, it can become overloaded (see the sketch below).
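
     A sketch of the idea rather than the exact OTP implementation: rpc:call/4 behaves roughly like a synchronous gen_server:call to the single registered process rex on the target node, which is why an overloaded rex serialises all incoming RPCs on that node:

        -module(rpc_sketch).
        -export([call/4]).

        %% Sketch only: the real message format lives in OTP's rpc.erl.
        %% Every RPC to Node is funnelled through the one process registered
        %% as 'rex' on that node, so rex is a shared, potentially overloaded service.
        call(Node, Mod, Fun, Args) ->
            gen_server:call({rex, Node}, {call, Mod, Fun, Args, group_leader()}, infinity).
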
  22. Riak NoSQL DBMS
      Our experience with Riak 1.1.1 shows that an overloaded server process limits the scalability of Riak.
      Do server processes have scalability limitations?
  23. Server Processes
      • spawn and RPC with 10 bytes of argument size and 1 microsecond of computation time
      • gen_server and gen_fsm calls with 10 bytes of argument size and 1 microsecond of computation time (a minimal gen_server sketch follows below)
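
     A minimal gen_server along these lines (illustrative only, not DE-Bench's actual server): it accepts the argument, burns the requested number of microseconds in a non-tail-recursive loop as in the earlier p2p_cmd sketch, and replies with the argument:

        -module(bench_server).
        -behaviour(gen_server).
        -export([start_link/0, call/3]).
        -export([init/1, handle_call/3, handle_cast/2]).

        start_link() ->
            gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

        %% Synchronous benchmark call: ship Arg to the server on Node, let it
        %% compute for MicroSecs microseconds, and get Arg back as the reply.
        call(Node, Arg, MicroSecs) ->
            gen_server:call({?MODULE, Node}, {work, Arg, MicroSecs}).

        init([]) -> {ok, #{}}.

        handle_call({work, Arg, MicroSecs}, _From, State) ->
            Deadline = erlang:monotonic_time(microsecond) + MicroSecs,
            busy(Deadline),
            {reply, Arg, State};
        handle_call(_Other, _From, State) ->
            {reply, ignored, State}.

        handle_cast(_Msg, State) -> {noreply, State}.

        %% Non-tail-recursive busy loop, as in the spawn/RPC variants.
        busy(Deadline) ->
            case erlang:monotonic_time(microsecond) < Deadline of
                true  -> busy(Deadline), ok;
                false -> ok
            end.
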
  24. Conclusion
      • We have demonstrated that global commands are a bottleneck for the scalability of distributed Erlang. We mitigated this limitation by grouping nodes into smaller partitions.
      • Our results reveal that the latency of RPC calls rises as the cluster size grows.
      • We have shown that server processes scale well and have the lowest latency among all P2P commands.
      • We are currently developing other scalable benchmarking applications to run on a larger architecture (i.e. a Blue Gene/Q system).