Slide 1

Slide 1 text

Distributed Graph Processing with Scala and Akka
Adelbert Chang
Saturday, August 3, 13

Slide 6

Slide 6 text

About Me
• 4th year student @ UC Santa Barbara
• BS/MS Computer Science
• Research Assistant
• Large-scale graph mining and modeling
• Cluster Computing
• Engineering Analytics Intern @ Box
• Scala since January 2012

Slide 12

Slide 12 text

Outline
• Motivation
• Context and Assumptions
• User and System Requirements
• Solution
• Live Demo!

Slide 15

Slide 15 text

Motivation
• Many of our algorithms are embarrassingly parallel
• Pregel model is good, but too heavy for us
• Example: Shortest path
  • Split work on nodes
  • Run BFS, return a Map[Int, Int]
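A minimal sketch of the shortest-path example above, in plain Scala with no Akka involved. Only the idea of running BFS and returning a `Map[Int, Int]` comes from the slide; the adjacency-map `Graph` type, the reading of the result as node id to hop count, and the `chunk` helper for splitting source nodes are assumptions made for illustration.

```scala
import scala.annotation.tailrec
import scala.collection.immutable.Queue

object ShortestPathSketch {
  // Assumed representation: node id -> ids of its out-neighbours.
  type Graph = Map[Int, Seq[Int]]

  /** Unweighted BFS from `source`; returns node id -> number of hops. */
  def bfs(graph: Graph, source: Int): Map[Int, Int] = {
    @tailrec
    def loop(frontier: Queue[Int], dist: Map[Int, Int]): Map[Int, Int] =
      frontier.dequeueOption match {
        case None => dist
        case Some((node, rest)) =>
          // Neighbours not reached yet are one hop further than `node`.
          val unseen = graph.getOrElse(node, Seq.empty).distinct.filterNot(dist.contains)
          val hops = dist(node) + 1
          loop(rest ++ unseen, dist ++ unseen.map(_ -> hops))
      }
    loop(Queue(source), Map(source -> 0))
  }

  /** Splits the source nodes into at most `parts` chunks of near-equal size (expects parts > 0). */
  def chunk(sources: Seq[Int], parts: Int): List[Seq[Int]] = {
    val size = math.max(1, math.ceil(sources.size.toDouble / parts).toInt)
    sources.grouped(size).toList
  }
}
```

Each chunk of source nodes is an independent unit of work: a machine that holds the whole graph in memory can run `bfs` for every source in its chunk without talking to any other machine, which is what makes the problem embarrassingly parallel.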

Slide 21

Slide 21 text

Context + Assumptions
• Studying large-scale static graphs, typically those found in online social networks
• Cluster of around 30 machines
• Cluster shares a file system
• Graphs are large, but can fit into a machine's memory
• We want “raw” results dumped straight to disk

Slide 26

Slide 26 text

User Requirements
• Users should
  • Not have to interact with Akka
  • Only need to define the algorithm and the input
  • Be able to put an upper bound on the number of threads per machine
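One way to picture these requirements is a small user-facing contract plus a bounded thread pool. The sketch below is an illustration only: the `Algorithm` trait and `runBounded` helper are hypothetical names, not Sabre's actual API, and it runs on a single machine with the standard library rather than Akka.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object UserApiSketch {
  /** Hypothetical user-facing contract: one unit of input in, one result out. */
  trait Algorithm[In, Out] {
    def run(input: In): Out
  }

  /** Runs `algorithm` over `inputs` on at most `maxThreads` threads of this machine. */
  def runBounded[In, Out](algorithm: Algorithm[In, Out], inputs: Seq[In], maxThreads: Int): Seq[Out] = {
    val pool = Executors.newFixedThreadPool(maxThreads) // the user's upper bound
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val results = Future.traverse(inputs)(in => Future(algorithm.run(in)))
      Await.result(results, 1.minute) // arbitrary timeout for the sketch
    } finally pool.shutdown()
  }
}
```

The user supplies only an `Algorithm` and its inputs; how the work is scheduled stays behind `runBounded`, which is the separation the slide asks for.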

Slide 33

Slide 33 text

System Requirements
• The system should
  • Be easy to deploy without any cluster setup
  • Be fault tolerant
  • Be elastic
  • Load the graph locally
  • Clean up and shut itself down afterwards

Slide 37

Slide 37 text

Inspiration
• Scala + Akka to the rescue!

Slide 42

Slide 42 text

Inspiration
• We want a balancing dispatcher for remoting
• Proxy mailbox is backed by a number of Actors
• Messages are sent to a proxy mailbox
• Messages distributed to idle Actors

Slide 43

Slide 43 text

Balancing Dispatcher
http://letitcrash.com/post/29044669086/balancing-workload-across-nodes-with-akka-2

Slide 48

Slide 48 text

Solution
• Design the system to act similarly to a balancing dispatcher
• A single Actor (Master) represents the dispatcher
• Each remote Actor (Worker) has its own mailbox
• Workers report to the Master when idle
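The pull-based idea above can be sketched as a pure state machine, without tying it to Akka. The message names `WorkerCreated`, `WorkerRequestsWork`, `WorkIsReady`, `WorkToBeDone` and `WorkIsDone` appear later in this deck; everything else here is an assumption for illustration: the payload shapes, the `Submit` and `WorkerLost` messages (the latter standing in for a worker-death notification), and the use of strings in place of actor references. In an actor, `handle` would roughly correspond to the body of `receive`.

```scala
import scala.collection.immutable.Queue

object PullModelSketch {
  type WorkerId = String // stands in for an ActorRef
  type Work = Int        // stands in for one unit of work, e.g. a source node

  sealed trait ToMaster
  final case class Submit(work: Work) extends ToMaster           // hypothetical name
  final case class WorkerCreated(worker: WorkerId) extends ToMaster
  final case class WorkerRequestsWork(worker: WorkerId) extends ToMaster
  final case class WorkIsDone(worker: WorkerId) extends ToMaster
  final case class WorkerLost(worker: WorkerId) extends ToMaster // hypothetical name

  sealed trait ToWorker
  case object WorkIsReady extends ToWorker
  final case class WorkToBeDone(work: Work) extends ToWorker

  type Outbox = List[(WorkerId, ToWorker)]

  /** Master state: queued work plus what each known worker is doing (None = idle). */
  final case class Master(queue: Queue[Work] = Queue.empty,
                          workers: Map[WorkerId, Option[Work]] = Map.empty) {

    // Tell every idle worker that work is waiting; they pull it themselves.
    private def notifyIdle: Outbox =
      if (queue.isEmpty) Nil
      else workers.toList.collect { case (w, None) => w }.sorted.map(w => (w, WorkIsReady))

    /** Returns the next state and the messages to send out. */
    def handle(msg: ToMaster): (Master, Outbox) = msg match {
      case Submit(work) =>
        val next = copy(queue = queue :+ work)
        (next, next.notifyIdle)
      case WorkerCreated(w) =>
        val next = copy(workers = workers + (w -> None))
        (next, if (queue.isEmpty) Nil else List((w, WorkIsReady)))
      case WorkerRequestsWork(w) =>
        (workers.get(w), queue.dequeueOption) match {
          case (Some(None), Some((work, rest))) =>
            (copy(queue = rest, workers = workers + (w -> Some(work))), List((w, WorkToBeDone(work))))
          case _ => (this, Nil) // unknown worker, busy worker, or nothing queued
        }
      case WorkIsDone(w) =>
        (if (workers.contains(w)) copy(workers = workers + (w -> None)) else this, Nil)
      case WorkerLost(w) =>
        // Put the lost worker's unfinished work back so another worker can take it.
        val requeued = workers.get(w).flatten.fold(queue)(queue :+ _)
        val next = copy(queue = requeued, workers = workers - w)
        (next, next.notifyIdle)
    }
  }
}
```

Because the Master only hands out work when a Worker asks for it, slow machines are never overloaded, and re-queuing on `WorkerLost` gives a simple form of fault tolerance.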

Slide 54

Slide 54 text

Design Decision
• Akka is capable of both remote lookup and remote deployment
• Remote deployment
  • Master becomes connected to Worker automatically
• Remote lookup
  • Workers can be added/killed at runtime
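With remote lookup, a Worker that starts on its own has to find the Master by address. As a rough illustration, the helper below builds such an address in the `akka.tcp://system@host:port/user/name` form used by Akka 2.2-era remoting; other Akka versions and transports use a different scheme. The helper name and the system, host, port and actor names in the usage note are placeholders, not values from the talk.

```scala
object RemoteLookupSketch {
  /** Address of a top-level actor in a remote ActorSystem (Akka 2.2-style `akka.tcp` scheme). */
  def remotePath(system: String, host: String, port: Int, actorName: String): String =
    s"akka.tcp://$system@$host:$port/user/$actorName"

  // Inside a Worker, the resulting string could be passed to
  // context.actorSelection(...) to reach the Master and announce the new Worker.
}
```

For example, `RemoteLookupSketch.remotePath("sabre", "10.0.0.1", 2552, "master")` yields `akka.tcp://sabre@10.0.0.1:2552/user/master`. Since each Worker registers itself this way, Workers can join or leave while a job is running.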

Slide 55

Slide 55 text

High-Level Design
http://letitcrash.com/post/29044669086/balancing-workload-across-nodes-with-akka-2

Slide 57

Slide 57 text

Master

Slide 67

Slide 67 text

Worker

Slide 81

Slide 81 text

Sabre

Slide 82

Slide 82 text

Application
[Sequence diagram, built up step by step over slides 82 to 97. Participants: Application, Sabre, Master, ResultHandler, and several Workers.]
Calls and messages, in the order they appear:
• Sabre.execute()
• system.actorOf (twice)
• WorkerCreated
• DoAlgorithm
• WorkIsReady
• WorkerRequestsWork
• WorkToBeDone
• HandleResult
• WorkComplete
• WorkIsDone

Slide 101

Slide 101 text

Future Work
• Typed channels
• Akka Clustering
• Typesafe Developer Console

Slide 102

Slide 102 text

Live Demo!

Slide 104

Slide 104 text

EOF
@adelbertchang
[email protected]
Questions?