Slide 1

Slide 1 text

Taming Graph Dynamics at Scale
Felix Cuadrado (@felixcuadrado)
Queen Mary University of London
Joint work with Luis Vaquero, Dyonisios Logothetis, Claudio Martella
Data Science London Meetup, 10th December 2014

Slide 2

Slide 2 text

Graphs model real-world sources

Slide 3

Slide 3 text

Large-scale graph processing
• Distributed partitions: parallel computation
• Communication between partitions: cut edges
• Partitioning quality ∝ performance
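To make the link between cut edges and partitioning quality concrete, here is a minimal sketch that counts the fraction of edges whose endpoints land in different partitions. The class name `CutRatio`, the modulo-based hash placement, and the toy edge list are assumptions made for illustration; they are not taken from the talk.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: partitioning quality measured as the fraction of cut edges. */
class CutRatio {
    /** Assumed hash placement: vertex id modulo the number of partitions. */
    static int hashPartition(long vertexId, int numPartitions) {
        return (int) Math.floorMod(vertexId, (long) numPartitions);
    }

    /** Fraction of edges whose two endpoints are placed in different partitions. */
    static double cutRatio(List<long[]> edges, int numPartitions) {
        if (edges.isEmpty()) {
            return 0.0;
        }
        int cut = 0;
        for (long[] e : edges) {
            if (hashPartition(e[0], numPartitions) != hashPartition(e[1], numPartitions)) {
                cut++; // this edge needs communication between partitions
            }
        }
        return (double) cut / edges.size();
    }

    public static void main(String[] args) {
        // Toy 4-cycle: 0-1, 0-2, 1-3, 2-3
        List<long[]> edges = Arrays.asList(
                new long[] {0, 1}, new long[] {0, 2}, new long[] {1, 3}, new long[] {2, 3});
        System.out.println("ratio of cuts = " + cutRatio(edges, 2));
    }
}
```

A lower ratio means fewer messages cross partition boundaries, which is the sense in which quality and performance are related on this slide.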

Slide 4

Slide 4 text

Pregel’s node-centric processing model
[Diagram: BSP SYNC BARRIER; Compute, Send Messages; Receive Messages; BSP SYNC BARRIER]
GraphX [JPDC 94] [SIGMOD 10] [VLDB 12] [OSDI 14]

Slide 5

Slide 5 text

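Example: PageRank in Apache Giraph

    // Generic type parameters restored (they were lost in extraction); they follow
    // Giraph's SimplePageRankComputation example. MAX_SUPERSTEPS is a class constant.
    @Override
    public void compute(
        Vertex<LongWritable, DoubleWritable, FloatWritable> vertex,
        Iterable<DoubleWritable> messages) throws IOException {
      if (getSuperstep() >= 1) {
        // Sum the rank contributions received from neighbours.
        double sum = 0;
        for (DoubleWritable message : messages) {
          sum += message.get();
        }
        DoubleWritable vertexValue =
            new DoubleWritable((0.15f / getTotalNumVertices()) + 0.85f * sum);
        vertex.setValue(vertexValue);
      }
      if (getSuperstep() < MAX_SUPERSTEPS) {
        // Spread the current rank evenly over the outgoing edges.
        long edges = vertex.getNumEdges();
        sendMessageToAllEdges(vertex,
            new DoubleWritable(vertex.getValue().get() / edges));
      } else {
        vertex.voteToHalt();
      }
    }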

Slide 6

Slide 6 text

Real-world graphs are dynamic

• Slowly growing
• Long-life structures
• Friendship graphs
• The Web graph
• Road network

• Rapidly evolving
• Streams of events/messages
• Calls, interactions, mobile proximity
• Relationships decay/become stale over time

Slide 7

Slide 7 text

Dynamic distributed graph processing
[Diagram labels: BSP Graph Processing; Dynamic Graph API]

Slide 8

Slide 8 text

Dynamic graph processing model
[Diagram: BSP SYNC BARRIER; Update Graph; BSP SYNC BARRIER; Compute, Send Messages; Receive Messages]
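The model above can be sketched on a single machine: buffered graph updates are applied between barriers, then every vertex computes on the messages from the previous superstep and sends new ones. Everything here is an illustrative assumption rather than the system described in the talk: the class `DynamicBsp`, the `scheduleEdge` helper, and the toy "propagate the largest vertex id" computation standing in for a real algorithm.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical single-process sketch: BSP supersteps with a graph-update phase. */
class DynamicBsp {
    private final Map<Long, Set<Long>> adj = new HashMap<>();            // undirected adjacency
    private final Map<Long, Long> value = new HashMap<>();               // state: largest id seen
    private final Map<Integer, List<long[]>> updates = new HashMap<>();  // insertions per superstep

    void addEdge(long a, long b) {
        adj.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adj.computeIfAbsent(b, k -> new HashSet<>()).add(a);
        value.putIfAbsent(a, a);
        value.putIfAbsent(b, b);
    }

    /** Buffers an edge insertion, applied at the start of the given superstep. */
    void scheduleEdge(int superstep, long a, long b) {
        updates.computeIfAbsent(superstep, k -> new ArrayList<>()).add(new long[] {a, b});
    }

    /** Runs a fixed number of supersteps and returns the vertex values. */
    Map<Long, Long> run(int supersteps) {
        Map<Long, List<Long>> inbox = new HashMap<>();
        for (int step = 0; step < supersteps; step++) {
            // Update phase: the topology only changes between barriers.
            for (long[] e : updates.getOrDefault(step, Collections.emptyList())) {
                addEdge(e[0], e[1]);
            }
            // Compute phase: read last superstep's messages, send new ones.
            Map<Long, List<Long>> outbox = new HashMap<>();
            for (Map.Entry<Long, Set<Long>> entry : adj.entrySet()) {
                long v = entry.getKey();
                long best = value.get(v);
                for (long m : inbox.getOrDefault(v, Collections.emptyList())) {
                    best = Math.max(best, m);
                }
                value.put(v, best);
                for (long n : entry.getValue()) {
                    outbox.computeIfAbsent(n, k -> new ArrayList<>()).add(best);
                }
            }
            // Barrier: messages sent in this superstep become visible in the next one.
            inbox = outbox;
        }
        return value;
    }
}
```

Applying updates only at the barrier means no vertex sees a half-changed topology during a superstep, which is the property the extra phase in the diagram provides.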

Slide 9

Slide 9 text

Partitioning dynamic graphs
Partition quality ∝ execution performance

Slide 10

Slide 10 text

Partitioning dynamic graphs
Interaction graph from 1 month of mobile calls data (CDR); inactive links expire in 1 week
Partition quality: ratio of edges cut between partitions
[Plot: ratio of cuts (0.4 to 0.8) vs. time elapsed (0 to 30 days); series: Hash Partitioning, Deterministic Greedy Stream Partitioning [KDD12]]
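For readers unfamiliar with greedy stream partitioning, the following is a rough sketch of the general idea: each vertex arriving in the stream goes to the partition that already holds most of its known neighbours. It is not the exact heuristic evaluated in [KDD12]; the unweighted score, the fixed per-partition capacity, the least-loaded tie-break, and the class name `GreedyStreamPartitioner` are all assumptions.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of an unweighted greedy streaming partitioner. */
class GreedyStreamPartitioner {
    private final int capacity;                                   // assumed max vertices per partition
    private final int[] load;                                     // vertices placed per partition
    private final Map<Long, Integer> assignment = new HashMap<>();

    GreedyStreamPartitioner(int numPartitions, int capacity) {
        this.capacity = capacity;
        this.load = new int[numPartitions];
    }

    /** Places a vertex given the neighbours known when it arrives; returns its partition. */
    int place(long vertex, Collection<Long> neighbours) {
        // Score each partition by how many already-placed neighbours it holds.
        int[] score = new int[load.length];
        for (long n : neighbours) {
            Integer p = assignment.get(n);
            if (p != null) {
                score[p]++;
            }
        }
        // Pick the best-scoring partition with room left; ties go to the least loaded.
        int best = -1;
        for (int p = 0; p < load.length; p++) {
            if (load[p] >= capacity) {
                continue;
            }
            if (best < 0 || score[p] > score[best]
                    || (score[p] == score[best] && load[p] < load[best])) {
                best = p;
            }
        }
        if (best < 0) {
            throw new IllegalStateException("all partitions are full");
        }
        assignment.put(vertex, best);
        load[best]++;
        return best;
    }
}
```

A placement made this way is never revisited, so its quality reflects the graph as it looked when each vertex arrived.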

Slide 11

Slide 11 text

Approach: Adaptive graph partitioning [ICDCS 14]
• Migrate where your neighbors are
• Iterative algorithm
• Migration quotas
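One iteration of "migrate where your neighbors are" with quotas might look like the sketch below. It is a simplified reading of the three bullet points, not the algorithm from [ICDCS 14]: the class `AdaptiveMigration`, the per-destination quota, the sorted visiting order, and the strict-majority rule are assumptions. The sketch has no safeguard against two neighbouring vertices swapping places in the same iteration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: one iteration of neighbour-driven vertex migration. */
class AdaptiveMigration {
    /** Builds an undirected adjacency map from an edge array (illustrative helper). */
    static Map<Long, Set<Long>> undirected(long[][] edges) {
        Map<Long, Set<Long>> adj = new HashMap<>();
        for (long[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        return adj;
    }

    /**
     * Moves vertices towards the partition holding most of their neighbours.
     * Decisions use a snapshot taken at the start of the iteration; each partition
     * accepts at most quotaPerPartition incoming vertices. Returns the number moved.
     */
    static int iterate(Map<Long, Set<Long>> adj, Map<Long, Integer> assignment,
                       int numPartitions, int quotaPerPartition) {
        Map<Long, Integer> snapshot = new HashMap<>(assignment);
        int[] admitted = new int[numPartitions];
        List<Long> order = new ArrayList<>(adj.keySet());
        Collections.sort(order); // deterministic order, for reproducibility only
        int moved = 0;
        for (long v : order) {
            int[] count = new int[numPartitions];
            for (long n : adj.get(v)) {
                count[snapshot.get(n)]++;
            }
            int current = snapshot.get(v);
            int target = current;
            for (int p = 0; p < numPartitions; p++) {
                if (count[p] > count[target]) {
                    target = p; // strictly more neighbours live in p
                }
            }
            if (target != current && admitted[target] < quotaPerPartition) {
                assignment.put(v, target);
                admitted[target]++;
                moved++;
            }
        }
        return moved;
    }
}
```

Calling `iterate` repeatedly until it returns 0 gives the iterative behaviour; the quota bounds how much data moves in any single step.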

Slide 12

Slide 12 text

Adaptive migration integration
[Diagram: BSP SYNC BARRIER; Compute, Messaging; Update Graph; BSP SYNC BARRIER; Migrate; Decide Migrations]

Slide 13

Slide 13 text

Adapting mobile call graphs
Dataset: 1 month of mobile calls
• 21 million unique nodes
• 7% addition, 4% deletion each week
• Sliding window of one week
Algorithm: Maximum clique computation
[Plot: Quality of partitioning. Ratio of cuts (0.4 to 0.8) vs. time elapsed (0 to 30 days); series: Hash partitioning, Adaptive partitioning]
[Plot: Maximum clique performance. Throughput (queries per hour, 0 to 10) per week (wk1 to wk4); series: adaptive, hash]
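The one-week sliding window can be pictured as an edge set in which every edge remembers its last activity and is dropped once that activity falls outside the window. The sketch below is only an illustration of that bookkeeping; the class `SlidingWindowGraph`, its method names, and the use of plain day numbers as timestamps are assumptions.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch: a directed edge set whose inactive edges expire. */
class SlidingWindowGraph {
    private final long window;                                        // e.g. 7 (days)
    private final Map<Long, Map<Long, Long>> lastActive = new HashMap<>(); // src -> dst -> time

    SlidingWindowGraph(long window) {
        this.window = window;
    }

    /** Records an interaction (for example a call) from src to dst at the given time. */
    void touch(long src, long dst, long time) {
        lastActive.computeIfAbsent(src, k -> new HashMap<>()).merge(dst, time, Long::max);
    }

    /** Removes edges inactive for longer than the window; returns how many were removed. */
    int expire(long now) {
        int removed = 0;
        Iterator<Map<Long, Long>> outer = lastActive.values().iterator();
        while (outer.hasNext()) {
            Map<Long, Long> targets = outer.next();
            int before = targets.size();
            targets.values().removeIf(t -> now - t > window);
            removed += before - targets.size();
            if (targets.isEmpty()) {
                outer.remove(); // drop sources with no remaining edges
            }
        }
        return removed;
    }

    /** Number of edges currently inside the window. */
    int edgeCount() {
        int total = 0;
        for (Map<Long, Long> targets : lastActive.values()) {
            total += targets.size();
        }
        return total;
    }
}
```

Each call to `expire` produces a batch of deletions, and each `touch` on a new pair produces an addition, which is the kind of weekly churn the dataset description refers to.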

Slide 14

Slide 14 text

Adapting real-time social graphs
Dataset: one week of tweets published from London in 2012
Algorithm: TunkRank (user influence metric)
[Plot: average tweets per second (0 to 50) and superstep time in seconds (0 to 5) vs. time (0 to 24 h); series: Tweets per second, Hash superstep time, Adaptive superstep time]

Slide 15

Slide 15 text

Lessons learnt
• Initial partitioning is not that important with dynamic graphs
• Adaptive partitioning / repartitioning might be needed
• >50% performance improvement on dynamic graphs
• Partitioning overhead should be considered
• Smarter partition strategies might not be practical
• Migrations / repartitions might not be worth it
• BSP aids system optimisations
• Message aggregation (migration & computation)