Time-Evolving Graph Processing at Scale

Anand Iyer

June 24, 2016

Transcript

  1. Time-Evolving Graph Processing at Scale

    Anand Iyer (UC Berkeley), Li Erran Li (Uber Technologies), Tathagata Das (Databricks), Ion Stoica (UC Berkeley, Databricks)
  2. Motivation Dynamically evolving graphs prevalent in many domains – Social

    networks (e.g., Twitter, Facebook) – Communication networks (e.g., cellular networks) – Internet-of-Things
  3. Motivation Many applications need to leverage the evolution characteristics –

    Product recommendations – Network troubleshooting – Real-time ad placement
  4. Motivation Lots of interest in distributed graph processing… – GraphX,

    Giraph, PowerGraph, GraphLab, GraphChi, Chaos, … but existing graph processing engines offer little support for dynamic graphs – Some specialized systems exist (e.g., Kineograph, Chronos), but they are not generic enough
  5. Challenges • Consistent & fault-tolerant snapshot generation • Co-ordinate snapshot

    generation and computation • Window operations on snapshots • Mix data and graph parallel computations Existing solutions do not satisfy all the requirements
  6. GraphTau: Abstraction, Computational Model

    [Figure: preview of the two computational models. Pause-shift-resume panels: Iteration N, Pause & Shift, Continue from N. Online rectification panel: Use vertex state. Per-vertex values are not recoverable from the transcript.]
  7. GraphTau

    GraphTau represents time-evolving graphs as a series of consistent graph snapshots. [Figure: three snapshots of a small graph at times t1, t2, t3]
  8. New Computational Models

    Two new models for processing time-evolving graphs: Pause-Shift-Resume and Online Rectification
  9. Pause-Shift-Resume

    Many graph algorithms are robust to changes in the graph before convergence. E.g., PageRank: pause iterating, update the snapshot, continue iterating. [Figure: three panels labeled Iteration N, Pause & Shift, Continue from N; per-vertex PageRank values are not recoverable from the transcript]
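
The pause, shift, resume cycle from slide 9 can be sketched in plain Scala. This is a minimal illustration, not GraphTau's implementation: `Snapshot`, `step`, `iterate`, and `shift` are hypothetical names, the graph is a simple adjacency map in which every vertex appears as a key, and the update rule is the `0.15 + 0.85 * msgSum` form used in the deck's PageRank listing. Giving newly arrived vertices a starting rank of 1.0 is also an assumption.

```scala
// Hypothetical snapshot type: vertex -> set of out-neighbours.
// Assumption: every vertex appears as a key (possibly with an empty set).
type Snapshot = Map[String, Set[String]]

// One synchronous PageRank step: rank = 0.15 + 0.85 * sum of incoming contributions,
// where each source splits its rank evenly over its out-edges.
def step(g: Snapshot, ranks: Map[String, Double]): Map[String, Double] = {
  val incoming: Map[String, Double] =
    g.toSeq
      .flatMap { case (src, dsts) =>
        dsts.toSeq.map(dst => dst -> (ranks(src) / dsts.size))
      }
      .groupBy(_._1)
      .map { case (v, msgs) => v -> msgs.map(_._2).sum }
  g.keys.map(v => v -> (0.15 + 0.85 * incoming.getOrElse(v, 0.0))).toMap
}

// Run n steps on a fixed snapshot ("iterate until paused").
def iterate(g: Snapshot, ranks: Map[String, Double], n: Int): Map[String, Double] =
  (1 to n).foldLeft(ranks)((r, _) => step(g, r))

// "Shift": carry ranks over to the next snapshot. Vertices that disappeared are
// dropped; new vertices start at 1.0 (an assumption made for this sketch).
def shift(next: Snapshot, ranks: Map[String, Double]): Map[String, Double] =
  next.keys.map(v => v -> ranks.getOrElse(v, 1.0)).toMap
```

Resuming is then `iterate(next, shift(next, ranksSoFar), n)`: the computation continues on the new snapshot from the ranks reached on the old one instead of restarting from uniform ranks.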
  10. Pause-Shift-Resume

    [Figure: a graph on vertices A to F before and after a transition to a second graph. Vertex labels (X, Y): X is the 10-iteration PageRank, Y is the 23-iteration PageRank. Label values before the transition: (0.977, 0.968), (0.571, 0.556), (2.33, 2.39); values after the transition: 1.224, 0.849, 0.502, 2.07. The assignment of values to vertices is not recoverable from the transcript.] After 11 iterations on graph 2, both converge to 3-digit precision.
  11. Online Rectification Model

    Many graph algorithms are not resilient to changes, so per-vertex state must be kept to handle them. Connected components on an evolving graph can be computed if each vertex stores its component. [Figure: an evolving graph on vertices a to e, annotated "Use vertex state"]
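
The idea of keeping a component label as per-vertex state can be sketched as follows. This is a minimal illustration under stated assumptions, not the deck's algorithm: `Components`, `addVertex`, and `addEdge` are hypothetical names, vertices are plain `Int` ids, and only vertex and edge additions are handled. Edge deletions, which can split a component, need extra work that this sketch does not attempt.

```scala
// Per-vertex state: vertex id -> component label.
// In this sketch the label is the smallest vertex id in the component.
type Components = Map[Int, Int]

// A new vertex starts in its own component.
def addVertex(state: Components, v: Int): Components =
  if (state.contains(v)) state else state + (v -> v)

// A new edge either joins two vertices already in the same component (no change)
// or merges two components, in which case the larger label is rewritten
// to the smaller one using only the stored per-vertex state.
def addEdge(state: Components, u: Int, v: Int): Components = {
  val s = addVertex(addVertex(state, u), v)
  val (lu, lv) = (s(u), s(v))
  if (lu == lv) s
  else {
    val (keep, drop) = (math.min(lu, lv), math.max(lu, lv))
    s.map { case (x, label) => x -> (if (label == drop) keep else label) }
  }
}
```

Folding a stream of new edges through `addEdge` keeps the labels current without recomputing components from scratch on every snapshot.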
  12. Abstraction

    GraphStream[V,E]: represents a series of Graph[V,E] snapshots, where V = vertices, E = edges. [Figure: a GraphStream[V,E] drawn as four snapshots, Graph[V,E] @ T = 1 through Graph[V,E] @ T = 4]
  13. Operations: transform

    class GraphStream { def transform(func: Graph => Graph): GraphStream } func: user-provided function that performs bulk operations on vertices and edges to create a new graph; allows aggregations over vertices and edges. transform: applies func to each snapshot Graph in a GraphStream.
  14. Operations: transform

    class GraphStream { def transform(func: Graph => Graph): GraphStream } [Figure: func applied to each snapshot (T = 1 to T = 4) of the original GraphStream, producing the transformed GraphStream]
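
A minimal sketch of what `transform` does, assuming a stream can be modeled as an in-memory sequence of snapshots. The `Graph` and `GraphStream` case classes below are hypothetical stand-ins written for this example, not the GraphX or GraphTau classes, and `withOutDegree` is an illustrative `func`.

```scala
// Hypothetical stand-ins: vertices keyed by id, edges as (src, dst, attribute).
final case class Graph[V, E](vertices: Map[Long, V], edges: Seq[(Long, Long, E)])

final case class GraphStream[V, E](snapshots: Seq[Graph[V, E]]) {
  // Apply func to every snapshot independently, preserving snapshot order.
  def transform[V2, E2](func: Graph[V, E] => Graph[V2, E2]): GraphStream[V2, E2] =
    GraphStream(snapshots.map(func))
}

// Example func: a bulk operation that replaces each vertex attribute
// with the vertex's out-degree (an aggregation over edges).
def withOutDegree[V, E](g: Graph[V, E]): Graph[Int, E] = {
  val deg = g.edges.groupBy(_._1).map { case (src, es) => src -> es.size }
  Graph(g.vertices.map { case (id, _) => id -> deg.getOrElse(id, 0) }, g.edges)
}
```

For example, `stream.transform(g => withOutDegree(g))` yields a stream with the same number of snapshots, each carrying out-degrees as vertex attributes.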
  15. Operations: sliding windows

    class GraphStream { def mergeWindows( aggregationFuncs, windowLength, slidingInterval): GraphStream } [Figure: snapshots T = 1 to T = 4 of the original GraphStream merged into a windowed GraphStream; annotations: aggregationFuncs, windowLen, slidingInterval]
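
The windowing behavior can be sketched with the standard collection method `sliding`. This is a simplification under stated assumptions: a snapshot is modeled as a bare edge set, window length and sliding interval are counted in snapshots rather than time units, and a single `merge` function stands in for the slide's `aggregationFuncs`, whose exact shape the deck does not show.

```scala
// Hypothetical snapshot type for this sketch: a set of directed edges.
type Edge = (Long, Long)
type Snapshot = Set[Edge]

// Merge every `windowLength` consecutive snapshots into one graph,
// advancing the window by `slidingInterval` snapshots each time.
def mergeWindows(
    snapshots: Seq[Snapshot],
    merge: (Snapshot, Snapshot) => Snapshot,
    windowLength: Int,
    slidingInterval: Int
): Seq[Snapshot] =
  snapshots
    .sliding(windowLength, slidingInterval)
    .filter(_.size == windowLength) // drop a trailing partial window
    .map(_.reduce(merge))
    .toSeq
```

With four snapshots, `mergeWindows(snaps, _ ++ _, 2, 1)` produces three merged graphs, one per pair of adjacent snapshots.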
  16. Differential Computation

    Pause-shift-resume and Online Rectification are incorporated into an efficient Pregel-style computation. Effectively an extension of the Pregel iterative processing model to time-evolving graphs.
  17. Operations: StreamingBSP

    class GraphStream { def StreamingBSP(..., iterationFunc, ...): GraphStream } Apply the Pregel iterationFunc until the next snapshot is available. Combine previous results with the new snapshot and continue iterating. Continue until convergence. [Figure: a GraphStream with snapshots at T = 1, T = 2, T = 3]
  18. PageRank using StreamingBSP

    PageRank computation on streaming graphs is easily achieved by a simple call:

    def pageRankEvolGraph(gs: GraphStream) = {
      // Vertex program: standard PageRank update from the summed incoming messages.
      def vprog(v: VertexId, msgSum: Double) = 0.15 + 0.85 * msgSum
      gs.StreamingBSP(1, 100, EdgeDirection.Out, "10s")(vprog,
        triplet => triplet.src.pr / triplet.src.outDeg,
        (msgA, msgB) => msgA + msgB)
    }

    Listing 3: Page Rank Computation on Time-Evolving Graphs

    Faster convergence than running PageRank from scratch on every snapshot
  19. Operations: updateLocalState

    class GraphStream { def updateLocalState (stateUpdateFunc, initialState): LocalStateStream } Keep updating non-graph "state" as the graph evolves. [Figure: initialState fed through stateUpdateFunc at each GraphStream snapshot, T = 1, T = 2, T = 3]
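
State tracking of this kind behaves like a running fold over the snapshots. The sketch below assumes a stream can be modeled as an in-memory sequence and a snapshot as a bare edge set; the two-argument shape of `stateUpdateFunc` (previous state, current snapshot) is an assumption, since the slide does not show its signature.

```scala
// Hypothetical snapshot type for this sketch: a set of directed edges.
type Snapshot = Set[(Long, Long)]

// Thread a non-graph state value through the snapshots in order and
// emit the state after each snapshot (the initial state itself is not emitted).
def updateLocalState[S](snapshots: Seq[Snapshot], initialState: S)(
    stateUpdateFunc: (S, Snapshot) => S
): Seq[S] =
  snapshots.scanLeft(initialState)(stateUpdateFunc).tail
```

For instance, `updateLocalState(snaps, 0)((m, g) => math.max(m, g.size))` tracks the largest edge count seen so far.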
  20. Implementation Implemented on Apache Spark platform - Spark Streaming: stream

    processing engine - GraphX: graph processing engine. GraphTau is implemented by combining Spark Streaming and GraphX - Novel optimizations to implement the GraphStream abstraction
  21. Other Benefits

    Spark Streaming and GraphX are built on Spark's RDDs. RDDs guarantee fault tolerance and consistency of datasets. In addition, this allows mixing data-parallel and graph-parallel computations in GraphStream.
  22. Preliminary Results • Algorithms: – PageRank – Connected Components •

    Setup: 16 Amazon EC2 instances • Datasets: – Twitter follow graph: 41M vertices, ~1.5B edges – Live LTE network: 2M vertices, variable edges
  23. Preliminary Results: PageRank Dataset: Twitter Graph broken into parts:

    - 1 part = full graph - 5 parts = 20% of graph in each part Comparison: - Time to complete PageRank in GraphX on full graph - Time to complete streaming PageRank in GraphTau when the graph is streamed in parts
  24. Preliminary Results: PageRank

    [Plot: axes and data not recoverable from the transcript] GraphX on the whole graph could not converge! GraphTau converged fast when 20% of the graph is streamed at a time. Smaller batches lead to faster convergence.
  25. Preliminary Results: CellIQ

    CellIQ (NSDI 2015): prior work - Detection of persistent hotspots using incremental connected components - Built a specialized system to do temporal analysis. Re-implemented on the general-purpose system GraphTau - Uses mergeByWindow for sliding window analysis - Strawman (baseline) runs non-incremental connected components on the whole window of snapshots
  26. Preliminary Results: CellIQ

    [Plot: Analysis Time (s) against Window Size (m) for Strawman, GraphTau, and CellIQ; data points not recoverable from the transcript] GraphTau achieved performance comparable to the specialized system, without domain-specific optimizations.
  27. Takeaways

    GraphTau: general-purpose processing engine for time-evolving graphs. GraphStream abstraction that provides: – Consistent & fault-tolerant snapshot generation – Co-ordinated snapshotting and computation – Sliding window operations – Mixing of data-parallel and graph-parallel computations