
Declarative, Sliding Window Computations at the Edge


EdgeCom 2016

Christopher Meiklejohn

January 09, 2016

Transcript

  1. Declarative Sliding-Window Aggregations For Computations at the Edge. Christopher Meiklejohn, Machine Zone, Inc.; Peter Van Roy, Seyed H. Haeri (Hossein), Université catholique de Louvain. EdgeCom 2016, January 9th, 2016.
  2. What is Edge Computation?
  3–7. Edge Computation
     • Logical extremes of the network: applications, data, and computation
     • Especially challenging where synchronization is hard
     • “Internet of Things”: low power, limited memory and connectivity
     • Mobile applications: offline operation with replicated, shared state
     • How should we manage events generated at the device?
  8–11. Traditional Approaches
     • Centralized computation (D-Streams, Storm, Summingbird): stream all events to a centralized location for processing
     • Most general approach, but expensive: events must be buffered while devices are offline, and operating the antenna has significant power requirements
     • Design a distributed algorithm (Directed/Digest Diffusion, TAG): an algorithm optimized for program dissemination and collection of results
     • Least general, but efficient: the algorithm can be designed specifically to address unordered delivery and optimized for minimal state transmission
  12. Can we design a general programming model for efficient distributed computation that can tolerate message delays, reordering, and duplication?
  13–15. Contributions
     • Extend previous work on “Lattice Processing” (Lasp): a declarative, functional programming model over distributed data structures (CRDTs)
     • Extend the model with new data structures: the Pair and the Bounded-LWW-Set
     • Extend the model with dynamic scope: “dynamic” variables, where each node holds its own value for a given variable, which can be aggregated with a “dynamic” fold operation
  16. Background: Conflict-Free Replicated Data Types (SSS 2011)

  17–18. Conflict-Free Replicated Data Types
     • Collection of types: sets, counters, registers, flags, maps
     • Strong Eventual Consistency (SEC): objects that receive the same updates, regardless of order, reach equivalent state
  19–23. Example: an Observed-Remove Set across three replicas RA, RB, RC
     • RA: add(1) → value {1}, state (1, {a}, {})
     • RC: add(1) → value {1}, state (1, {c}, {})
     • RC: remove(1) → value {}, state (1, {c}, {c}); the remove tombstones only the add tag RC has observed (c)
     • After merging, every replica converges to value {1} with state (1, {a, c}, {c}): RA’s add tag a was never removed, so the element survives
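
  To make the trace concrete, the following is a minimal Erlang sketch of an Observed-Remove Set (uses maps:merge_with/3, OTP 24+). It is illustrative only and is not the CRDT implementation Lasp builds on; the module name, state representation, and tagging scheme are assumptions.

     %% Illustrative Observed-Remove Set sketch (not Lasp’s implementation).
     %% State: Element => {AddTags, RemoveTags}, each a set of unique tags.
     -module(orset_sketch).
     -export([new/0, add/3, remove/2, merge/2, value/1]).

     new() -> #{}.

     %% Adding tags the element with a fresh, replica-unique tag.
     add(Element, Actor, Set) ->
         {Adds, Removes} = maps:get(Element, Set, {sets:new(), sets:new()}),
         Tag = {Actor, make_ref()},
         maps:put(Element, {sets:add_element(Tag, Adds), Removes}, Set).

     %% Removing tombstones only the add tags this replica has observed.
     remove(Element, Set) ->
         case maps:find(Element, Set) of
             {ok, {Adds, Removes}} ->
                 maps:put(Element, {Adds, sets:union(Adds, Removes)}, Set);
             error ->
                 Set
         end.

     %% Merge is a pairwise union of add and remove tags per element.
     merge(SetA, SetB) ->
         maps:merge_with(fun(_E, {A1, R1}, {A2, R2}) ->
                                 {sets:union(A1, A2), sets:union(R1, R2)}
                         end, SetA, SetB).

     %% An element is present if it has at least one unremoved add tag.
     value(Set) ->
         [E || {E, {Adds, Removes}} <- maps:to_list(Set),
               sets:size(sets:subtract(Adds, Removes)) > 0].

  Because the tag added at RA is never observed (and hence never tombstoned) by RC’s remove, the merged value is {1}, matching the final state on the slide above.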
  24. Background: Lattice Processing
  25–27. Lattice Processing (Lasp)
     • Distributed, deterministic dataflow: a programming model for “eventually consistent” computations
     • Convergent data structures: the primary data abstraction is the CRDT
     • Enables composition: provides functional composition of CRDTs that preserves the SEC property
  28–32. Lasp code example: mapping one set into another

     %% Create initial set.
     S1 = declare(set),

     %% Add elements to initial set and update.
     update(S1, {add, [1,2,3]}),

     %% Create second set.
     S2 = declare(set),

     %% Apply map operation between S1 and S2.
     map(S1, fun(X) -> X * 2 end, S2).
  33–34. Lattice Processing (Lasp)
     • Functional and set-theoretic operations on sets: product, intersection, union, filter, map, fold
     • Metadata computation: performs the transformation on the internal metadata of CRDTs, allowing creation of “composed” CRDTs
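
  As an illustration of how these operations compose, the following sketch chains filter and union in the same declare/update style as the map example above. The call shapes mirror the slides; the exact arities and variable names are assumptions rather than a definitive API.

     %% Illustrative only: composing Lasp set operations. The output of
     %% each operation is itself a CRDT, so the SEC property is preserved
     %% end to end.
     A = declare(set),
     B = declare(set),
     update(A, {add, [1, 2, 3, 4]}),
     update(B, {add, [3, 4, 5]}),

     %% Keep only the even elements of A.
     Evens = declare(set),
     filter(A, fun(X) -> X rem 2 =:= 0 end, Evens),

     %% Union the filtered set with B into a third, derived set.
     Out = declare(set),
     union(Evens, B, Out).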
  35. Example Application: Computing Averages

  36–37. Computing Aggregates
     • Sensors generate events: fold a rolling set of events into a local device average
     • Merge local averages per device: fold the local averages across devices into a global average replicated at each device
  38–43. [Dataflow figure] Sensor1 … SensorN each write Samples (input, user-maintained CRDT); a Fold (Lasp operation) turns each device’s Samples into a Local Avg, and a further Fold combines the Local Avg values into a Global Avg (output, Lasp-maintained CRDT) replicated at every device.
  44–50. Lasp code example: sliding-window average

     %% Define a pair of counters to store the global average.
     GlobalAverage = declare({counter, counter}, global_average),

     %% Declare a dynamic variable: a Bounded-LWW set holding the most recent 100 samples.
     Samples = declare_dynamic({bounded_lww_set, 100}),

     %% Define a local average, computed from the local Bounded-LWW set.
     LocalAverage = declare_dynamic({counter, counter}),

     %% Register an event handler with the sensor that is triggered each
     %% time an event X is generated at a given timestamp T.
     EventHandler = fun({X, T}) ->
         update(Samples, {add, X, T}, Actor)
     end,
     register_event_handler(EventHandler),

     %% Fold samples using the function `avg' into a local average.
     fold(Samples, fun avg/2, LocalAverage),

     %% Fold local averages using the function `sum_pairs' into a global average.
     fold_dynamic(LocalAverage, fun sum_pairs/2, GlobalAverage).
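
  The example references avg/2 and sum_pairs/2 without defining them. A plausible sketch follows, assuming the {counter, counter} pair encodes a running {Sum, Count} so that the average is Sum / Count; these bodies are assumptions, not code from the talk.

     %% Hypothetical fold functions for the example above.

     %% avg/2: fold one sample {Value, _Timestamp} into the {Sum, Count} pair.
     avg({Value, _Timestamp}, {Sum, Count}) ->
         {Sum + Value, Count + 1}.

     %% sum_pairs/2: combine {Sum, Count} pairs arriving from different devices.
     sum_pairs({SumA, CountA}, {SumB, CountB}) ->
         {SumA + SumB, CountA + CountB}.

  Reading the replicated GlobalAverage then amounts to dividing its first component by its second.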
  51. Lasp Extensions: Semantics

  52–55. Bounded LWW Set
     • Bounded “Last-Writer-Wins” Set: same structure as the Observed-Remove Set, but enforces a maximum number of elements
     • Objects tagged with local time: each object in the set is tagged with the local time at insertion
     • Objects marked “removed” when the bound is exceeded: a tombstone marks objects as removed during insertions and merges
     • “Last-Writer-Wins” with a single writer: a single writer removes the non-determinism otherwise inherent in LWW registers with replicated state
  56–59. Trace: Samples (bound 2) fed by the event handler
     • Event 1: value {1}, state {(1, T1, F)}
     • Event 2: value {1, 2}, state {(1, T1, F), (2, T2, F)}
     • Event 3: value {2, 3}, state {(1, T1, T), (2, T2, F), (3, T3, F)}; the oldest element, 1, is tombstoned once the bound of 2 is exceeded
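
  To make the insertion semantics concrete, here is a minimal Erlang sketch of the insert path only (merge omitted). It mirrors the {Element, Timestamp, Removed} triples in the trace above; the module name and representation are assumptions, not Lasp’s implementation.

     %% Illustrative Bounded-LWW-Set insert path; not Lasp’s implementation.
     %% State: {Bound, Entries} with Entries a list of {Element, Timestamp, Removed}.
     -module(bounded_lww_sketch).
     -export([new/1, insert/3, value/1]).

     new(Bound) -> {Bound, []}.

     %% Insert the new element, then tombstone the oldest live entry if the
     %% number of live entries now exceeds the bound (last writer wins).
     insert(Element, Timestamp, {Bound, Entries0}) ->
         Entries1 = [{Element, Timestamp, false} | Entries0],
         Live = [Entry || {_, _, false} = Entry <- Entries1],
         case length(Live) > Bound of
             false ->
                 {Bound, Entries1};
             true ->
                 [{OldE, OldT, false} | _] = lists:keysort(2, Live),
                 Entries2 = [case {E, T} of
                                 {OldE, OldT} -> {E, T, true};
                                 _ -> Entry
                             end || {E, T, _} = Entry <- Entries1],
                 {Bound, Entries2}
         end.

     %% The observable value is the set of live (non-tombstoned) elements.
     value({_Bound, Entries}) ->
         [E || {E, _T, false} <- Entries].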
  60–61. Lasp Fold
     • Fold: computes an aggregate over a set into another type of CRDT
     • Function invariants: the operation must be associative, commutative, and have an inverse operation
  62–65. Trace: folding Samples (bound 2) into Local Avg (a pair of counters)
     • After event 1: Samples value {1}; Local Avg value {1, 1} (sum 1, count 1)
     • After event 2: Samples value {1, 2}; Local Avg value {3, 2} (sum 3, count 2)
     • After event 3: Samples value {2, 3} with element 1 tombstoned; Local Avg value {5, 2} (sum 5, count 2), because the inverse operation subtracts the tombstoned element’s contribution from each counter
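
  This is where the inverse-operation invariant from the previous slide shows up: removing a sample must be expressible as subtracting its contribution. A hypothetical inverse for the avg/2 function sketched earlier:

     %% Hypothetical inverse of avg/2: undo a tombstoned sample’s
     %% contribution to the running {Sum, Count} pair instead of
     %% recomputing the fold over the whole window.
     avg_inv({Value, _Timestamp}, {Sum, Count}) ->
         {Sum - Value, Count - 1}.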
  66–67. Dynamic Scope
     • Variables exist on all nodes: a variable with a given identifier exists at every node, each holding its own value, which is not replicated
     • Dynamic Fold operation: combines the per-node values with a merge across all nodes, through pairwise synchronization, until a fixed point is reached
  68–71. [Dataflow figure] Local Avg → Fold → Global Avg, with the dynamic fold proceeding in three steps: 1.) a “bind” message is received; 2.) the fold operation is re-executed with the new value; 3.) the result is broadcast to peers as a “bind”.
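
  A hedged sketch of those three steps as a single message handler. The message shape, the Merge function, and the broadcast/2 helper are assumptions used for illustration, not Lasp’s actual protocol.

     %% 1.) a “bind” carrying a peer’s folded value arrives;
     %% 2.) the fold is re-executed by merging it into the local value;
     %% 3.) if anything changed, the new value is “bind”-broadcast to peers.
     %% Once no node changes, the network has reached a fixed point.
     handle_bind(PeerValue, Local0, Merge, Peers) ->
         Local1 = Merge(Local0, PeerValue),
         case Local1 =:= Local0 of
             true  -> Local0;
             false -> broadcast(Peers, {bind, Local1}),
                      Local1
         end.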
  72. In Summary

  73–75. Related Work
     • Directed / Digest Diffusion: energy-efficient aggregation and program dissemination, with optimizations when aggregations are monotonic; however, not tolerant of some network anomalies
     • Tiny AGgregation (TAG): a declarative method for data collection across sensors using a SQL-like syntax; however, not a general programming model
     • PVARS: similar to *Lisp parallel variables, where each processor has its own value and an operation can be applied across nodes
  76–78. Future Work
     • Quantitative evaluation: evaluation and optimization of the prototype implementation in Erlang
     • Optimizations to reduce metadata: apply known CRDT optimizations to both the fold operation and the data structures to reduce space complexity
  79. Thanks! Christopher Meiklejohn (@cmeik)