Consistency in Zenoh, an edge data fabric

Sreeja S Nair
April 08, 2022

Presented at RainbowFS Workshop, Paris on 28th March 2022

Transcript

  1. Achieving Eventual Consistency in Zenoh

    Sreeja S. Nair, PhD, Senior Technologist, ZettaScale Technology. RainbowFS workshop, 28/03/2022
  2. Mission

    Bring to every connected human and machine the unconstrained freedom to communicate, compute and store, anywhere, at any scale, effi…
  3. [Diagram: data flow across Produce, Distribute, Store, Compute, Consume and Retrieve stages, linked by Push and Pull interactions]

    Source: Eclipse Foundation, “From DevOps to EdgeOps: A Vision for Edge Computing”, 2021
  4. “Some people want it to happen, some wish it would happen, others make it happen.” – Michael Jordan

    Uni…
  5. Extensions

    Storage plugin with pluggable backends (Filesystem, Memory, …); SSE plugin (HTML5 Server-Sent Events); V2X/R2X connectivity; Administration & monitoring plugin
  6. Topologies

    Clique, Mesh, Brokered, Routed, Peer-to-peer [Diagram: example networks of routers, peers and clients]
  7. Abstractions

    Resource: a named data, in other terms a (key, value) pair, e.g. (/home/kitchen/sensor/temp, 21.5) or (/home/bedroom/sensor/temp, 19)
    Key expression: an expression identifying a set of keys, e.g. /home/kitchen/sensor/temp or /home/kitchen/**
    Publisher: a spring of values for a key expression
    Subscriber: a sink of values for a key expression
    Queryable: a well of values for a key expression
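
A key expression such as /home/kitchen/** selects a set of keys. The minimal Python sketch below shows one way to test a key against an expression, chunk by chunk. It assumes that `*` stands for exactly one chunk and `**` for any number of chunks, including none. The function name `matches` is hypothetical, and this is not Zenoh's own matcher.

```python
def matches(key_expr: str, key: str) -> bool:
    """Return True if `key` belongs to the set identified by `key_expr`.

    Assumed semantics (illustrative only): `*` matches exactly one chunk,
    `**` matches zero or more chunks, anything else must match literally.
    """
    pattern = [c for c in key_expr.split("/") if c]
    chunks = [c for c in key.split("/") if c]

    def walk(p: int, k: int) -> bool:
        if p == len(pattern):
            # Pattern exhausted: succeed only if the key is exhausted too.
            return k == len(chunks)
        if pattern[p] == "**":
            # Let `**` absorb 0, 1, 2, ... of the remaining chunks.
            return any(walk(p + 1, j) for j in range(k, len(chunks) + 1))
        if k == len(chunks):
            return False
        if pattern[p] == "*" or pattern[p] == chunks[k]:
            return walk(p + 1, k + 1)
        return False

    return walk(0, 0)
```

With these assumptions, `matches("/home/kitchen/**", "/home/kitchen/sensor/temp")` holds, while the same expression rejects `/home/bedroom/sensor/temp`.
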
  8. Zenoh in Action

    [Diagram: two publishers, three subscribers (one of them a pull subscriber) and two storages, using the key expressions /louvre/**/temp, /louvre/1/temp, /louvre/2/temp, /louvre/1/** and /louvre/2/**]
  14. Zenoh in Action

    [Diagram: the same publishers, subscribers and storages, plus a querier issuing get /louvre/*/temp]
  18. High throughput (4M msg/s, +40Gb/s)

    Test setup: Pub host and Sub host, Ubuntu 20.04, AMD Ryzen, 32GB RAM, 100Gbps ETH. “One of the things I love about music is live performance.” - Yo-Yo Ma
  19. Replication in Zenoh

    Fault tolerance … [Diagram: network of routers, peers and clients with five publishers and three storages, each storage on /louvre/**]
  20. Replicated storages: Abstract View

    [Diagram: publishers and three storages, each storage on /louvre/**]
  21. Replicated storages: Abstract View

    [Diagram: publishers and three storages, each storage on /louvre/**, with a publication on /louvre/1/temp]
  23. Diverging scenario

    - Network might have partitions [Diagram: network of routers, peers and clients]
  25. Replication now

    A querier issues get /louvre/floor1/temp to three storages on /louvre/**. The storages hold (louvre/1/temp, 201:23), (louvre/1/temp, 201:23) and (louvre/1/temp, 203:22).
  27. Result Consolidation

    A querier issues get /louvre/floor1/temp to three storages on /louvre/**, which return 201:23, 201:23 and 203:22. Results from the storage are shown as timestamp: value. Consolidation modes: NONE, LAZY, FULL
  29. Result Consolidation

    A querier issues get /louvre/floor1/temp to three storages on /louvre/** and receives 203:22. Results from the storage are shown as timestamp: value. Consolidation modes: NONE, LAZY, FULL
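
The step from three replies (201:23, 201:23, 203:22) to a single result (203:22) can be sketched in Python as below. This is an illustration under stated assumptions: FULL is taken to keep only the reply with the latest timestamp for each key, and NONE to pass every reply through. LAZY is not modelled because the slides do not describe it. The function name and the `(key, timestamp, value)` tuple layout are hypothetical.

```python
def consolidate(replies, mode="FULL"):
    """Consolidate query replies given as (key, timestamp, value) tuples.

    Assumed semantics (illustrative only):
      NONE - return every reply unchanged
      FULL - keep only the reply with the latest timestamp per key
    """
    if mode == "NONE":
        return list(replies)
    if mode != "FULL":
        raise ValueError(f"mode {mode!r} is not modelled in this sketch")

    latest = {}
    for key, timestamp, value in replies:
        # Replace the stored reply only when a strictly newer one arrives.
        if key not in latest or timestamp > latest[key][0]:
            latest[key] = (timestamp, value)
    return [(key, ts, value) for key, (ts, value) in sorted(latest.items())]
```

Applied to the three replies on the slide, FULL leaves only the entry with timestamp 203 and value 22.
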
  30. • Multiple publishers and storages for the same data …

    “All models are wrong, but some are useful” - George E. P. Box
  31. Anti-Entropy workflow

    Between Replica A and Replica B: get snapshot; compute digest; publish digest; compare with local snapshot; get missing info; missing info; update datastore. At pre-de…
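
One reading of this workflow is sketched below in Python. The sketch makes several assumptions that are not on the slide: a digest holds one CRC32 checksum per time interval, `DELTA` is a made-up interval length, conflicting writes are resolved by keeping the latest timestamp, and the class and function names are hypothetical. It is not the Zenoh implementation.

```python
import zlib

DELTA = 100  # hypothetical interval length, in timestamp units


def interval_of(timestamp: int) -> int:
    """Index k of the interval (DELTA*(k-1), DELTA*k] containing timestamp."""
    return -(-timestamp // DELTA)  # integer ceiling division


class Replica:
    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def put(self, key, timestamp, value):
        # Keep the entry with the latest timestamp (assumed conflict rule).
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def digest(self):
        """Compute one CRC32 per interval over its sorted (timestamp, key) log."""
        buckets = {}
        for key, (ts, _value) in self.store.items():
            buckets.setdefault(interval_of(ts), []).append((ts, key))
        return {k: zlib.crc32(repr(sorted(log)).encode())
                for k, log in buckets.items()}

    def entries_in(self, intervals):
        """Return the (key, timestamp, value) entries of the given intervals."""
        return [(key, ts, value) for key, (ts, value) in self.store.items()
                if interval_of(ts) in intervals]


def align(local: Replica, remote: Replica) -> set:
    """Compare digests, fetch the misaligned intervals, update the local store."""
    local_digest = local.digest()
    misaligned = {k for k, h in remote.digest().items()
                  if local_digest.get(k) != h}
    for key, ts, value in remote.entries_in(misaligned):
        local.put(key, ts, value)
    return misaligned
```

In this sketch only the small digest is compared at first; full entries are requested for the intervals whose checksums differ.
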
  32. Properties for snapshots

    snapshot_time < current_time − propagation_delay − possible_skew
    possible_skew ≈ UHLC_MAX_DELTA_MS
    propagation_delay ≈ network_diameter
    interval_k = (ORIGIN + Δ × (k − 1), ORIGIN + Δ × k]
    snapshot_time = k × Δ
  33. Properties for snapshots

    Sample settings: …
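
The snapshot properties can be turned into a small calculation. The Python sketch below assumes integer timestamps in milliseconds and assumes the snapshot for interval k is taken at that interval's upper bound, ORIGIN + Δ × k, which reduces to the slide's k × Δ when ORIGIN is 0. Function and parameter names are hypothetical.

```python
def interval_index(timestamp: int, origin: int, delta: int) -> int:
    """Index k such that timestamp lies in (origin + delta*(k-1), origin + delta*k]."""
    return -(-(timestamp - origin) // delta)  # integer ceiling division


def latest_snapshot_interval(current_time: int, origin: int, delta: int,
                             propagation_delay: int, possible_skew: int) -> int:
    """Largest k whose snapshot time origin + delta*k satisfies
    snapshot_time < current_time - propagation_delay - possible_skew."""
    cutoff = current_time - propagation_delay - possible_skew
    # ceil((cutoff - origin) / delta) - 1 enforces the strict inequality.
    return -(-(cutoff - origin) // delta) - 1
```

For example, with `origin=0`, `delta=1000`, `propagation_delay=200` and `possible_skew=100`, a replica at `current_time=10500` may snapshot up to interval 10, because 10000 < 10200 while 11000 is not.
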
  34. Efficient hashing algorithm

    • Use CRC hashes [Benchmark run on Apple M1 processor with 8GB RAM]
  35. Composition of era

    Era: Interval 1 … Interval n. Interval: Subinterval 1 … Subinterval m. Subinterval: Timestamp 1 … Timestamp t. Timestamp: <Operation, Key, Value>
  36. Composition of era

    [Diagram: an era expanded into Interval 1 … Interval m, the subintervals Sub 11 … Sub 1n and the timestamps TS 111 … TS 11t, … TS 1n1 … TS 1nu]
    Era: Interval 1 … Interval n. Interval: Subinterval 1 … Subinterval m. Subinterval: Timestamp 1 … Timestamp t. Timestamp: <Operation, Key, Value>
  37. Compressing the digest

    COLD, WARM, HOT [Diagram: three eras, each shown as Interval 1 … Interval m with subintervals Sub 11 … Sub mn and Snip 11 … Snip mn]
  41. Replica

    [Diagram components: Zenoh, Replica, Storage, Digest publisher, Digest subscriber, Aligner, Align Queryable, Timestamp Log, Digests, Processed Digests]
  44. Digest for a time-series data store

    Compressing digest: COLD, WARM, HOT [Diagram: Interval 1 … Interval m with subintervals Sub 11 … Sub mn and Snap 11 … Snap mn, and a Hash]
  47. Digest for a time-series data store

    Compressing digest: COLD, WARM, HOT [Diagram: Interval 1 … Interval m with subintervals Sub 11 … Sub mn and Snap 11 … Snap mn, with Hash 1 … Hash 4 and Mega hash 1 … Mega hash 4]
  49. Comparing digest creation

    Key value
      Hashing methodology: Subinterval based (COLD, WARM, HOT)
      Hash structure: Shallow tree with 3 levels (COLD, WARM, HOT)
      Storage: Entire tree stored (COLD, WARM, HOT)
      Transmission: COLD: Top level only. WARM: First two levels. HOT: Entire tree
      Misalignment detection:
        COLD: 1- era content, 2- interval content, 3- subinterval content
        WARM: 1- interval content, 2- subinterval content
        HOT: 1- request subinterval content

    Time series
      Hashing methodology: COLD: Hierarchical + incremental. WARM, HOT: Subinterval based
      Hash structure: COLD: Linked list. WARM, HOT: Shallow tree with 3 levels
      Storage: COLD: Entire list stored. WARM, HOT: Entire tree stored
      Transmission: COLD: Head only. WARM: First two levels. HOT: Entire tree
      Misalignment detection:
        COLD: n- list entry, n+1- mega content, n+2- interval content, n+3- subinterval
        WARM: 1- interval content, 2- subinterval content
        HOT: 1- request subinterval content
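
For the key-value case, the three-level tree and its COLD, WARM and HOT transmission levels can be sketched in Python as below. The sketch assumes CRC32 checksums over a simple string encoding, a nested dictionary as the tree, and hypothetical function names. It illustrates the idea of sending less of the tree for colder eras and is not the actual Zenoh digest format.

```python
import zlib


def crc(parts) -> int:
    """CRC32 over a '|'-joined string encoding of the parts (assumed encoding)."""
    return zlib.crc32("|".join(str(p) for p in parts).encode())


def build_era(intervals: dict) -> dict:
    """Build a 3-level hash tree: era -> intervals -> subintervals.

    `intervals` maps interval id -> {subinterval id -> list of timestamps}.
    """
    tree = {"intervals": {}}
    for interval_id, subintervals in sorted(intervals.items()):
        sub_hashes = {sub_id: crc(sorted(timestamps))
                      for sub_id, timestamps in sorted(subintervals.items())}
        tree["intervals"][interval_id] = {
            "hash": crc(sub_hashes.values()),  # interval hash covers its subintervals
            "subintervals": sub_hashes,
        }
    # Era hash covers the interval hashes.
    tree["hash"] = crc(node["hash"] for node in tree["intervals"].values())
    return tree


def transmit(tree: dict, temperature: str) -> dict:
    """Select the part of the tree to send for a COLD, WARM or HOT era."""
    if temperature == "COLD":
        return {"hash": tree["hash"]}  # top level only
    if temperature == "WARM":
        return {"hash": tree["hash"],  # first two levels
                "intervals": {i: {"hash": node["hash"]}
                              for i, node in tree["intervals"].items()}}
    if temperature == "HOT":
        return tree  # entire tree
    raise ValueError(f"unknown temperature {temperature!r}")
```

A receiver that sees a differing COLD hash would then ask for interval content and subinterval content in turn, while a HOT era already carries enough detail to request the affected subinterval directly.
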