CRDTs - The science behind Phoenix Presence

Maciej Kaszubowski

May 25, 2017

Transcript

  1. The Problem

  3. Server Node 1 Server Node 2

  4. Node A Node B [User] [User] [User] [] [] [User]

    User connects User disconnects
  5. There's no global time

  8. Node A Node B

  9. P.track(self, "users", "U1", %{})

  10. P.track(self, "users", "U1", %{})
      P.list("users")
      # => %{"U1" => %{metas: [%{phx_ref: "…"}]}}

  11. P.track(self, "users", "U1", %{})
      P.list("users")
      # => %{"U1" => %{metas: [%{phx_ref: "…"}]}}
      P.track(self, "users", "U2", %{})

  12. P.track(self, "users", "U1", %{})
      P.list("users")
      # => %{"U1" => %{metas: [%{phx_ref: "…"}]}}
      P.track(self, "users", "U2", %{})
      P.list("users")
      # => %{"U1" => %{metas: [%{phx_ref: "…"}]}, "U2" => %{metas: [%{phx_ref: "…"}]}}

  13. P.track(self, "users", "U1", %{})
      P.list("users")
      # => %{"U1" => %{metas: [%{phx_ref: "…"}]}}
      P.track(self, "users", "U2", %{})
      P.list("users")
      # => %{"U1" => %{metas: [%{phx_ref: "…"}]}, "U2" => %{metas: [%{phx_ref: "…"}]}}
      P.track(self, "users", "U1", %{})
      P.untrack(self, "users", "U1")

  14. P.track(self, "users", "U1", %{})
      P.list("users")
      # => %{"U1" => %{metas: [%{phx_ref: "…"}]}}
      P.track(self, "users", "U2", %{})
      P.list("users")
      # => %{"U1" => %{metas: [%{phx_ref: "…"}]}, "U2" => %{metas: [%{phx_ref: "…"}]}}
      P.track(self, "users", "U1", %{})
      P.untrack(self, "users", "U1")
      P.list("users")
      # => %{"U1" => %{metas: [%{phx_ref: "…"}]}, "U2" => %{metas: [%{phx_ref: "…"}]}}
      P.list("users")
      # => %{"U1" => %{metas: [%{phx_ref: "…"}]}, "U2" => %{metas: [%{phx_ref: "…"}]}}
  15. CRDTs - The science behind Phoenix Presence (Maciej Kaszubowski)

  16. Conflict-free Replicated Data Type

  17. Alternatives

  18. • Single source of truth (DB)
      • Consensus algorithm
      • Resolving conflicts manually
  19. Why CRDTs?

  20. • Eventually consistent
      • Highly available
      • Easy to use

  21. • Eventually consistent
      • Highly available
      • Easy to use
      • Hard to create :(
  22. Features

  23. 1. Commutative: $x \cdot y = y \cdot x$
      2. Associative: $(x \cdot y) \cdot z = x \cdot (y \cdot z)$
      3. Idempotent: $x \cdot x = x$
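
The three properties can be checked mechanically for a candidate merge function. The sketch below is only an illustration: `MergeLaws` is a hypothetical helper, not part of Phoenix, and it tests the laws over a handful of sample values rather than proving them.

```elixir
defmodule MergeLaws do
  # Hypothetical helper: checks each law for `merge` over all
  # combinations of the given sample states.
  def commutative?(merge, samples) do
    results = for x <- samples, y <- samples, do: merge.(x, y) == merge.(y, x)
    Enum.all?(results)
  end

  def associative?(merge, samples) do
    results =
      for x <- samples, y <- samples, z <- samples do
        merge.(merge.(x, y), z) == merge.(x, merge.(y, z))
      end

    Enum.all?(results)
  end

  def idempotent?(merge, samples) do
    Enum.all?(samples, fn x -> merge.(x, x) == x end)
  end
end

samples = [0, 1, 5]

# max/2 satisfies all three laws on these samples.
IO.inspect(MergeLaws.idempotent?(&max/2, samples))

# Addition is commutative and associative but not idempotent (1 + 1 != 1),
# so applying the same update twice changes the result.
IO.inspect(MergeLaws.idempotent?(&Kernel.+/2, samples))
```

A merge based on `max/2` tolerates duplicated and reordered messages, while plain addition breaks as soon as a message is delivered twice.
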
  24. Server Node 1 Server Node 2 Server Node 3

  25. Client Client Client Client Client Client Client Client Client Server

  26. Examples

  27. Counters

  28. Node A Node B +5 5 8 -2 3 +5

    User connects User disconnects 0 0 5 3 8
  29. Node A Node B +5 5 8 -2 8 +5

    User connects User disconnects 0 0 5 3 13 10
  30. G-Counter Grow-only counter

  31. Node 1: 5
      Node 2: 3
      Node 3: 1
      Merge: Value=8

  32. Node 1: 6 (+1)
      Node 2: 3
      Node 3: 1
      Merge: Value=9
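
A grow-only counter is commonly modelled as a map from node to the number of increments that node has made. The following is a minimal sketch under that assumption; the `GCounter` module and its function names are made up for illustration and are not the Phoenix Presence implementation.

```elixir
defmodule GCounter do
  # State: a map of node id => increments performed by that node.
  def new, do: %{}

  # A node only ever bumps its own entry, and only upwards.
  def increment(counter, node_id, amount \\ 1) when amount > 0 do
    Map.update(counter, node_id, amount, &(&1 + amount))
  end

  # Merge takes the per-node maximum, so merging the same state
  # twice (or in a different order) gives the same result.
  def merge(a, b), do: Map.merge(a, b, fn _node_id, x, y -> max(x, y) end)

  # The counter value is the sum over all nodes.
  def value(counter), do: counter |> Map.values() |> Enum.sum()
end

a = GCounter.new() |> GCounter.increment(:node1, 2)
b = GCounter.new() |> GCounter.increment(:node2, 4)

merged = GCounter.merge(a, b)
IO.inspect(GCounter.value(merged))

# Receiving the same state again does not change the value.
IO.inspect(GCounter.value(GCounter.merge(merged, a)))
```
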
  33. PN-Counter Positive-Negative Counter

  34. Node 1: P=5 N=2 (value 3)
      Node 2: P=2 N=2 (value 0)
      Node 3: P=4 N=0 (value 4)
      Merge: Value=7

  35. Node 1: P=5 N=3 (value 2, N +1)
      Node 2: P=2 N=2 (value 0)
      Node 3: P=4 N=0 (value 4)
      Merge: Value=6

  36. Node 1: P=5 N=3 (value 2, N +1)
      Node 2: P=2 N=2 (value 0)
      Node 3: P=6 N=0 (value 6, P +2)
      Merge: Value=8
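
A PN-Counter can be sketched as two grow-only maps, one for increments (P) and one for decrements (N). The `PNCounter` module below is a hypothetical illustration, not library code; the usage reproduces the per-node P and N figures from the slides.

```elixir
defmodule PNCounter do
  # State: two grow-only maps keyed by node id.
  def new, do: %{p: %{}, n: %{}}

  def increment(c, node_id, amount \\ 1), do: bump(c, :p, node_id, amount)
  def decrement(c, node_id, amount \\ 1), do: bump(c, :n, node_id, amount)

  # Each half is merged like a G-Counter: per-node maximum.
  def merge(a, b), do: %{p: merge_max(a.p, b.p), n: merge_max(a.n, b.n)}

  # Value = all increments minus all decrements.
  def value(c), do: sum(c.p) - sum(c.n)

  defp bump(c, key, node_id, amount) when amount > 0 do
    Map.update!(c, key, fn m -> Map.update(m, node_id, amount, &(&1 + amount)) end)
  end

  defp merge_max(x, y), do: Map.merge(x, y, fn _k, v1, v2 -> max(v1, v2) end)
  defp sum(m), do: m |> Map.values() |> Enum.sum()
end

n1 = PNCounter.new() |> PNCounter.increment(:node1, 5) |> PNCounter.decrement(:node1, 2)
n2 = PNCounter.new() |> PNCounter.increment(:node2, 2) |> PNCounter.decrement(:node2, 2)
n3 = PNCounter.new() |> PNCounter.increment(:node3, 4)

merged = Enum.reduce([n1, n2, n3], &PNCounter.merge/2)
IO.inspect(PNCounter.value(merged))
# 7

# One more decrement on node 1 lowers the merged value by one.
n1 = PNCounter.decrement(n1, :node1)
IO.inspect(PNCounter.value(PNCounter.merge(merged, n1)))
# 6
```
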
  37. Sets

  38. Node A Node B [User] [User] [User] [] [] [User]

    User connects User disconnects
  39. G-Set Grow-only set

  40. Node 1: [1]
      Node 2: [1,2]
      Node 3: [3,4]
      Merge: Value=[1,2,3,4]

  41. Node 1: [1,5]
      Node 2: [1,2]
      Node 3: [3,4]
      Merge: Value=[1,2,3,4,5]
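
A grow-only set needs nothing more than set union as its merge. The sketch below wraps Elixir's `MapSet` in a hypothetical `GSet` module (the module name and functions are illustrative only) and replays the node contents shown on the slides.

```elixir
defmodule GSet do
  # State: a plain MapSet; elements can be added but never removed.
  def new(items \\ []), do: MapSet.new(items)
  def add(set, item), do: MapSet.put(set, item)

  # Union is commutative, associative and idempotent.
  def merge(a, b), do: MapSet.union(a, b)

  def value(set), do: set |> MapSet.to_list() |> Enum.sort()
end

n1 = GSet.new([1])
n2 = GSet.new([1, 2])
n3 = GSet.new([3, 4])

merged = Enum.reduce([n1, n2, n3], &GSet.merge/2)
IO.inspect(GSet.value(merged))
# [1, 2, 3, 4]

# Node 1 adds 5; merging again picks it up.
n1 = GSet.add(n1, 5)
IO.inspect(GSet.value(GSet.merge(merged, n1)))
# [1, 2, 3, 4, 5]
```
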
  42. 2P-Set Two-phase set

  43. Node 1: [1],[]
      Node 2: [1,2],[]
      Node 3: [3,4],[]
      Merge: Value=[1,2,3,4]

  44. Node 1: [1],[]
      Node 2: [1,2],[]
      Node 3: [3,4],[]
      Merge: Value=[1,2,3,4]
      G-Set for adds

  45. Node 1: [1],[]
      Node 2: [1,2],[]
      Node 3: [3,4],[3]
      Merge: Value=[1,2,4]
      G-Set for removals
  46. Elements cannot be re-added

  47. Removes win
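
Both limitations follow directly from how a two-phase set is built: a pair of grow-only sets, one for adds and one for removals, with the value being their difference. The `TwoPSet` module below is a simplified, hypothetical sketch (it does not check that an element is present before removing it).

```elixir
defmodule TwoPSet do
  # State: {adds, removes}, each a grow-only MapSet.
  def new, do: {MapSet.new(), MapSet.new()}

  def add({adds, removes}, item), do: {MapSet.put(adds, item), removes}

  # Removing only records a tombstone; nothing leaves `adds`.
  def remove({adds, removes}, item), do: {adds, MapSet.put(removes, item)}

  def merge({a1, r1}, {a2, r2}) do
    {MapSet.union(a1, a2), MapSet.union(r1, r2)}
  end

  # Visible elements: added and never removed.
  def value({adds, removes}) do
    adds |> MapSet.difference(removes) |> MapSet.to_list() |> Enum.sort()
  end
end

n1 = TwoPSet.new() |> TwoPSet.add(1)
n2 = TwoPSet.new() |> TwoPSet.add(1) |> TwoPSet.add(2)
n3 = TwoPSet.new() |> TwoPSet.add(3) |> TwoPSet.add(4) |> TwoPSet.remove(3)

merged = Enum.reduce([n1, n2, n3], &TwoPSet.merge/2)
IO.inspect(TwoPSet.value(merged))
# [1, 2, 4]

# Re-adding 3 has no visible effect: the tombstone is permanent.
IO.inspect(TwoPSet.value(TwoPSet.add(merged, 3)))
# [1, 2, 4]
```

Because the tombstone set only grows, a remove always beats an add of the same element, regardless of the order in which replicas see them.
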

  48. Node A Node B [ ] Add Remove [1] [

    ] [1]
  49. Node A Node B [ ] [1] Add Remove [1]

    [1] [ ] [1] [ ] [1]
  51. Node A Node B [ ] [1] [1] Add Remove

    [1] [1] [1] [ ] [1] [1] [1] [ ] [1]
  52. OR-Set Observed-remove set

  53. Node A Node B [ ] Add Remove [{A,1}] [

    ] [{A,1}]
  54. Node A Node B [ ] [ ] Add Remove

    [{A,1}] [{A,1}] [ ] [ ] [{A,1}] [{A,1}, ]
  55. Node A Node B [ ] [ 1 ] Add

    Remove [{A,1}] [{A,1}] [ ] [ ] [{A,1}] [{A,1},{A,2}]
  57. Node A Node B [ ] [ 1 ] Add

    Remove [{A,1}] [{A,1}] [ ] [ ] [{A,1}] [{A,1},{A,2}] [ 1 ] [{A,1},{A,2}] [{A,1},{A,2}] [ 1 ]
  58. Node A Node B [ ] [ 1 ] Add

    Remove [{A,1}] [{A,1}] [ ] [ ] [{A,1}] [{A,1},{A,2}] [ 1 ] [{A,1},{A,2}] [{A,1},{A,2}] [ 1 ] [A] [A]
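
One common way to build an observed-remove set is to attach a unique tag to every add and let a remove cancel only the tags it has actually seen. The sketch below assumes tags of the form `{node, counter}`, similar to the `{A,1}` and `{A,2}` labels in the diagrams; the `ORSet` module is hypothetical, keeps tombstones, and is not the ORSWOT variant.

```elixir
defmodule ORSet do
  # State: {adds, removes}, both MapSets of {element, tag} pairs.
  def new, do: {MapSet.new(), MapSet.new()}

  # The caller supplies a unique tag for each add, e.g. {:a, 1}.
  def add({adds, removes}, item, tag) do
    {MapSet.put(adds, {item, tag}), removes}
  end

  # Remove cancels only the tags this replica has observed so far.
  def remove({adds, removes}, item) do
    observed = adds |> Enum.filter(fn {i, _tag} -> i == item end) |> MapSet.new()
    {adds, MapSet.union(removes, observed)}
  end

  def merge({a1, r1}, {a2, r2}) do
    {MapSet.union(a1, a2), MapSet.union(r1, r2)}
  end

  # An element is present if at least one of its tags is not removed.
  def value({adds, removes}) do
    adds
    |> MapSet.difference(removes)
    |> Enum.map(fn {item, _tag} -> item end)
    |> Enum.uniq()
    |> Enum.sort()
  end
end

# Node A adds 1 and replicates to node B.
a = ORSet.new() |> ORSet.add(1, {:a, 1})
b = ORSet.merge(ORSet.new(), a)

# Concurrently: B removes 1 (it has only seen {:a, 1}), A adds 1 again.
b = ORSet.remove(b, 1)
a = ORSet.add(a, 1, {:a, 2})

merged = ORSet.merge(a, b)
IO.inspect(ORSet.value(merged))
# [1]
```

The concurrent add survives because its tag `{:a, 2}` was never observed by the remove. The cost is that removed tags accumulate as tombstones.
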
  59. Add 1000 Remove 1000 Add Remove …

  60. ORSWOT Observed-remove set without tombstones

  63. Use Cases

  68. • Load balancing / routing
      • Mobile clients synchronisation
      • Temporary data on the servers
      • Avoiding work duplication
      • Collaborative editing
  70. lasp-lang.readme.io

  71. Summary

  72. You're (almost) always designing a distributed system

  73. Think about failures

  74. Choose the correct tool for the job

  75. References
      • https://medium.com/@istanbul_techie/a-look-at-conflict-free-replicated-data-types-crdt-221a5f629e7e
      • http://basho.com/posts/technical/distributed-data-types-riak-2-0/
      • http://highscalability.com/blog/2014/10/13/how-league-of-legends-scaled-chat-to-70-million-players-it-t.html
      • https://hal.inria.fr/inria-00609399v1/document
      • https://developers.soundcloud.com/blog/roshi-a-crdt-system-for-timestamped-events
  76. Thanks!