Causal Consistency For Large Neo4j Clusters by Jim Webber at Big Data Spain 2017

An overview of the Raft algorithm and how Neo4j uses it to provide strong consistency at scale.

https://www.bigdataspain.org/2017/talk/causal-consistency-for-large-neo4j-clusters

Big Data Spain 2017
November 16th - 17th Kinépolis Madrid

Big Data Spain

November 29, 2017

Transcript

  1. Register / Login. You need to log in to continue your purchase! Username: Password: Create Account
  2. Register / Login. You need to log in to continue your purchase! Username: jim_w Password: ******** Create Account
  3. Register / Login. You need to log in to continue your purchase! Username: Password: Login
  4. Writing to the Core Cluster. Neo4j Driver: CREATE (:User {...}) ✓ Neo4j Cluster
  5. Writing to the Core Cluster. Neo4j Driver: CREATE (:User {...}) ✓ Neo4j Cluster
  6. Writing to the Core Cluster. Neo4j Driver: CREATE (:User {...}) ✓ ✓ ✓ Neo4j Cluster
  7. Writing to the Core Cluster. Neo4j Driver: CREATE (:User {...}) ✓ ✓ ✓ Neo4j Cluster
  8. Writing to the Core Cluster. Neo4j Driver: CREATE (:User {...}) ✓ ✓ ✓ Neo4j Cluster
  9. Raft in a Nutshell
    • Raft keeps logs tied together
    • Logs contain entries for both the data and cluster membership
    • Entries are appended and subsequently committed if a simple majority agree
    • Implication: majority agree with the log as proposed
    • Anyone can call an election: highest term (logical clock) wins, followed by highest committed, followed by highest appended
    • Appended but uncommitted entries can be truncated, but this is safe (transaction aborted)
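The election ordering and the majority rule from the Raft slide can be sketched in a few lines of Java. This is only an illustration of the slide's simplified description, not Neo4j's Raft implementation: `MemberState`, `RaftNutshellSketch`, and the member names are hypothetical, and real Raft elections proceed by voting rather than by a global comparison.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical snapshot of one member's Raft state; not a Neo4j class. */
final class MemberState {
    final String name;
    final long term;       // logical clock
    final long committed;  // index of the highest committed entry
    final long appended;   // index of the highest appended entry

    MemberState(String name, long term, long committed, long appended) {
        this.name = name;
        this.term = term;
        this.committed = committed;
        this.appended = appended;
    }
}

final class RaftNutshellSketch {
    // Ordering as stated on the slide: term first, then committed, then appended.
    static final Comparator<MemberState> ELECTION_ORDER =
            Comparator.<MemberState>comparingLong(m -> m.term)
                    .thenComparingLong(m -> m.committed)
                    .thenComparingLong(m -> m.appended);

    /** Returns the member that ranks highest under ELECTION_ORDER. */
    static MemberState electionWinner(List<MemberState> members) {
        return Collections.max(members, ELECTION_ORDER);
    }

    /** An appended entry commits once a simple majority of members hold it. */
    static boolean isCommitted(int membersHoldingEntry, int clusterSize) {
        return membersHoldingEntry > clusterSize / 2;
    }

    public static void main(String[] args) {
        List<MemberState> members = Arrays.asList(
                new MemberState("core-1", 3, 10, 12),
                new MemberState("core-2", 4, 8, 9),
                new MemberState("core-3", 4, 8, 11));
        System.out.println(electionWinner(members).name); // core-3
        System.out.println(isCommitted(2, 3));             // true
        System.out.println(isCommitted(2, 5));             // false
    }
}
```

Note that `core-1` loses despite having the longest log, because term is compared first.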
  10. Consensus Log → Committed Transactions → Updated Graph
    [Diagram: Neo4j Raft implementation, with each member's consensus log and transaction log drawn as rows of numbered entries]
    • Consensus log: stores both committed and uncommitted transactions
    • Uncommitted entries may differ between members
    • Transactions are only appended to the transaction log when committed according to Raft
    • Transaction log: the same transactions appear in the same order on all members
    • Transactions are applied, updating the graph
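The flow from consensus log to transaction log to graph can be modelled for a single member as below. This is a minimal sketch under stated assumptions: `CoreMemberSketch` and its methods are hypothetical names, transactions are plain strings, and the "graph" is just the list of applied transactions. It is not how Neo4j stores any of these structures.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical single-member model; names are illustrative, not Neo4j classes. */
final class CoreMemberSketch {
    private final List<String> consensusLog = new ArrayList<>();   // committed and uncommitted entries
    private final List<String> transactionLog = new ArrayList<>(); // committed entries only
    private final List<String> graph = new ArrayList<>();          // stand-in for the graph: applied transactions
    private int commitIndex = -1;                                   // index of the last committed entry

    /** Appends an entry to the consensus log and returns its index. */
    int append(String tx) {
        consensusLog.add(tx);
        return consensusLog.size() - 1;
    }

    /** Commits entries up to index: each is copied to the transaction log, then applied. */
    void commitUpTo(int index) {
        if (index >= consensusLog.size()) {
            throw new IllegalArgumentException("entry " + index + " was never appended");
        }
        while (commitIndex < index) {
            String tx = consensusLog.get(++commitIndex);
            transactionLog.add(tx); // committed order is the same on every member
            graph.add(tx);          // applying the transaction updates the graph
        }
    }

    /** Drops appended-but-uncommitted entries; committed entries are untouched. */
    void truncateUncommitted() {
        consensusLog.subList(commitIndex + 1, consensusLog.size()).clear();
    }

    List<String> consensusLog() { return Collections.unmodifiableList(consensusLog); }
    List<String> transactionLog() { return Collections.unmodifiableList(transactionLog); }
    List<String> graph() { return Collections.unmodifiableList(graph); }

    public static void main(String[] args) {
        CoreMemberSketch member = new CoreMemberSketch();
        member.append("tx0");
        member.append("tx1");
        member.append("tx2");      // appended, never committed
        member.commitUpTo(1);
        member.truncateUncommitted();
        System.out.println(member.consensusLog());   // [tx0, tx1]
        System.out.println(member.transactionLog()); // [tx0, tx1]
    }
}
```

Truncating `tx2` is safe in this model because it never reached the transaction log or the graph, which mirrors the "transaction aborted" case on the Raft slide.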
  11. Read Replicas
    • For massive query throughput
    • Read-only replicas
    • Not involved in Consensus Commit
    • Disposable, suitable for auto-scaling
  12. Java
        <dependency>
            <groupId>org.neo4j.driver</groupId>
            <artifactId>neo4j-java-driver</artifactId>
        </dependency>
    Python
        pip install neo4j-driver
    .NET
        PM> Install-Package Neo4j.Driver
    JavaScript
        npm install neo4j-driver
  13. bolt+routing://
        GraphDatabase.driver( "bolt+routing://aCoreServer" )
  14. bolt+routing://
        GraphDatabase.driver( "bolt+routing://aCoreServer" )
    Bootstrap: specify any core server to route load across the cluster
  15. Routed write statements
        driver = GraphDatabase.driver( "bolt+routing://aCoreServer" );
        // WRITE sessions are routed to the leader of the core cluster
        try ( Session session = driver.session( AccessMode.WRITE ) ) {
            try ( Transaction tx = session.beginTransaction() ) {
                tx.run( "MERGE (user:User {userId: {userId}})", parameters( "userId", userId ) );
                tx.success();
            }
        }
  16. Routed read queries
        driver = GraphDatabase.driver( "bolt+routing://aCoreServer" );
        // READ sessions can be routed to read replicas or followers
        try ( Session session = driver.session( AccessMode.READ ) ) {
            try ( Transaction tx = session.beginTransaction() ) {
                tx.run( "MATCH (user:User {userId: {userId}})-[*]-(:Product) RETURN *", parameters( "userId", userId ) );
                tx.success();
            }
        }
  17. Cluster members slightly “ahead” of or “behind” each other
    [Diagram: five member logs; one ends at entry 11, the others at entry 10]
    • If I query this server (one still at entry 10), I won’t see the updates from transaction 11.
    • If I query this server (the one at entry 11), I’ll see all updates from all committed transactions.
    • This is normal behaviour
  18. Register / Login. You need to log in to continue your purchase! Username: Password: Create Account
  19. Register / Login. You need to log in to continue your purchase! Username: jim_w Password: ******** Create Account
  20. Register / Login. You need to log in to continue your purchase! Username: Password: Login
  21. [Diagram: five member logs ending at entries 10, 9, 10, 9, 9. Labels: Create Account, App Server A, Driver]
  22. [Diagram: five member logs ending at entries 10, 9, 10, 9, 9. Labels: CREATE (:User), Create Account, App Server A, Driver]
  23. [Diagram: five member logs ending at entries 11, 9, 10, 9, 9. Labels: CREATE (:User), Create Account, App Server A, Driver]
  24. [Diagram: five member logs ending at entries 11, 9, 10, 9, 9. Labels: CREATE (:User), Create Account, App Server A, Driver, 11]
  25. [Diagram: five member logs ending at entries 11, 9, 10, 9, 9. Labels: CREATE (:User), Create Account, App Server A, Driver, 11]
  26. [Diagram: five member logs ending at entries 11, 9, 10, 9, 9. Labels: CREATE (:User), Create Account, App Server A, Driver, 11]
  27. [Diagram: five member logs ending at entries 11, 10, 10, 10, 9. Labels: CREATE (:User), Create Account, App Server A, Driver, 11]
  28. [Diagram: five member logs ending at entries 11, 10, 10, 10, 9. Labels: CREATE (:User), Create Account, App Server A, Driver; MATCH (:User), Login, App Server B, Driver; 11]
  29. [Diagram: five member logs ending at entries 11, 10, 10, 10, 9. Labels: CREATE (:User), Create Account, App Server A, Driver; MATCH (:User), Login, App Server B, Driver; 11]
  30. Bookmark
    • Session token
    • String (for portability)
    • Opaque to application
    • Represents ultimate user’s most recent view of the graph
    • More capabilities to come
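One way to picture what a bookmark buys the application is the toy model below. It assumes, purely for illustration, that a bookmark can be treated as the id of the last transaction the user saw. The slide says the real bookmark is an opaque string, so the numeric bookmark, the class `BookmarkedMemberSketch`, and its methods are all hypothetical. The sketch refuses a read until the member has applied the bookmarked transaction, whereas a real server would be expected to wait for it rather than refuse.

```java
/** Hypothetical cluster member that tracks the last transaction it has applied. */
final class BookmarkedMemberSketch {
    private long lastAppliedTx;

    BookmarkedMemberSketch(long lastAppliedTx) {
        this.lastAppliedTx = lastAppliedTx;
    }

    /** Simulates the member catching up by applying a committed transaction. */
    void apply(long txId) {
        lastAppliedTx = Math.max(lastAppliedTx, txId);
    }

    /**
     * A read carrying a bookmark may only be served once this member has
     * applied the bookmarked transaction; otherwise the user could miss
     * their own earlier write.
     */
    boolean canServeRead(long bookmark) {
        return lastAppliedTx >= bookmark;
    }

    public static void main(String[] args) {
        long bookmark = 11; // assumed: id of the Create Account transaction
        BookmarkedMemberSketch behind = new BookmarkedMemberSketch(10);
        System.out.println(behind.canServeRead(bookmark)); // false
        behind.apply(11);
        System.out.println(behind.canServeRead(bookmark)); // true
    }
}
```

Because the bookmark travels with the user rather than with a server, it works even when account creation and login go through different application servers.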
  31. [Diagram: five member logs ending at entries 10, 9, 10, 9, 9. Labels: Create Account, App Server A, Driver]
  32. [Diagram: five member logs ending at entries 10, 9, 10, 9, 9. Labels: CREATE (:User), Create Account, App Server A, Driver]
  33. [Diagram: five member logs ending at entries 11, 9, 10, 9, 9. Labels: CREATE (:User), Create Account, App Server A, Driver]
  34. [Diagram: five member logs ending at entries 11, 9, 10, 9, 9. Labels: CREATE (:User), Create Account, App Server A, Driver, 11]
  35. [Diagram: five member logs ending at entries 11, 9, 10, 9, 9. Labels: CREATE (:User), Create Account, App Server A, Driver, 11]
  36. [Diagram: five member logs ending at entries 11, 9, 10, 9, 9. Labels: CREATE (:User), Create Account, App Server A, Driver, 11]
  37. [Diagram: five member logs ending at entries 11, 10, 10, 10, 9. Labels: CREATE (:User), Create Account, App Server A, Driver, 11]
  38. [Diagram: five member logs ending at entries 11, 10, 10, 10, 9. Labels: CREATE (:User), Create Account, App Server A, Driver; MATCH (:User), Login, App Server B, Driver; 11]
  39. [Diagram: five member logs ending at entries 11, 10, 11, 10, 10. Labels: CREATE (:User), Create Account, App Server A, Driver; MATCH (:User), Login, App Server B, Driver]
  40. [Diagram: five member logs ending at entries 11, 10, 11, 10, 10. Labels: CREATE (:User), Create Account, App Server A, Driver; MATCH (:User), Login, App Server B, Driver; 11]
  41. [Diagram: five member logs ending at entries 11, 10, 11, 10, 10. Labels: CREATE (:User), Create Account, App Server A, Driver; MATCH (:User), Login, App Server B, Driver; 11]
  42. Obtain bookmark
        try ( Session session = driver.session( AccessMode.WRITE ) ) {
            try ( Transaction tx = session.beginTransaction() ) {
                tx.run( "CREATE (user:User {userId: {userId}, passwordHash: {passwordHash}})",
                        parameters( "userId", userId, "passwordHash", passwordHash ) );
                tx.success();
            }
            // Read the bookmark after the transaction has closed
            String bookmark = session.lastBookmark();
        }
  43. [Diagram: five member logs ending at entries 11, 10, 11, 10, 10. Labels: CREATE (:User), Create Account, App Server A, Driver; MATCH (:User), Login, App Server B, Driver; 11; Obtain bookmark]
  44. Use a bookmark
        try ( Session session = driver.session( AccessMode.READ ) ) {
            // The read is served only once the server has applied the bookmarked transaction
            try ( Transaction tx = session.beginTransaction( bookmark ) ) {
                tx.run( "MATCH (user:User {userId: {userId}}) RETURN *", parameters( "userId", userId ) );
                tx.success();
            }
        }
  45. [Diagram: five member logs ending at entries 11, 10, 11, 10, 10. Labels: CREATE (:User), Create Account, App Server A, Driver; MATCH (:User), Login, App Server B, Driver; 11; Use bookmark]