Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Distributed Materialized Views

Distributed Materialized Views

Talk on Scylla's Materialized Views at the 2017 Scylla Summit

Duarte Nunes

October 24, 2017
Tweet

More Decks by Duarte Nunes

Other Decks in Programming

Transcript

  1. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Distributed Materialized Views @duarte_nunes Duarte Nunes
  2. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Duarte Nunes 2 Duarte Nunes is a Software Engineer working on ScyllaDB. He has a background in concurrent programming, distributed systems and low-latency software. Prior to ScyllaDB, he worked on distributed network virtualization.
  3. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Agenda ▪ Introduction ▪ Problem Space ▪ Solution o Propagation o Consistency o Concurrent Updates o Building a MV ▪ Conclusions 3
  4. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Introduction
  5. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Materialized Views 5 ▪ A table containing a copy of the results of some query performed on a base table ▪ Updated when the underlying base table is modified ▪ Used as an index in Scylla ▪ Usually has a different PK than the base o Currently limited to including at most one base regular column o Must include the base’s PK
  6. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Distributed Materialized Views 6 ▪ The view table is stored in different replicas than the base table ▪ No single master server to serialize updates ▪ View must be eventually consistent with the base table
  7. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company CQL 7 ▪ CREATE TABLE buildings ( name text, city text, built int, meters int, PRIMARY KEY (name) ); ▪ CREATE MATERIALIZED VIEW building_by_city AS SELECT * FROM buildings WHERE city IS NOT NULL PRIMARY KEY(city, name);
  8. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Problem Space
  9. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Constraints 9 ▪ Cluster availability mustn’t be affected o Don’t stop serving base table writes ▪ Base write latency overhead must be minimized ▪ An update to a base table must propagate to the view table o Rows in the base will eventually appear in the view o Rows absent from the base will eventually be absent from the view
  10. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); Read-before-write 10 ▪ CREATE MATERIALIZED VIEW view_table AS SELECT * FROM base_table WHERE v IS NOT NULL PRIMARY KEY(v, p, c);
  11. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); ▪ p | c | v ---+---+--- 0 | 1 | 8 Read-before-write 11 ▪ CREATE MATERIALIZED VIEW view_table AS SELECT * FROM base_table WHERE v IS NOT NULL PRIMARY KEY(v, p, c); ▪ v | p | c ---+---+--- 8 | 0 | 1
  12. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); ▪ p | c | v ---+---+--- 0 | 1 | 8 ▪ UPDATE TABLE base_table SET v = 10 WHERE p = 0 AND c = 1 Read-before-write 12 ▪ CREATE MATERIALIZED VIEW view_table AS SELECT * FROM base_table WHERE v IS NOT NULL PRIMARY KEY(v, p, c); ▪ v | p | c ---+---+--- 8 | 0 | 1
  13. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); ▪ p | c | v ---+---+--- 0 | 1 | 8 ▪ UPDATE TABLE base_table SET v = 10 WHERE p = 0 AND c = 1 Read-before-write 13 ▪ CREATE MATERIALIZED VIEW view_table AS SELECT * FROM base_table WHERE v IS NOT NULL PRIMARY KEY(v, p, c); ▪ v | p | c ---+---+--- 8 | 0 | 1 ▪ Insert (10, 0, 1)
  14. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); ▪ p | c | v ---+---+--- 0 | 1 | 8 ▪ UPDATE TABLE base_table SET v = 10 WHERE p = 0 AND c = 1 Read-before-write 14 ▪ CREATE MATERIALIZED VIEW view_table AS SELECT * FROM base_table WHERE v IS NOT NULL PRIMARY KEY(v, p, c); ▪ v | p | c ---+---+--- 8 | 0 | 1 ▪ Insert (10, 0, 1) ▪ Tombstone (8, 0, 1)
  15. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ CREATE TABLE base_table ( p int, c int, v1 int, v2 int, PRIMARY KEY (p, c) ); Read-before-write 15 ▪ CREATE MATERIALIZED VIEW view_table AS SELECT * FROM base_table WHERE v1 IS NOT NULL PRIMARY KEY(v1, p, c, v2);
  16. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ CREATE TABLE base_table ( p int, c int, v1 int, v2 int, PRIMARY KEY (p, c) ); ▪ UPDATE TABLE base_table SET v2 = 9 WHERE p = 0 AND c = 1 Read-before-write 16 ▪ CREATE MATERIALIZED VIEW view_table AS SELECT * FROM base_table WHERE v1 IS NOT NULL PRIMARY KEY(v1, p, c, v2);
  17. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ CREATE TABLE base_table ( p int, c int, v1 int, v2 int, PRIMARY KEY (p, c) ); ▪ UPDATE TABLE base_table SET v2 = 9 WHERE p = 0 AND c = 1 Read-before-write 17 ▪ CREATE MATERIALIZED VIEW view_table AS SELECT * FROM base_table WHERE v1 IS NOT NULL PRIMARY KEY(v1, p, c, v2); ▪ Update (8, 0, 1, 9)
  18. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Concurrent updates don’t mix well with read-before-write o Two or more updates can read the same old value in the base o They would insert the same tombstone and miss each other’s updates o A view entry would have no corresponding base table entry Concurrent updates 18
  19. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company A solution must 19 ▪ Include a mechanism for update propagation ▪ Solve consistency issues between base and view tables ▪ Concurrent updates must be safe o An algorithm should work locally and also provide emergent safety, as synchronizing a whole cluster is practically unattainable.
  20. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Solution
  21. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Coordinator-based propagation 21 C B V1 w(p: 1, v: 10) V2
  22. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Coordinator-based propagation 22 C B V1 w(p: 1, v: 10) d(v: 5) w(p: 10, p: 1) V2
  23. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Coordinator-based propagation 23 C B V1 w(p: 1, v: 10) d(v: 5) w(p: 10, p: 1) V2 d(v: 5) w(v:10, p: 1)
  24. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Transactional guarantees? ▪ More roundtrips o Prepare and commit stages ▪ Added latency ▪ View replica availability impacting the base replica’s 24
  25. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Replica-based propagation 25 C B V1 w(p: 1, v: 10) V2 d(v: 5) w(v:10, p: 1)
  26. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Synchronous replica-based propagation 26 C B V1 w(p: 1, v: 10) V2 d(v: 5) w(v:10, p: 1) r: Ok r: Ok
  27. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Asynchronous replica-based propagation 27 C B V1 w(p: 1, v: 10) V2 d(v: 5) w(v:10, p: 1) r: Ok r: Ok
  28. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Pair one base replica to one view replica per view partition o The view replica receiving the update may not be the one receiving a tombstone Asynchronous replica-based propagation 28
  29. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Pair one base replica to one view replica per view partition o The view replica receiving the update may not be the one receiving a tombstone ▪ View updating load distributed across base replicas Asynchronous replica-based propagation 29
  30. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Pair one base replica to one view replica per view partition o The view replica receiving the update may not be the one receiving a tombstone ▪ View updating load distributed across base replicas ▪ In the absence of failures, RF view replicas will be updated Asynchronous replica-based propagation 30
  31. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Base and view tables can diverge Consistency 31
  32. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Base and view tables can diverge ▪ A base table update may fail to be propagated o If the view that receives the updated row becomes unavailable, the base row won’t be indexed (or will be stale) Consistency 32
  33. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Base and view tables can diverge ▪ A base table update may fail to be propagated o If the view that receives the updated row becomes unavailable, the base row won’t be indexed (or will be stale) ▪ A ghost record can remain in the view o If the base table becomes unavailable after updating the view table, the updates becomes a ghost row in the view o If the view that receives the tombstone becomes unavailable, that row becomes a ghost row in the view Consistency 33
  34. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Remember view table updates in the base o Use commit or batch log o Solves first consistency issue o … or just delegate to repair Remembering view updates 34
  35. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ View read-repair o When querying a view, probabilistically check if the row exists in the base and whether it’s up to date Repair 35
  36. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ View read-repair o When querying a view, probabilistically check if the row exists in the base and whether it’s up to date ▪ View repair o Scan view table rows o Applies just between a base replica and a view replica, but is expected to work globally Repair 36
  37. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ View read-repair o When querying a view, probabilistically check if the row exists in the base and whether it’s up to date ▪ View repair o Scan view table rows o Applies just between a base replica and a view replica, but is expected to work globally ▪ Can’t repair views between themselves o Due to distributed concurrent updates Repair 37
  38. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Optimistic o Gate on the memtable o Retry calculating view update if it is flushed or if a conflicting update happened Local concurrent updates 38
  39. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Optimistic o Gate on the memtable o Retry calculating view update if it is flushed or if a conflicting update happened ▪ Locking o Row, range of rows or partition o Cheaper if sufficiently granular, due to Scylla’s TPC Local concurrent updates 39
  40. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Applies to base rows created before a view ▪ Scan all of the base table partitions o Send updates to the paired view replicas o Keep progress in system tables View Building 40
  41. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company Conclusions
  42. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company ▪ Powerful global indexes ▪ Under development ▪ Under-researched problem o Lots of subtleties (see State of Materialized Views) ▪ Performance impact o Locking and read-before-write incur overhead o Impossible to avoid with Scylla’s data model ▪ Future: support for aggregation views Closing thoughts 42
  43. PRESENTATION TITLE ON ONE LINE AND ON TWO LINES First

    and last name Position, company THANK YOU [email protected] @duarte_nunes Please stay in touch Any questions?