Scaling Data — Architectures for caching, joins and search

While building Rapportive, we learnt a few things about dealing with data, especially data in large volumes, with complex structure, that is constantly changing. We've also developed some patterns for thinking about data and processing in ways that keep us sane.

This presentation explores the idea of deriving secondary, read-optimised datasets from a primary, raw, write-optimised dataset. That idea is the common pattern behind caches, database indexes and full-text indexes.

The content is related to my blog post on precomputed caches and data dependencies (http://martin.kleppmann.com/2012/10/01/rethinking-caching-in-web-apps.html), though approaching the topic from a different angle.
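To make the pattern concrete, here is a minimal sketch (not from the slides; the data model and function names are invented for illustration) of deriving a read-optimised view from raw, write-optimised data:

      // Raw data: normalised comments, optimised for writing (one record per comment).
      var comments = [
        { id: 1, inReplyTo: null, authorId: 42, content: "May good fortune smile upon you" },
        { id: 2, inReplyTo: 1,    authorId: 43, content: "Foe real!" }
      ];

      // Derived data: a read-optimised view (replies grouped by parent comment).
      // It can be thrown away and re-generated from the raw data at any time.
      function buildRepliesIndex(comments) {
        var index = {};
        comments.forEach(function (comment) {
          var parent = comment.inReplyTo;
          if (!index[parent]) index[parent] = [];
          index[parent].push(comment);
        });
        return index;
      }

      var repliesIndex = buildRepliesIndex(comments);
      // repliesIndex[1] → all direct replies to comment 1, without scanning every comment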

Talk given at Wonga, London, 11 January 2013.

Martin Kleppmann

January 11, 2013

Transcript

  1. Martin Kleppmann • martin@kleppmann.com • http://martinkl.com

  2. http://www.flickr.com/photos/2fs/330318471/

  3. None
  4. http://www.flickr.com/photos/nasamarshall/4619659804 Don’t let Architecture Astronauts scare you – Spolsky

  5. 1.  Caching 2.  Joins 3.  Full-text search

  6. None
  7. None
  8. None
  9. None
  10. None
  11. function getDocument(id) {
        var doc = cache.get(id);
        if (doc) {
          return doc;
        } else {
          doc = db.get(id);
          cache.set(id, doc, cacheTTL);
          return doc;
        }
      }
  12. function updateDocument(id, doc) {
        cache.set(id, doc, cacheTTL); // or: cache.invalidate(id);
        db.set(id, doc);
      }
  13. Consistency: staleness, aborted transactions, race conditions
      Bootstrapping: cold start after failure, new feature → new access pattern
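As an aside on the race-condition point from slide 13: the following self-contained simulation (plain objects standing in for the cache and database, with a callback used to force one particular interleaving) shows how the code on slides 11 and 12 can leave stale data in the cache:

      // Simulation only: not from the slides. In-memory objects stand in for
      // the cache and the database; TTLs are omitted for brevity.
      var cache = {};
      var db = { doc1: "v1" };

      function getDocument(id, afterDbRead) {
        if (cache[id]) return cache[id];
        var doc = db[id];        // request A reads the old version...
        afterDbRead();           // ...and is descheduled here (simulated)
        cache[id] = doc;         // ...then writes the stale value into the cache
        return doc;
      }

      function updateDocument(id, doc) {
        cache[id] = doc;         // or: delete cache[id]
        db[id] = doc;
      }

      // Interleaving: request B's update runs while request A is between its
      // database read and its cache write.
      getDocument("doc1", function () {
        updateDocument("doc1", "v2");   // request B writes v2 to cache and db
      });
      console.log(db.doc1, cache.doc1); // "v2" "v1": the cache stays stale until invalidated or expired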
  14. None
  15. CREATE TABLE users (
        id   integer PRIMARY KEY,
        name varchar(100) NOT NULL
      );

      CREATE TABLE comments (
        id          integer PRIMARY KEY,
        in_reply_to integer REFERENCES comments (id),
        author_id   integer REFERENCES users (id),
        content     varchar(10000)
      );
  16. SELECT * FROM comments WHERE in_reply_to = ?;

      SELECT * FROM comments JOIN users ON author_id = users.id
      WHERE in_reply_to = ?;

      users (id, name)
      comments (id, in_reply_to, author_id, content)
  17. blackdragonflower: May good fortune smile upon you foe your kindness.
      Genmaken: Foe real!
      ▪ N0T_REALLY_RELEVANT: Almost Relevant
      CanadianDave: for*
      ▪ Saucefire: Damn it, Dave, we don't have time foe yoe proper spelling.
      ▪ rosyatrandom: Ain't nobody got time foe that
      beermethestrength: *dat
  18. id | in_reply_to | author_id | content
      ---+-------------+-----------+------------------------------
       1 | NULL        | 42        | May good fortune smile upon…
       2 | 1           | 43        | Foe real!
       3 | 2           | 44        | Almost Relevant
  19. -- Hacker News / Reddit clone, Postgres syntax
      WITH RECURSIVE tree(id, in_reply_to, author_id, content) AS (
        SELECT id, in_reply_to, author_id, content
        FROM comments WHERE id = ?
        UNION ALL
        SELECT child.id, child.in_reply_to, child.author_id, child.content
        FROM tree parent, comments child
        WHERE parent.id = child.in_reply_to
      )
      SELECT * FROM tree;
  20. id | in_reply_to | author_id | content
      ---+-------------+-----------+------------------------------
       1 | NULL        | 42        | May good fortune smile upon…
       2 | 1           | 43        | Foe real!

      CREATE INDEX ON comments (in_reply_to);
      -- Implicit in PRIMARY KEY:
      CREATE UNIQUE INDEX ON comments (id);
  21. id | in_reply_to | author_id | content
      ---+-------------+-----------+----------
       2 | 1           | 43        | Foe real!

      CREATE INDEX ON comments (in_reply_to);
      SELECT * FROM comments WHERE in_reply_to = 1;
  22. [B-tree diagram: index nodes containing the keys 64, 32, 96, 23, 42, 73, 99]
  23. [B-tree diagram: finding key 42 by comparing against node keys (42 < 64, 42 > 32);
      “To find value 42, look at rows stored in disk blocks 12345 and 23456”]
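For illustration only (not from the slides, and not a real B-tree): a toy binary search tree in the spirit of the diagram, mapping indexed values to the disk blocks that hold the matching rows.

      // Real database indexes use B-trees with many keys per node and pages on
      // disk; this toy tree only shows the idea of the traversal in the diagram.
      var root = {
        key: 64,
        left:  { key: 32, left: { key: 23 }, right: { key: 42, blocks: [12345, 23456] } },
        right: { key: 96, left: { key: 73 }, right: { key: 99 } }
      };

      function lookup(node, value) {
        if (!node) return null;                 // value not indexed
        if (value === node.key) return node.blocks || [];
        return value < node.key ? lookup(node.left, value)    // smaller keys to the left
                                : lookup(node.right, value);  // larger keys to the right
      }

      console.log(lookup(root, 42)); // [12345, 23456]: the blocks to fetch the rows from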
  24. Consistency: staleness, aborted transactions, race conditions
      Bootstrapping: cold start after failure, new feature → new access pattern
  25. Consistency: staleness, aborted transactions, race conditions
      Bootstrapping: cold start after failure, new feature → new access pattern
      RDBMS deals with all of these!
  26. [Diagram: Raw data, with “derive & maintain” arrows to Caches and to DB Indexes]
  27. [Diagram: Raw data, with “derive & maintain” arrows to Caches (with lots of code)
      and to DB Indexes (with 1 line of SQL)]
  28. -- I already have 200 million rows.
      -- This may take a few hours…
      CREATE INDEX CONCURRENTLY ON comments (in_reply_to);
  29. -- I already have 200 million rows.
      -- This may take a few hours…
      CREATE INDEX CONCURRENTLY ON comments (in_reply_to);
      OMG! No table lock!
  30. [Timeline diagram: a time axis]
  31. [Timeline diagram: writes arriving over time]
  32. [Timeline diagram: CREATE INDEX CONCURRENTLY… takes a snapshot of the table data]
  33. [Timeline diagram: further writes arrive while the index is built from the snapshot]
  34. [Timeline diagram: further writes arrive while the index is built from the snapshot]
  35. [Timeline diagram: further writes arrive while the index is built from the snapshot]
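A rough sketch of the general idea the diagrams convey (build from a snapshot, then catch up on writes that arrived in the meantime), with in-memory arrays standing in for the table and the write backlog. This is a simplification of the concept, not how Postgres actually implements CREATE INDEX CONCURRENTLY.

      var table = [ { id: 1, inReplyTo: null }, { id: 2, inReplyTo: 1 } ];
      var backlog = [];          // writes that arrive while the build is running
      var building = false;

      function write(row) {
        table.push(row);
        if (building) backlog.push(row);   // remember it for the index catch-up
      }

      function buildIndexConcurrently() {
        building = true;
        var snapshot = table.slice();      // point-in-time copy; writers are not blocked
        var index = {};
        function addToIndex(row) {
          if (!index[row.inReplyTo]) index[row.inReplyTo] = [];
          index[row.inReplyTo].push(row.id);
        }
        snapshot.forEach(addToIndex);      // the slow part: scan the snapshot
        backlog.forEach(addToIndex);       // catch up on writes made during the scan
        building = false;                  // (a real build also handles duplicates and deletes)
        backlog = [];
        return index;
      }

      write({ id: 3, inReplyTo: 1 });      // before the build: goes straight to the table
      var index = buildIndexConcurrently();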
  36. CREATE INDEX CONCURRENTLY ON comments (in_reply_to);

  37. None
  38. enter_sandman → “Say your prayers, little one / don’t forget, my son, to include everyone”

      Tokenization / Stemming / Indexing / …

      term    | occurrences
      --------+------------------
      dont    | enter_sandman:5
      everyon | enter_sandman:11
      forget  | enter_sandman:6
      includ  | enter_sandman:10
      littl   | enter_sandman:3
      my      | enter_sandman:7
      on      | enter_sandman:4
      prayer  | enter_sandman:2
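A minimal sketch of the tokenization → stemming → indexing step shown above. The suffix-stripping stemmer here is a crude stand-in; real search engines use proper stemmers, stopword lists and synonym handling.

      // Build a tiny inverted index: term → list of "document:position" occurrences.
      function stem(word) {
        return word.replace(/s$/, "").replace(/e$/, "");  // crude: strip trailing "s", then "e"
      }

      function tokenize(text) {
        return text.toLowerCase().replace(/[^a-z\s]/g, "").split(/\s+/).filter(Boolean);
      }

      function indexDocument(index, docId, text) {
        tokenize(text).forEach(function (token, position) {
          var term = stem(token);
          if (!index[term]) index[term] = [];
          index[term].push(docId + ":" + position);
        });
        return index;
      }

      var index = indexDocument({}, "enter_sandman",
        "Say your prayers, little one / don't forget, my son, to include everyone");
      // index["prayer"] → ["enter_sandman:2"], index["dont"] → ["enter_sandman:5"], …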
  39. Natural language (tokenization, stemming, synonyms)
      Relevance ranking
      Search facets
      Spelling correction & suggestions
      → Warrants being separate from database
  40. None
  41. None
  42. Consistency: staleness, aborted transactions, race conditions
      Bootstrapping: cold start after failure, new feature → new access pattern
  43. Shirshanka Das, Chavdar Botev, Kapil Surlaker, et al.: “All Aboard the Databus!,”
      at ACM Symposium on Cloud Computing, 2012. http://www.socc2012.org/s18-das.pdf
  44. None
  45. [Diagram: Raw data, with “derive & maintain” arrows to Caches, DB Indexes and Search Indexes]

  46. [Timeline diagram, as on slides 30-35: snapshot of table data, CREATE INDEX
      CONCURRENTLY…, writes continuing over time]
  47. [Timeline diagram: a second “snapshot of table data” is added]
  48. [Timeline diagram: a second “snapshot of table data” is added]
  49. [Timeline diagram, as on slides 30-35]
  50. [Timeline diagram, as on slides 30-35]
  51. [Timeline diagram, as on slides 30-35]
  52. Derived data can always be re-generated from raw data
      Derived data optimized for reading, raw data optimized for writing (normalized)
      Many different datasets derived from one raw dataset
      Update latency depends on application (ACID/online < ‘nearline’ < offline)
  53. None
  54. Consistency: staleness, aborted transactions, race conditions
      Bootstrapping: cold start after failure, new feature → new access pattern
  55. None
  56. Consistency: staleness, aborted transactions, race conditions
      Bootstrapping: cold start after failure, new feature → new access pattern
      Just build a new one!
  57. Consistency: staleness, aborted transactions, race conditions
      Bootstrapping: cold start after failure, new feature → new access pattern
      Just build a new one! Build a new one regularly!
  58. Consistency: staleness, aborted transactions, race conditions
      Bootstrapping: cold start after failure, new feature → new access pattern
      Just build a new one! Build a new one regularly! Fast stream process
  59. None
  60. [Architecture diagram: Data to write flows both into “All data” (append-only), from
      which a Batch process builds a Precomputed view (cache), and into a Message queue,
      from which a Stream process maintains a Real-time diff to the view; the Reader
      consults both]
  61. [Same diagram, with examples: All data → e.g. HDFS; Batch process → e.g. Hadoop
      MapReduce; Precomputed view → e.g. Voldemort; Message queue → e.g. Kafka;
      Stream process → e.g. Storm; Real-time diff to view → e.g. Memcached]
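A toy sketch of how these pieces fit together (invented data shapes, with plain objects standing in for HDFS, Hadoop, Voldemort, Kafka, Storm and Memcached):

      // Raw, append-only event log (the "all data" box).
      var log = [
        { type: "comment", id: 1, inReplyTo: null },
        { type: "comment", id: 2, inReplyTo: 1 }
      ];

      // Batch process: rebuild the whole precomputed view from scratch.
      function batchRebuild(log) {
        var view = {};
        log.forEach(function (event) {
          view[event.inReplyTo] = (view[event.inReplyTo] || 0) + 1;  // replies per comment
        });
        return view;
      }

      // Stream process: maintain a small real-time diff for events that arrived
      // after the last batch run.
      function applyToDiff(diff, event) {
        diff[event.inReplyTo] = (diff[event.inReplyTo] || 0) + 1;
        return diff;
      }

      // Reader: merge the precomputed view with the real-time diff.
      function replyCount(view, diff, commentId) {
        return (view[commentId] || 0) + (diff[commentId] || 0);
      }

      var view = batchRebuild(log);                                         // slow, run periodically
      var diff = applyToDiff({}, { type: "comment", id: 3, inReplyTo: 1 }); // fast, per event
      console.log(replyCount(view, diff, 1)); // 2: one from the batch view, one from the stream
      // After the next batch rebuild includes event 3, the diff is discarded and rebuilt.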
  62. No such thing as a cache miss
      After every batch rebuild, discard stream output → “human fault tolerance”
      Not feasible if O(n²) [e.g. friends in common] or unbounded [e.g. search]
      How does stream process know which keys to update (when using joins)?
  63. None
  64. None
  65. None