Deep dive into near Real time Map-Reduce with Views

Presentation on the internal architecture of the Couchbase Map-Reduce view indexing engine.

Sarath Lakshman

October 07, 2014

Transcript

  1. Overview

    §  Overview of incremental Map/Reduce
    §  Architecture of Couchbase views
    §  Database Change Protocol (DCP) for Views
    §  Index on-disk storage lower layer rewrites
    §  Faster index updates and indexing latency
    §  Read Your Own Writes (RYOW) for view queries

    ©2014 Couchbase, Inc.
  2. What are Map/Reduce views?

    §  In Couchbase, Map-Reduce is used specifically to create indexes
    §  Map functions are applied to JSON documents, and their output ("emit" data) is stored in an index
  3. Map function

    Sample view for players of tetris with level > 100

        function (doc, meta) {
          if (doc.game == "tetris" && doc.level > 100) {
            emit([doc.level, meta.id], doc.score);
          }
        }
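
To see what this map function puts into the index, it can be run outside Couchbase with a small harness. The sketch below is only an illustration: `runMap`, `tetrisMap` and the sample documents are hypothetical, and `emit` is passed in as a parameter because the real `emit` is supplied by the view engine.

```javascript
// Hypothetical harness: applies a map function to each document and collects
// the emitted rows. In Couchbase, emit() is provided by the view engine.
function runMap(mapFn, docs) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  for (const { meta, doc } of docs) {
    mapFn(doc, meta, emit);
  }
  return rows;
}

// The map function from the slide, with emit passed in explicitly so that the
// sketch runs on its own.
function tetrisMap(doc, meta, emit) {
  if (doc.game == "tetris" && doc.level > 100) {
    emit([doc.level, meta.id], doc.score);
  }
}

// Made-up sample documents.
const sampleDocs = [
  { meta: { id: "u1" }, doc: { game: "tetris", level: 105, score: 10 } },
  { meta: { id: "u2" }, doc: { game: "chess", level: 300, score: 99 } },
  { meta: { id: "u3" }, doc: { game: "tetris", level: 80, score: 70 } },
  { meta: { id: "u4" }, doc: { game: "tetris", level: 101, score: 60 } },
];

const emittedRows = runMap(tetrisMap, sampleDocs);
console.log(JSON.stringify(emittedRows));
```

Only the tetris documents with a level above 100 produce rows; the chess document and the level-80 document emit nothing.
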
  4. Reduce function

    Finding maximum score

        function (keys, values, rereduce) {
          // Taking a maximum works the same way for reduce and rereduce,
          // so the rereduce flag is not inspected. Assumes scores are >= 0.
          var maxscore = 0;
          for (var i = 0; i < values.length; i++) {
            if (values[i] > maxscore) {
              maxscore = values[i];
            }
          }
          return maxscore;
        }
  5. Incremental reduction

    emitted values:  [105,u1], 10   [110,u2], 20   [105,u3], 50   [101,u4], 60
    reduce:          20   60
    re-reduce:       60
  6. Incremental reduction

    emitted values:  [105,a], 200   [110,b], 20   [105,c], 50   [101,d], 60
    reduce:          200   60
    re-reduce:       200
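
The two slides above can be replayed in code. This is a minimal sketch, assuming the four emitted values are reduced in two groups of two (which matches the intermediate results 20/60 and 200/60 on the slides); `maxReduce` is the reduce function from the deck given a name, and `reduceGroup` is a hypothetical helper.

```javascript
// The reduce function from the deck, named so that it can be called directly.
function maxReduce(keys, values, rereduce) {
  var maxscore = 0;
  for (var i = 0; i < values.length; i++) {
    if (values[i] > maxscore) {
      maxscore = values[i];
    }
  }
  return maxscore;
}

// Hypothetical helper: reduce one group of emitted rows.
function reduceGroup(rows) {
  return maxReduce(rows.map((r) => r.key), rows.map((r) => r.value), false);
}

const rows = [
  { key: [105, "u1"], value: 10 },
  { key: [110, "u2"], value: 20 },
  { key: [105, "u3"], value: 50 },
  { key: [101, "u4"], value: 60 },
];

// First pass: reduce each group, then re-reduce the partial results.
const groups = [rows.slice(0, 2), rows.slice(2, 4)];
const partials = groups.map(reduceGroup);
const total = maxReduce(null, partials, true);

// The first row changes to 200. Only its group is reduced again; the partial
// result of the untouched group is reused in the re-reduce step.
groups[0][0] = { key: [105, "u1"], value: 200 };
const updatedPartials = [reduceGroup(groups[0]), partials[1]];
const updatedTotal = maxReduce(null, updatedPartials, true);

console.log(partials, total, updatedPartials, updatedTotal);
```

The point of the incremental scheme is the last step: an update touches one group and the re-reduce, not every emitted value.
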
  7. Design documents

    [Diagram: a Couchbase bucket containing Design Document 1 (two views) and Design Document 2 (three views)]

    §  Indexers are allocated per design document
    §  Indexers can only access data in the bucket namespace
    §  All views in a design document are updated at the same time
  8. Copy On Write (COW) – Append-only B+ tree

    [Diagram: a tree made of Key-Value and Key-Pointer nodes: Root, KP1, KP2, KV1, KV2, KV3]
    Append-only file: KV1 KV2 KV3 KP1 KP2 Root
  9. Copy On Write (COW) – Append-only B+ tree

    B-tree state after an update on KV3
    [Diagram: new nodes KV3#, KP2# and Root# next to the original Root, KP1, KP2, KV1, KV2, KV3]
    Append-only file: KV1 KV2 KV3 KP1 KP2 Root KV3# KP2# Root#
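
The append-only update on these two slides can be sketched as path copying. The code below is an illustration, not the on-disk format: the "file" is an array whose indexes stand in for file offsets, the tree layout (KV1 and KV2 under KP1, KV3 under KP2) is an assumption, and `append`, `cowUpdate` and `leafValues` are hypothetical names.

```javascript
// Append-only "file": nodes are only ever pushed, never modified in place.
function append(file, node) {
  file.push(node);
  return file.length - 1; // the index plays the role of a file offset
}

const file = [];
const kv1 = append(file, { name: "KV1", value: "a" });
const kv2 = append(file, { name: "KV2", value: "b" });
const kv3 = append(file, { name: "KV3", value: "c" });
const kp1 = append(file, { name: "KP1", children: [kv1, kv2] });
const kp2 = append(file, { name: "KP2", children: [kv3] });
const root = append(file, { name: "Root", children: [kp1, kp2] });

// Copy-on-write update: append a new leaf, then a new copy of every ancestor
// on the path, each pointing at the freshly written child.
// path holds offsets from the root down to the leaf, e.g. [root, kp2, kv3].
function cowUpdate(file, path, newValue) {
  let oldOffset = path[path.length - 1];
  let newOffset = append(file, {
    name: file[oldOffset].name + "#",
    value: newValue,
  });
  for (let i = path.length - 2; i >= 0; i--) {
    const parent = file[path[i]];
    const children = parent.children.map((c) => (c === oldOffset ? newOffset : c));
    oldOffset = path[i];
    newOffset = append(file, { name: parent.name + "#", children });
  }
  return newOffset; // offset of the new root
}

// Collect leaf values reachable from a given root offset.
function leafValues(file, offset) {
  const node = file[offset];
  if (!node.children) return [node.value];
  return node.children.flatMap((c) => leafValues(file, c));
}

const newRoot = cowUpdate(file, [root, kp2, kv3], "c2");
console.log(file.map((n) => n.name).join(" "));
console.log(leafValues(file, root), leafValues(file, newRoot));
```

The old root still describes the tree as it was before the update, and the nodes it no longer shares with the new root are the kind of stale data that compaction later removes.
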
  10. View engine index maintenance

    [Diagram: index update pipeline: a JSON document passes through Loader, Mapper and Writer]

    §  A view index update is performed every five seconds if there are at least 5000 document changes
    §  The update pipeline is invoked for each design document
  11. View engine index maintenance

    [Diagram: index update pipeline: a JSON document passes through Loader, Mapper and Writer; the emitted values are KV1 and KV2]

    §  A view index update is performed every five seconds if there are at least 5000 document changes
    §  The update pipeline is invoked for each design document
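
The trigger rule and the per-design-document pipeline can be sketched as follows. This is a simplified reading of the slides, not the engine's code: `shouldRunUpdate`, `runUpdatePipeline`, the design document and the sample changes are all hypothetical, and the trigger is assumed to mean "both conditions must hold".

```javascript
const UPDATE_INTERVAL_MS = 5000; // "every five seconds" (from the slide)
const MIN_CHANGES = 5000;        // "at least 5000 doc changes" (from the slide)

// Assumed reading of the trigger: the interval has elapsed and enough
// changes have accumulated.
function shouldRunUpdate(msSinceLastUpdate, pendingChanges) {
  return msSinceLastUpdate >= UPDATE_INTERVAL_MS && pendingChanges >= MIN_CHANGES;
}

// One pipeline run per design document: every changed document is loaded once
// and fed to the map function of every view in that design document.
function runUpdatePipeline(designDoc, changedDocs) {
  const emitted = {};
  for (const viewName of Object.keys(designDoc.views)) {
    emitted[viewName] = [];
  }
  for (const { meta, doc } of changedDocs) {                              // loader
    for (const [viewName, mapFn] of Object.entries(designDoc.views)) {    // mapper
      const emit = (key, value) => emitted[viewName].push({ key, value }); // writer
      mapFn(doc, meta, emit);
    }
  }
  return emitted;
}

// Made-up design document with two views.
const gamesDesignDoc = {
  views: {
    tetris_by_level: (doc, meta, emit) => {
      if (doc.game == "tetris" && doc.level > 100) {
        emit([doc.level, meta.id], doc.score);
      }
    },
    by_game: (doc, meta, emit) => emit(doc.game, null),
  },
};

const changes = [
  { meta: { id: "u1" }, doc: { game: "tetris", level: 105, score: 10 } },
  { meta: { id: "u2" }, doc: { game: "chess", level: 3, score: 99 } },
];

const pipelineOutput = runUpdatePipeline(gamesDesignDoc, changes);
console.log(JSON.stringify(pipelineOutput));
```

Because a single pass feeds every view of the design document, all of its views advance together.
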
  12. View engine index maintenance

    [Diagram: Index Writer: Batcher, Sorter, WAL, on-disk B-tree build/update]

    Initial build: bottom-up B-tree build
    Incremental update: bulk update
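
A bottom-up build, as named on the slide for the initial build, constructs the tree level by level from sorted input instead of inserting keys one at a time. The sketch below shows the general technique in memory; the node shape, the `fanout` parameter and `buildBottomUp` are hypothetical and say nothing about the actual file layout.

```javascript
// Build a B+ tree-like structure from already sorted keys: pack the keys into
// leaves, then pack each level into parent nodes until one root remains.
function buildBottomUp(sortedKeys, fanout) {
  if (sortedKeys.length === 0) return { root: null, height: 0 };

  let level = [];
  for (let i = 0; i < sortedKeys.length; i += fanout) {
    const keys = sortedKeys.slice(i, i + fanout);
    level.push({ leaf: true, keys, firstKey: keys[0] });
  }

  let height = 1;
  while (level.length > 1) {
    const parents = [];
    for (let i = 0; i < level.length; i += fanout) {
      const children = level.slice(i, i + fanout);
      parents.push({ leaf: false, children, firstKey: children[0].firstKey });
    }
    level = parents;
    height++;
  }
  return { root: level[0], height };
}

// The sorter stage comes first: the builder relies on sorted input.
const unsortedKeys = [7, 3, 10, 1, 9, 4, 2, 8, 6, 5];
const sortedKeys = unsortedKeys.slice().sort((a, b) => a - b);
const tree = buildBottomUp(sortedKeys, 3);
console.log(tree.height, tree.root.children.length);
```

Each node is written exactly once, which is what makes this approach a natural fit for an append-only file.
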
  13. View queries

    [Diagram: a query request arrives at a Couchbase cluster of Node 1, Node 2 and Node 3]
  14. View queries

    [Diagram: a query request arrives at a Couchbase cluster of Node 1, Node 2 and Node 3; the query coordinator scatters requests to the nodes]
  15. View queries

    [Diagram: a query request arrives at a Couchbase cluster of Node 1, Node 2 and Node 3; the query coordinator scatters requests, gathers the results through a K-way stream merger queue, and returns the query results]
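
The gather step can be sketched as a k-way merge of per-node result streams that are each already sorted by key. This is an illustration only: `compareKeys` is a simplified stand-in for the real view collation rules, and `kWayMerge` and the per-node rows are hypothetical. A linear scan over the stream heads is used instead of a priority queue to keep the sketch short.

```javascript
// Simplified key comparison for array keys such as [level, id].
// Real view collation is more involved; this is an assumption for the sketch.
function compareKeys(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

// Merge k sorted streams by repeatedly taking the smallest head element.
function kWayMerge(streams, limit = Infinity) {
  const positions = streams.map(() => 0);
  const merged = [];
  while (merged.length < limit) {
    let best = -1;
    for (let s = 0; s < streams.length; s++) {
      if (positions[s] >= streams[s].length) continue; // stream exhausted
      if (
        best === -1 ||
        compareKeys(streams[s][positions[s]].key, streams[best][positions[best]].key) < 0
      ) {
        best = s;
      }
    }
    if (best === -1) break; // every stream is exhausted
    merged.push(streams[best][positions[best]]);
    positions[best]++;
  }
  return merged;
}

// Made-up sorted results from three nodes.
const node1 = [{ key: [101, "u4"], value: 60 }, { key: [110, "u2"], value: 20 }];
const node2 = [{ key: [105, "u1"], value: 10 }];
const node3 = [{ key: [105, "u3"], value: 50 }];

const mergedRows = kWayMerge([node1, node2, node3]);
console.log(mergedRows.map((r) => r.value));
```

With a `limit`, the coordinator can stop as soon as enough rows have been merged, without draining every stream.
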
  16. Index fragmentation and compaction

    [Diagram: the Compactor reads the View-1 and View-2 B-trees from the old index file (Index-file.1) and writes them to the compacted index file (Index-file.2)]
  17. Index fragmentation and compaction

    [Diagram: the View-1 and View-2 B-trees in the old index file (Index-file.1) and in the compacted index file (Index-file.2); labels: Updater, WAL, Apply deltas]
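
One way to read these two diagrams: the compactor copies only live entries into a new file, and changes logged by the updater in the meantime are applied on top as deltas. The sketch below models that reading with plain arrays. It is an assumption about the flow, not the engine's implementation, and `liveEntries`, `compact` and `applyDeltas` are hypothetical names.

```javascript
// An append-only index file modeled as a list of entries; for each key the
// latest entry wins, and a deleted marker removes the key.
function liveEntries(entries) {
  const live = new Map();
  for (const entry of entries) {
    if (entry.deleted) {
      live.delete(entry.key);
    } else {
      live.set(entry.key, entry.value);
    }
  }
  return live;
}

// Compaction: write a new file that holds only the live entries.
function compact(oldFile) {
  return [...liveEntries(oldFile)].map(([key, value]) => ({ key, value }));
}

// Changes logged while compaction was running are replayed on top of the
// compacted file.
function applyDeltas(compactedFile, wal) {
  return compact(compactedFile.concat(wal));
}

// Made-up fragmented file: "a" was overwritten and "b" was deleted.
const indexFile1 = [
  { key: "a", value: 1 },
  { key: "b", value: 2 },
  { key: "a", value: 3 },
  { key: "b", deleted: true },
  { key: "c", value: 4 },
];
// Made-up deltas recorded during compaction.
const wal = [{ key: "c", value: 5 }, { key: "d", value: 6 }];

const compacted = compact(indexFile1);
const indexFile2 = applyDeltas(compacted, wal);
console.log(JSON.stringify(indexFile2));
```
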
  18. Database Change Protocol and Views

    §  Document updates are delivered to the view engine in near real time
    §  Disk reads from the document storage engine are not required
    §  Faster index creation
    §  Lower indexing latencies
    §  The view engine can roll back during node failures without a full index rebuild
  19. Rewrite of index engine in storage layer

    §  Lock contention and bottlenecks in Erlang VM
    §  View engine needs to be faster in order to consume and process database changes through DCP
    §  Rewritten index builder, index updater and index compactor
    §  Future work will improve rebalance duration with views
  20. Benchmark on indexing latency

    Time taken (ms) for an updated document to get indexed in a view index
    4 nodes, 1 bucket, 20M docs of size 2KB, 250 mutations/sec

    Couchbase 2.5.1    34916 ms
    Couchbase 3.0        597 ms
  21. Query options for staleness / consistency

    §  stale=ok
        §  Lowest query latency
        §  Returns the query results from index storage
        §  By default, the index is incrementally updated every 5 seconds
    §  stale=update_after
        §  Similar to stale=ok
        §  Forces the indexer to perform an index update immediately as part of the query, ignoring the index update interval
    §  stale=false
        §  Higher query latency
        §  Triggers an index update and waits for the index to be updated at least up to the current state of the document store
        §  Query results are returned only after the index is updated
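
The staleness option is passed as a query parameter on the view request. The sketch below builds such a URL, assuming the standard view REST endpoint on port 8092; `viewQueryUrl` is a hypothetical helper, and the host, bucket, design document and view names are made up.

```javascript
// The three values described on the slide.
const STALE_OPTIONS = ["ok", "update_after", "false"];

// Hypothetical helper that builds a view query URL. Assumes the view REST
// endpoint http://<host>:8092/<bucket>/_design/<ddoc>/_view/<view>.
function viewQueryUrl({ host, bucket, designDoc, view, stale }) {
  if (!STALE_OPTIONS.includes(String(stale))) {
    throw new Error("stale must be one of: " + STALE_OPTIONS.join(", "));
  }
  const path = [bucket, "_design", designDoc, "_view", view]
    .map((part) => encodeURIComponent(part))
    .join("/");
  return "http://" + host + ":8092/" + path + "?stale=" + stale;
}

const queryUrl = viewQueryUrl({
  host: "localhost",            // made-up host
  bucket: "games",              // made-up bucket
  designDoc: "players",         // made-up design document
  view: "tetris_by_level",      // made-up view
  stale: "false",
});
console.log(queryUrl);
```
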
  22. Read Your Own Writes (RYOW)

    §  Performing RYOW in Couchbase 2.5
        §  stale=false only ensured that persisted documents are available in the view query results
        §  The user should ensure that a document has been persisted, using the Observe command, before querying
    §  RYOW using Couchbase 3.0
        §  stale=false ensures that query results are at least up to date as of the point in time of the request
    §  How the view engine ensures at-least point-in-time consistency
        §  The document dataset is partitioned into smaller sets across the cluster
        §  Each partition has a sequence number that is incremented on every update operation on a document
        §  During a stale=false query, the view engine notes the current sequence numbers of all partitions and waits until the index is updated at least up to those sequence numbers
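
The sequence-number check described above can be sketched in a few lines. This is a minimal model, not the engine's code: partitions are keys of a plain object, and `snapshotSeqnos`, `indexCaughtUp` and the numbers are hypothetical.

```javascript
// Record the current sequence number of every partition at query time.
function snapshotSeqnos(partitionSeqnos) {
  return { ...partitionSeqnos };
}

// The index may serve the query once it has processed at least the
// snapshotted sequence number in every partition.
function indexCaughtUp(indexedSeqnos, snapshot) {
  return Object.keys(snapshot).every(
    (partition) => (indexedSeqnos[partition] || 0) >= snapshot[partition]
  );
}

// Made-up state: three partitions, with the index lagging on partition 1.
const storeSeqnos = { 0: 12, 1: 7, 2: 30 };
const indexedSeqnos = { 0: 12, 1: 5, 2: 30 };

// A stale=false query arrives and notes the sequence numbers.
const snapshot = snapshotSeqnos(storeSeqnos);
const readyBefore = indexCaughtUp(indexedSeqnos, snapshot);

// A later write bumps partition 1, but the query only waits for its snapshot.
storeSeqnos[1] = 9;
indexedSeqnos[1] = 8; // the indexer has now passed the snapshotted value 7
const readyAfter = indexCaughtUp(indexedSeqnos, snapshot);

console.log(readyBefore, readyAfter);
```

Waiting on the snapshot rather than on the live sequence numbers is what makes the guarantee "at least point-in-time": writes that arrive after the request do not delay it.
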