
NoSQL Landscape: Speed, Scale, and JSON


A tour of NoSQL use cases, a survey of database options, and the Couchbase architecture. Also covers how to develop with JSON document databases and how to build Couchbase map-reduce indexes.

Rap song not included.

Chris Anderson

October 06, 2012

Transcript

  1. Growth is the New Reality: 2.2 billion internet users; 50% of Americans
     use smartphones. Your app can grow overnight. Are you ready?
  2. Example: Instagrowth on the Android launch. Instagram gained nearly 1
     million users overnight when it expanded to Android.
  3. Draw Something, a social game: 35 million monthly active users within a
     month, about 5 Instagrams at the time (Instagram today is waaaay more
     than 1 Instagram).
  4. Goes Viral 3 Weeks after Launch. (Chart: Draw Something by OMGPOP,
     daily active users in millions, February 6 through March 1; 35+M MAU at
     peak.)
  5. By contrast, at 1/2 an Instagram: The Simpsons: Tapped Out. (Chart:
     daily active users in millions.)
  6. Scalable Data Layer: on-demand cluster sizing, to grow or shrink with
     the workload; easy node provisioning, since all nodes are the same;
     multi-master cross-datacenter replication, for a fast and reliable user
     experience worldwide; and effective auto-sharding, which should avoid
     cluster hot spots.
  7. Old School Hits a Scale Wall: the application tier scales out (just add
     more commodity web servers), but the database scales up (get a bigger,
     more complex server). Sharding is expensive and disruptive, and it
     doesn't perform at web scale.
  8. Traditional MySQL + Memcached Architecture: run as many MySQL machines
     as you need, with data sharded evenly across them by client code;
     memcached provides faster response times for users and reduces load on
     the database. (Diagram: www.example.com, app servers, memcached tier,
     MySQL tier.)
  9. Limitations of MySQL + Memcached: to scale, you need to start using
     MySQL more simply; you scale by hand; replication and sharding are a
     black art; there is code overhead to keep memcached and MySQL in sync;
     and there are lots of components to deploy. Learn from others: this
     scenario costs time and money, and scaling SQL is potentially disastrous
     when going viral. It is a very risky time for major code changes and
     migrations, and you have no time when skyrocketing up. (A sketch of the
     sync overhead follows below.)
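     To make the sync overhead concrete, here is a minimal cache-aside sketch
     in JavaScript; the mc and sql client objects are hypothetical stand-ins,
     not from the deck:

         // Hypothetical clients: mc speaks memcached, sql talks to MySQL.
         async function getUser(id) {
           const key = "user:" + id;
           const cached = await mc.get(key);              // 1. try the cache
           if (cached) return JSON.parse(cached);
           const rows = await sql.query("SELECT * FROM users WHERE id = ?", [id]);
           const user = rows[0];                          // 2. fall back to MySQL
           await mc.set(key, JSON.stringify(user), 300);  // 3. repopulate, 300s TTL
           return user;
         }

         async function updateUser(id, fields) {
           await sql.query("UPDATE users SET ? WHERE id = ?", [fields, id]);
           await mc.delete("user:" + id);   // invalidate, or reads go stale
         }

     Every read and write path carries this bookkeeping, which is the code
     overhead the slide warns about.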
  10. NoSQL Architectural Promise: high-performance data access, horizontal
      scaling up and down, 24x7x365 always-on availability, and a
      flexible-schema document model. (Diagram: www.example.com, app servers,
      Couchbase database servers.)
  11. The Key-Value Store, the foundation of NoSQL. (Diagram: a key maps to
      an opaque binary value.)
  12. Memcached, the NoSQL precursor: in-memory only, with a limited set of
      operations. Blob storage: Set, Add, Replace, CAS. Retrieval: Get.
      Structured data: Append, Increment. "Simple and fast." Challenges: cold
      cache, disruptive elasticity. (Diagram: a key maps to an opaque binary
      value.) The operations are sketched below.
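      The operations named above, as a hedged sketch against a hypothetical
      async mc client (the command names are standard memcached; the client
      API is made up):

          await mc.set("greeting", "hello");         // unconditional write
          await mc.add("greeting", "hi");            // fails: key already exists
          await mc.replace("greeting", "hi there");  // succeeds: key exists
          await mc.append("greeting", ", world");    // extend the stored blob
          await mc.incr("pageviews", 1);             // atomic counter

          // CAS: optimistic concurrency on an opaque value.
          const { value, cas } = await mc.gets("greeting");
          await mc.cas("greeting", value + "!", cas); // fails if another writer won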
  13. Redis, with more "structured data" commands: in-memory only, with a
      vast set of operations. Blob storage: Set, Add, Replace, CAS.
      Retrieval: Get, Pub-Sub. Structured data: strings, hashes, lists, sets,
      sorted sets. Example operations for a set: add, count, subtract sets,
      intersection, is-member?, atomic move from one set to another.
      (Diagram: a key maps to a data structure: blob, list, set, hash, ...)
      These map onto Redis commands as sketched below.
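      In the sketch below, send() is a hypothetical helper that issues a raw
      Redis command; the commands themselves (SADD, SCARD, SDIFF, SINTER,
      SISMEMBER, SMOVE) are real:

          await send("SADD", "beers:ipa", "stone", "lagunitas");  // add members
          await send("SADD", "beers:stocked", "stone", "chimay");
          await send("SCARD", "beers:ipa");                       // count -> 2
          await send("SDIFF", "beers:ipa", "beers:stocked");      // subtract sets
          await send("SINTER", "beers:ipa", "beers:stocked");     // intersection
          await send("SISMEMBER", "beers:ipa", "stone");          // member? -> 1
          await send("SMOVE", "beers:stocked", "beers:sold_out",
                     "chimay");                                   // atomic move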
  14. Membase, from key-value cache to database: disk-based, with a built-in
      memcached cache that refills on restart; memcached compatible (a
      drop-in replacement); highly available (data replication); add or
      remove capacity on a live cluster. "Simple, fast, elastic." (Diagram: a
      key maps to an opaque binary value.)
  15. NoSQL catalog (so far):

                               Key-Value   Data Structure   Document   Column   Graph
      Cache (memory only)      memcached   redis
      Database (memory/disk)   membase
  16. Couchbase, a document-oriented database: auto-sharding; disk-based,
      with a built-in memcached cache that refills on restart; memcached
      compatible (a drop-in replacement); highly available (data
      replication); add or remove capacity on a live cluster. When values are
      JSON objects ("documents"), you can create indexes and views and query
      against the views. (Diagram: a key maps to a JSON object, e.g.
      { "string": "string", "string": value,
        "string": { "string": "string", "string": value },
        "string": [ array ] }.)
  17. NoSQL catalog, updated:

                               Key-Value   Data Structure   Document    Column   Graph
      Cache (memory only)      memcached   redis
      Database (memory/disk)   membase                      couchbase
  18. MongoDB, a document-oriented database: disk-based with in-memory
      "caching"; BSON ("binary JSON") format and wire protocol; master-slave
      replication; auto-sharding. Values are BSON objects. Supports ad hoc
      queries, which work best when indexed. (Diagram: a key maps to a BSON
      object ("document").) An example query follows below.
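      For a flavor of "ad hoc queries, best when indexed", a mongo-shell
      sketch (the shell is JavaScript; collection and field names are made
      up):

          // Any shape of query works without schema changes...
          db.beers.find({ abv: { $gt: 7 }, category: "Belgian and French Ale" })

          // ...but it only stays fast at scale when backed by an index.
          db.beers.ensureIndex({ category: 1, abv: 1 })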
  19. NoSQL catalog, updated:

                               Key-Value   Data Structure   Document             Column   Graph
      Cache (memory only)      memcached   redis
      Database (memory/disk)   membase                      couchbase, mongoDB
  20. Cassandra, column overlays: a disk-based, clustered system; external
      caching is required for low-latency reads. "Columns" are overlaid on
      the data, and not all rows must have all columns. Supports efficient
      queries on columns, but a restart is required when adding columns. Good
      cross-datacenter support. (Diagram: a row with Column 1, Column 2, and
      Column 3 not present.)
  21. NoSQL catalog, updated:

                               Key-Value   Data Structure   Document             Column      Graph
      Cache (memory only)      memcached   redis
      Database (memory/disk)   membase                      couchbase, mongoDB   cassandra
  22. Neo4j, a graph database: a disk-based system; external caching is
      required for low-latency reads. Nodes, relationships, and paths, with
      properties on nodes. Operations: delete, insert, traverse, etc.
  23. NoSQL catalog, complete:

                               Key-Value   Data Structure   Document             Column      Graph
      Cache (memory only)      memcached   redis
      Database (memory/disk)   membase                      couchbase, mongoDB   cassandra   Neo4j
  24. The Landscape, plotted on axes of speed and scale: Couchbase, Redis,
      S3, Cassandra, MongoDB, Riak, HBase, CouchDB, Neo4j, SimpleDB,
      memcached, RDBMS, Datomic.
  25. (Really) High Performance: latency under half a millisecond, and
      throughput grows linearly with cluster size; 5 nodes sustained 1.75M
      operations per second in a Cisco and Solarflare benchmark of Couchbase
      Server.
  26. Couchbase Server Basic Operation: docs are distributed evenly across
      the servers in the cluster. Each server stores both active and replica
      docs; only one server is active for a given doc at a time. The client
      library gives the app a simple interface to the database, and a cluster
      map tells it which server a doc is on; the app never needs to know. The
      app reads, writes, and updates docs, and multiple app servers can
      access the same document at the same time. (Diagram: app servers 1 and
      2, each with a Couchbase client library and cluster map, over a
      three-server cluster holding the active docs.) A sketch of the
      key-to-server mapping follows below.
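      The cluster map idea, roughly: Couchbase clients hash each key into a
      fixed set of vBuckets, and the map assigns vBuckets to servers. The
      hash and assignment below are simplified stand-ins for illustration,
      not the real client implementation:

          // Simplified: real clients hash keys (CRC32) into 1024 vBuckets.
          const NUM_VBUCKETS = 1024;

          function toyHash(key) {              // stand-in for CRC32
            let h = 0;
            for (const c of key) h = (h * 31 + c.charCodeAt(0)) >>> 0;
            return h;
          }

          // Cluster map: vBucket -> server, pushed to clients on topology change.
          const clusterMap = ["server1:11210", "server2:11210", "server3:11210"];

          function serverFor(key) {
            const vbucket = toyHash(key) % NUM_VBUCKETS;
            return clusterMap[vbucket % clusterMap.length];  // toy assignment
          }
          // serverFor("beer_Enlightened_Black_Ale") -> node with the active doc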
  27.-28. (Builds of slide 26: the diagram gains replica docs on each
      server, then read/write/update arrows from the app servers; the bullet
      text is unchanged.)
  29. Add Nodes to the Cluster: two servers are added to the cluster in a
      one-click operation. Docs are automatically rebalanced across the
      cluster, with even distribution and minimum doc movement. The cluster
      map is updated, and app database calls are now distributed over a
      larger number of servers. (Diagram: servers 4 and 5 join the cluster;
      active and replica docs migrate onto them.)
  30.-32. (Builds of slide 29: servers 4 and 5 appear, docs migrate to them,
      then read/write/update traffic resumes across all five servers; the
      bullet text is unchanged.)
  33. Fail Over Node: app servers are happily accessing docs on server 3
      when the server fails, and requests to it start failing. The cluster
      detects that the server is down, promotes replicas of its docs to
      active, and updates the cluster map. App server requests for those docs
      now go to the appropriate servers. Typically a rebalance would follow.
      (Diagram: server 3 drops out of the five-node cluster; its docs
      reappear elsewhere as promoted replicas.)
  34.-37. (Builds of slide 33 stepping through the failover: requests fail,
      replicas are promoted, and the cluster map is updated; the bullet text
      is unchanged.)
  38. War Story: EBS Outage. Suddenly, disk writes all began to time out,
      and many services experienced outages: Foursquare, Reddit, and Quora,
      among others. With memory-buffered writes, a scalable data layer keeps
      working: when EBS came back online, Couchbase wrote all the updated
      data to disk without missing a beat.
  39. Cross-Datacenter Replication: data close to users, multiple locations
      for disaster recovery, and independently managed clusters serving local
      data. (Diagram: US, Europe, and Asia data centers replicating to one
      another.)
  40. Document Database. "This synergy between the programming model and the
      distribution model is very valuable. It allows the database to use its
      knowledge of how the application programmer clusters the data to help
      performance across the cluster."
      http://martinfowler.com/bliki/AggregateOrientedDatabase.html

      Example aggregate, order o::1001:
      { uid: ji22jd,
        customer: Ann,
        line_items: [
          { sku: 0321293533, quan: 3, unit_price: 48.0 },
          { sku: 0321601912, quan: 1, unit_price: 39.0 },
          { sku: 0131495054, quan: 1, unit_price: 51.0 } ],
        payment: { type: Amex, expiry: 04/2001, last5: 12345 } }
  41. JSON Documents: map more closely to an external API; CRUD operations
      with a lightweight schema; stored under an identifier key.

      { "fields" : ["with basic types", 3.14159, true],
        "like" : "your favorite language" }

      client.set("mydocumentid", myDocument);
      mySavedDocument = client.get("mydocumentid");
  42. Meta + Document Body. The document is user data and can be anything:

      { "brewery": "New Belgium Brewing",
        "name": "1554 Enlightened Black Ale",
        "abv": 5.5,
        "description": "Born of a flood...",
        "category": "Belgian and French Ale",
        "style": "Other Belgian-Style Ales",
        "updated": "2010-07-22 20:00:20" }

      The metadata holds the unique ID, expiration, and so on:

      { "id": "beer_Enlightened_Black_Ale", ... }

      (The "vintage" date format comes from an SQL dump >_<)
  43. Add comments to the beer. The beer doc links to its comments:

      { "brewery": "New Belgium Brewing",
        "name": "1554 Enlightened Black Ale",
        "abv": 5.5,
        "description": "Born of a flood...",
        "category": "Belgian and French Ale",
        "style": "Other Belgian-Style Ales",
        "updated": "2010-07-22 20:00:20",
        "ratings": { "525": 5, "30": 4, "1044": 2 },
        "comment_ids": [ "f1e62", "6ad8c" ] }

      and each comment doc (here id "f1e62") links back to the beer:

      { "type": "comment",
        "about_id": "beer_Enlightened_Black_Ale",
        "user_id": 525,
        "text": "tastes like college!",
        "updated": "2010-07-22 20:00:20" }
  44. How to look up comments from a beer.

      Serialized loop (one get per comment):
        beer = client.get("beer:A_cold_one");
        beer.comment_ids.each { |id| comments.push(client.get(id)); }

      Fast multi-key lookup:
        beer = client.get("beer:A_cold_one");
        comments = client.multiGet(beer.comment_ids);

      Async view query:
        comments = client.query("myapp", "by_comment_on",
                                {:key => "beer:A_cold_one"});

      (Serialized-loop figure:
      http://www.ibm.com/developerworks/webservices/library/ws-sdoarch/)
  45. Emergent Schema: JSON.org, the GitHub API, the Twitter API. "Capture
      the user's intent." The database can handle it; your app controls the
      schema.
  46. Find documents by a specific attribute: let's find beers by
      brewery_id! (A sketch of the map function follows below.)
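      The slide's screenshot did not survive the transcript; a minimal
      Couchbase map function for this index might look like the following
      (Couchbase 2.0-style function(doc, meta) signature; the type guard
      assumes the app stamps docs with a type field):

          // Map: index beer docs by their brewery_id.
          function (doc, meta) {
            if (doc.type == "beer" && doc.brewery_id) {
              emit(doc.brewery_id, doc.name);  // key: brewery_id, value: name
            }
          }
          // Query with ?key="<brewery_id>" to fetch that brewery's beers.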
  47. Use a built-in reduce function with a group query: let's find the
      average abv for each brewery! (A sketch follows below.)
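      Again the screenshot is missing; one plausible shape uses the built-in
      _stats reduce (the built-ins are _count, _sum, and _stats) and computes
      the average from its sum and count:

          // Map: emit abv keyed by brewery.
          function (doc, meta) {
            if (doc.type == "beer" && doc.brewery_id && doc.abv) {
              emit(doc.brewery_id, doc.abv);
            }
          }
          // Reduce: _stats  (returns sum, count, min, max, sumsqr).
          // Query with ?group=true for one row per brewery_id;
          // average abv = row.value.sum / row.value.count.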
  48. group_level=3 gives daily results, great for graphing. Daily, hourly,
      minute, or second rollups are all possible with the same index.
      http://crate.im/posts/couchbase-views-reddit-data/ (The key trick is
      sketched below.)
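      The trick is the shape of the key. Couchbase's view engine provides a
      dateToArray() helper that turns a timestamp into [year, month, day,
      hour, minute, second], so group_level picks the rollup granularity. A
      sketch, assuming docs carry an updated timestamp:

          // Map: key is [year, month, day, hour, minute, second].
          function (doc, meta) {
            if (doc.updated) {
              emit(dateToArray(doc.updated), 1);
            }
          }
          // Reduce: _count (or _sum over the emitted value).
          // ?group_level=3 -> one row per [year, month, day]  (daily)
          // ?group_level=4 -> hourly, and so on, all from the same index.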
  49. GeoCouch R-Tree Index: optimized for bulk loading of large data sets,
      with a simplified query model (bounding box, nearest neighbor). (A
      sketch follows below.)
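      A minimal GeoCouch-style spatial function, assuming docs carry a
      GeoJSON geometry field (the field name is an assumption):

          // Spatial view: index docs by their GeoJSON geometry.
          function (doc) {
            if (doc.geometry) {
              emit(doc.geometry, doc.name);
            }
          }
          // Query with a bounding box:  ?bbox=-122.5,37.7,-122.3,37.9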
  50. Elastic Search Adapter: Elastic Search is good for ad hoc queries and
      faceted browsing. Our adapter is aware of changing Couchbase topology,
      and documents are indexed by Elastic Search after they are stored to
      disk in Couchbase.