
NoSQL Landscape: Speed, Scale, and JSON

NoSQL use cases, a survey of database options, and Couchbase architecture, plus how to develop with JSON document databases and how to build Couchbase map-reduce indexes.

Rap song not included.

Chris Anderson

October 06, 2012

Transcript

  1. @jchris
    Chris Anderson
    NoSQL Landscape
    & Grid Computing

  2. Growth is the New Reality
    • 2.2 billion internet users
    • 50% of Americans use smartphones
    • Your app can grow overnight
    • Are you ready?

  3. Instagrowth: Android Launch
    Example
    • Instagram gained nearly 1 million users overnight when they expanded to Android

  4. Instagrowth: Android Launch
    Example
    1 Instagram = 7.5M MAU*

  5. Draw Something - Social Game
    35 million monthly active users in 1 month
    about 5 Instagrams
    (Instagram today is waaaay more than 1 Instagram)

  6. Goes Viral 3 Weeks after Launch
    [Chart: Draw Something by OMGPOP - Daily Active Users (millions), 2/6 through 3/1, 35+M MAU at peak]

  7. By Contrast, at 1/2 an Instagram
    [Chart: The Simpsons: Tapped Out - Daily Active Users (millions)]

  8. GETTING IT RIGHT

  9. Scalable Data Layer
    ● On-demand cluster sizing
      - Grow or shrink with workload
    ● Easy node provisioning
      - All nodes are the same
    ● Multi-master Cross-Datacenter Replication
      - For a fast and reliable user experience worldwide
    ● Effective Auto-sharding
      - Should avoid cluster hot spots

  10. Old School Hits a Scale Wall
    Application Scales Out
      Just add more commodity web servers
    Database Scales Up
      Get a bigger, more complex server
      Expensive & disruptive sharding, doesn't perform at web scale

  11. Traditional MySQL + Memcached Architecture
    ● Run as many MySQL machines as you need
    ● Data sharded evenly across the machines using client code
    ● Memcached used to provide faster response time for users and reduce load on the database
    [Diagram: www.example.com, App Servers, Memcached Tier, MySQL Tier]

  12. Limitations of MySQL + Memcached
    ● To scale you need to start using MySQL more simply
    ● Scale by hand
    ● Replication / sharding is a black art
    ● Code overhead to manage keeping memcached and MySQL in sync (see the sketch after this slide)
    ● Lots of components to deploy
    Learn from others - this scenario costs time and money. Scaling SQL is potentially disastrous when going viral: it is a very risky time for major code changes and migrations, and you have no time when skyrocketing up.
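
    A minimal sketch of the cache-aside pattern this slide describes, in JavaScript. The cache and db objects and their method names are hypothetical stand-ins for a memcached client and a sharded MySQL client; the point is that the application owns both the key scheme and the invalidation logic.

      // Read path: check memcached first, fall back to MySQL, repopulate the cache.
      async function getUser(userId) {
        const key = "user:" + userId;              // app code owns the key scheme
        const cached = await cache.get(key);       // hypothetical memcached client
        if (cached) return JSON.parse(cached);
        const row = await db.query(                // hypothetical sharded MySQL client
          "SELECT * FROM users WHERE id = ?", [userId]);
        await cache.set(key, JSON.stringify(row), 300);  // TTL only bounds staleness
        return row;
      }

      // Write path: the app must remember to invalidate (or update) the cache itself.
      async function updateUserEmail(userId, email) {
        await db.query("UPDATE users SET email = ? WHERE id = ?", [email, userId]);
        await cache.del("user:" + userId);         // forget this and reads go stale
      }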

  13. NoSQL Architectural Promise
    [Diagram: www.example.com, App Servers, Couchbase Database Servers]
    • High performance data access
    • Scale up/down horizontally
    • 24x7x365 always-on availability
    • Flexible schema document model

  14. NOSQL TAXONOMY

  15. The Key-Value Store - the foundation of NoSQL
    [Diagram: Key -> Opaque Binary Value]

  16. Memcached - the NoSQL precursor
    [Diagram: Key -> Opaque Binary Value]
    In-memory only
    Limited set of operations
      Blob storage: Set, Add, Replace, CAS
      Retrieval: Get
      Structured data: Append, Increment
    "Simple and fast."
    Challenges: cold cache, disruptive elasticity
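
    To make that "limited set of operations" concrete, here is an illustrative JavaScript sketch. The mc object is a hypothetical memcached client whose method names mirror the protocol verbs; real client libraries differ in signatures.

      await mc.set("greeting", "hello", 60);        // store with a 60-second TTL
      await mc.add("greeting", "hi");               // add fails: key already exists
      await mc.replace("greeting", "hello world");  // replace succeeds only if key exists
      await mc.append("greeting", "!");             // append to the opaque blob
      await mc.get("greeting");                     // -> "hello world!"

      await mc.set("counter", "41");
      await mc.incr("counter", 1);                  // atomic increment -> 42

      // CAS (check-and-set): read a version token, write only if nobody else has written.
      const { value, cas } = await mc.gets("counter");          // gets() returns a cas token
      await mc.cas("counter", String(Number(value) + 1), cas);  // rejected on conflict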

  17. Redis - More "Structured Data" commands
    [Diagram: Key -> "Data Structures": Blob, List, Set, Hash]
    In-memory only
    Vast set of operations
      Blob storage: Set, Add, Replace, CAS
      Retrieval: Get, Pub-Sub
      Structured data: Strings, Hashes, Lists, Sets, Sorted sets
    Example operations for a Set:
      Add, count, subtract sets, intersection, is member?, atomic move from one set to another
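
    The set operations listed above map directly onto Redis commands (SADD, SCARD, SDIFF, SINTER, SISMEMBER, SMOVE). A short sketch using a generic Node-style client object named redis, with made-up keys and members:

      await redis.sadd("beers:belgian", "1554", "Trippel", "Abbey");  // add members
      await redis.sadd("beers:dark", "1554", "Oatmeal Stout");
      await redis.scard("beers:belgian");                   // count -> 3
      await redis.sdiff("beers:belgian", "beers:dark");     // subtract sets
      await redis.sinter("beers:belgian", "beers:dark");    // intersection -> ["1554"]
      await redis.sismember("beers:dark", "1554");          // is member? -> 1
      await redis.smove("beers:dark", "beers:favorites", "1554");  // atomic move between sets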

  18. NoSQL catalog
    Data models: Key-Value | Data Structure | Document | Column | Graph
    Cache (memory only): memcached (Key-Value), redis (Data Structure)

  19. Membase - From key-value cache to database
    [Diagram: Key -> Opaque Binary Value]
    Disk-based with built-in memcached cache
    Cache refill on restart
    Memcached compatible (drop-in replacement)
    Highly available (data replication)
    Add or remove capacity to live cluster
    "Simple, fast, elastic."

  20. NoSQL catalog
    Data models: Key-Value | Data Structure | Document | Column | Graph
    Cache (memory only): memcached (Key-Value), redis (Data Structure)
    Database (memory/disk): membase (Key-Value)

  21. Couchbase - document-oriented database
    [Diagram: Key -> JSON object ("document")]
    {
      "string" : "string",
      "string" : value,
      "string" : { "string" : "string",
                   "string" : value },
      "string" : [ array ]
    }
    Auto-sharding
    Disk-based with built-in memcached cache
    Cache refill on restart
    Memcached compatible (drop-in replacement)
    Highly available (data replication)
    Add or remove capacity to live cluster
    When values are JSON objects ("documents"):
      create indices, views, and query against the views

  22. NoSQL catalog
    Data models: Key-Value | Data Structure | Document | Column | Graph
    Cache (memory only): memcached (Key-Value), redis (Data Structure)
    Database (memory/disk): membase (Key-Value), couchbase (Document)

  23. MongoDB - document-oriented database
    [Diagram: Key -> BSON object ("document")]
    {
      "string" : "string",
      "string" : value,
      "string" : { "string" : "string",
                   "string" : value },
      "string" : [ array ]
    }
    Disk-based with in-memory "caching"
    BSON ("binary JSON") format and wire protocol
    Master-slave replication
    Auto-sharding
    Values are BSON objects
    Supports ad hoc queries - best when indexed

  24. NoSQL catalog
    Data models: Key-Value | Data Structure | Document | Column | Graph
    Cache (memory only): memcached (Key-Value), redis (Data Structure)
    Database (memory/disk): membase (Key-Value), couchbase and mongoDB (Document)

  25. Cassandra - Column overlays
    Disk-based system
    Clustered
    External caching required for low-latency reads
    "Columns" are overlaid on the data
      Not all rows must have all columns
      Supports efficient queries on columns
      Restart required when adding columns
    Good cross-datacenter support
    [Diagram: a row with Column 1, Column 2, Column 3 (not present)]

  26. NoSQL catalog
    Data models: Key-Value | Data Structure | Document | Column | Graph
    Cache (memory only): memcached (Key-Value), redis (Data Structure)
    Database (memory/disk): membase (Key-Value), couchbase and mongoDB (Document), cassandra (Column)

  27. Neo4j - Graph database
    Disk-based system
    External caching required for low-latency reads
    Nodes, relationships, and paths
    Properties on nodes
    Delete, Insert, Traverse, etc.

  28. NoSQL catalog
    Data models: Key-Value | Data Structure | Document | Column | Graph
    Cache (memory only): memcached (Key-Value), redis (Data Structure)
    Database (memory/disk): membase (Key-Value), couchbase and mongoDB (Document), cassandra (Column), Neo4j (Graph)

  29. The Landscape
    [Chart: databases plotted on Speed vs. Scale axes - Couchbase, Redis, S3, Cassandra, MongoDB, Riak, HBase, CouchDB, Neo4j, SimpleDB, memcached, RDBMS, Datomic]

  30. Datomic - immutable functional data

  31. Hello Couchbase Server 2.0

  32. Couchbase Server 2.0 beta

  33. Couchbase handles real-world scale

  34. (Really) High Performance
    Latency: less than 1/2 ms
    Throughput: grows linearly with cluster size
      5 nodes -- 1.75M operations per second
    Cisco and Solarflare benchmark of Couchbase Server

  35. How fast?
    http://www.slideshare.net/renatko/couchbase-performance-benchmarking

  36. Latency

  37. Latency

  38. COMPLEXITY IS THE ENEMY

  39. Couchbase Server Basic Operation
    • Docs distributed evenly across servers in the cluster
    • Each server stores both active & replica docs
      • Only one server active at a time
    • Client library provides app with a simple interface to the database
    • Cluster map provides a map of which server a doc is on
      • App never needs to know
    • App reads, writes, updates docs
    • Multiple app servers can access the same document at the same time
    [Diagram: two app servers, each with the Couchbase client library and a cluster map, read/write/update docs against a three-server Couchbase cluster; each server holds active docs and replica docs]


  42. Add Nodes to the Cluster
    • Two servers added to cluster
      • One-click operation
    • Docs automatically rebalanced across cluster
      • Even distribution of docs
      • Minimum doc movement
    • Cluster map updated
    • App database calls now distributed over larger # of servers
    [Diagram: Servers 4 and 5 join the cluster; active and replica docs rebalance across all five servers while both app servers keep reading and writing through the updated cluster map]


  46. Fail Over Node
    • App servers happily accessing docs on Server 3
    • Server fails
    • App server requests to Server 3 fail
    • Cluster detects server has failed
      • Promotes replicas of docs to active
      • Updates cluster map
    • App server requests for docs now go to the appropriate server
    • Typically a rebalance would follow
    [Diagram: Server 3 goes down; replicas of its docs on the surviving servers are promoted to active and the cluster map on each app server is updated]


  51. War Story: EBS Outage
    ● Suddenly, disk writes all began to time out
    ● Many services experienced outages:
      ● Foursquare, Reddit, Quora, among others
    ● With memory-buffered writes, a scalable data layer keeps working
      ● When EBS came back online, Couchbase wrote all the updated data to disk without missing a beat.

  52. Cross Data Center Replication
    • Data close to users
    • Multiple locations for disaster recovery
    • Independently managed clusters serving local data
    [Diagram: US, Europe, and Asia data centers linked by replication]

  53. Built for Production

  54. JSON DOCUMENT DATABASE

  55. Document Database as Aggregate Database
    http://martinfowler.com/bliki/AggregateOrientedDatabase.html

  56. Document Database
    "This synergy between the programming model and the distribution model is very valuable. It allows the database to use its knowledge of how the application programmer clusters the data to help performance across the cluster."
    http://martinfowler.com/bliki/AggregateOrientedDatabase.html
    o::1001
    {
      uid: ji22jd,
      customer: Ann,
      line_items: [
        { sku: 0321293533, quan: 3, unit_price: 48.0 },
        { sku: 0321601912, quan: 1, unit_price: 39.0 },
        { sku: 0131495054, quan: 1, unit_price: 51.0 }
      ],
      payment: { type: Amex, expiry: 04/2001, last5: 12345 }
    }

  57. Developers <3 JSON

  58. LET'S GET POST-RELATIONAL!

  59. JSON Documents
    • Maps more closely to external API
    • CRUD operations, lightweight schema
    • Stored under an identifier key
    {
      "fields" : ["with basic types", 3.14159, true],
      "like" : "your favorite language"
    }
    client.set("mydocumentid", myDocument);
    mySavedDocument = client.get("mydocumentid");

  60. Meta + Document Body
    Document - user data, can be anything:
    {
      "brewery": "New Belgium Brewing",
      "name": "1554 Enlightened Black Ale",
      "abv": 5.5,
      "description": "Born of a flood...",
      "category": "Belgian and French Ale",
      "style": "Other Belgian-Style Ales",
      "updated": "2010-07-22 20:00:20"
    }
    Metadata - unique ID, identifier, expiration, etc.:
    {
      "id" : "beer_Enlightened_Black_Ale",
      ...
    }
    "vintage" date format from an SQL dump >_<

  61. Add comments to the beer
    The beer document links to its comments via "comment_ids":
    {
      "brewery": "New Belgium Brewing",
      "name": "1554 Enlightened Black Ale",
      "abv": 5.5,
      "description": "Born of a flood...",
      "category": "Belgian and French Ale",
      "style": "Other Belgian-Style Ales",
      "updated": "2010-07-22 20:00:20",
      "ratings" : {
        "525" : 5,
        "30" : 4,
        "1044" : 2
      },
      "comment_ids" : [
        "f1e62",
        "6ad8c"
      ]
    }
    Each comment (stored under its id, e.g. "f1e62") links back to the beer via "about_id":
    {
      "type": "comment",
      "about_id": "beer_Enlightened_Black_Ale",
      "user_id": 525,
      "text": "tastes like college!",
      "updated": "2010-07-22 20:00:20"
    }

  62. How to: look up comments from a beer
    • SERIALIZED LOOP
      beer = client.get("beer:A_cold_one");
      comments = []
      beer.comment_ids.each { |id|
        comments.push(client.get(id));
      }
    • FAST MULTI-KEY LOOKUP
      beer = client.get("beer:A_cold_one");
      comments = client.multiGet(beer.comment_ids)
    • ASYNC VIEW QUERY
      comments = client.query("myapp", "by_comment_on",
        {:key => "beer:A_cold_one"});
    (figure credit: http://www.ibm.com/developerworks/webservices/library/ws-sdoarch/)

  63. Emergent Schema
    JSON.org, GitHub API, Twitter API
    "Capture the user's intent"
    • The database can handle it
    • Your app controls the schema

  64. Audience participation! *that means you
    npm install twitterfight
    npm start twitterfight

  65. @jchris
    http://www.couchbase.com/
    Chris Anderson
    Thank You!

  66. INCREMENTAL MAP-REDUCE FOR REALTIME ANALYTICS

  67. What do you mean "Incremental?"
    like:
    CREATE INDEX city ON brewery (city);
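
    The Couchbase 2.0 counterpart of that SQL index is a view with a JavaScript map function, maintained incrementally as documents change rather than rebuilt. A minimal sketch; the design document name, view name, and the doc.type guard are illustrative assumptions about the documents.

      // Design doc "brewery", view "by_city" - roughly CREATE INDEX city ON brewery (city)
      function (doc, meta) {
        if (doc.type == "brewery" && doc.city) {
          emit(doc.city, null);   // key: city; fetch the full doc later via meta.id
        }
      }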

  68. QUERY PATTERN: FIND BY ATTRIBUTE

  69. Find documents by a specific attribute
    • Let's find beers by brewery_id!

  70. The index definition
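
    The index definition shown on the slide boils down to a few lines of JavaScript. A sketch assuming beer documents carry type and brewery_id fields as in the earlier examples (design doc and view names are illustrative):

      // Design doc "beer", view "by_brewery_id"
      function (doc, meta) {
        if (doc.type == "beer" && doc.brewery_id) {
          emit(doc.brewery_id, doc.name);   // key: brewery_id, value: beer name
        }
      }

    Querying the view with ?key="<some_brewery_id>" returns just that brewery's beers, which is the result set on the next slide.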

  71. The result set: beers keyed by brewery_id

  72. QUERY PATTERN: BASIC AGGREGATIONS

  73. Use a built-in reduce function with a group query
    • Let's find average abv for each brewery!

  74. We are reducing doc.abv with _stats
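
    In code this is a map function that emits the brewery as key and abv as value, reduced with the built-in _stats function; _stats returns count, sum, min, max, and sumsqr, so average abv is sum / count on the client side. A sketch with illustrative names:

      // Design doc "beer", view "abv_by_brewery"
      // map:
      function (doc, meta) {
        if (doc.type == "beer" && doc.brewery_id && doc.abv) {
          emit(doc.brewery_id, doc.abv);
        }
      }
      // reduce (built-in):
      _stats

    Query with ?group=true to reduce per unique brewery_id, as the next slide shows.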

  75. Group reduce (reduce by unique key)

  76. QUERY PATTERN: TIME-BASED ROLLUPS WITH KEY GROUPING

  77. group_level=3 - daily results - great for graphing
    • Daily, hourly, minute, or second rollup all possible with the same index (see the sketch after this slide).
    • http://crate.im/posts/couchbase-views-reddit-data/
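
    The pattern behind this: emit the timestamp as an array so one index serves every granularity via group_level. A sketch assuming documents carry an "updated" timestamp like the beer examples; dateToArray is the Couchbase view helper for splitting a date into [year, month, day, hour, minute, second], or you can build that array yourself.

      // map:
      function (doc, meta) {
        if (doc.updated) {
          emit(dateToArray(doc.updated), 1);
        }
      }
      // reduce (built-in):
      _count

      // group_level picks the rollup at query time:
      //   ?group=true&group_level=1  -> yearly counts
      //   ?group=true&group_level=3  -> daily counts (as on this slide)
      //   ?group=true&group_level=5  -> per-minute counts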

  78. GEO INDEX & FULL TEXT INTEGRATION

  79. GeoCouch R-Tree Index
    • Optimized for bulk loading of large data sets
    • Simplified query model (bounding box, nearest neighbor)
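
    A spatial view looks much like a regular view except the map function emits a GeoJSON geometry as the key and queries pass a bounding box. A rough sketch only: the field names (doc.geo.lng, doc.geo.lat) are invented, and the exact function signature and query parameters should be checked against the GeoCouch docs for your version.

      // spatial view map function: the key is a geometry
      function (doc) {
        if (doc.geo && doc.geo.lng != null && doc.geo.lat != null) {
          emit({ type: "Point", coordinates: [doc.geo.lng, doc.geo.lat] }, doc.name);
        }
      }
      // queried with a bounding box, e.g. ?bbox=-122.5,37.7,-122.3,37.8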

  80. Elastic Search Adapter
    • Elastic Search is good for ad-hoc queries and faceted browsing
    • Our adapter is aware of changing Couchbase topology
    • Indexed by Elastic Search after being stored to disk in Couchbase

  81. @jchris
    http://www.couchbase.com/
    Chris Anderson
    Thank You!