
Softshake: Introduction to NoSQL with Couchbase

Presentation used during Softshake 2013

Tugdual Grall

October 24, 2013


Transcript

  1. {"about": "me"}
     • Tugdual "Tug" Grall
       - Couchbase - Technical Evangelist
       - eXo - CTO
       - Oracle - Developer/Product Manager
       - Mainly Java/SOA
       - Developer in consulting firms
     • Web
       - @tgrall
       - http://blog.grallandco.com
       - tgrall
     • NantesJUG co-founder
     • Pet Project: http://www.resultri.com
     • [email protected]
     • [email protected]
     Monday, October 28, 13
  2. Growth is the New Reality
     • Instagram gained nearly 1 million users overnight when it expanded to Android
  3. Draw Something by OMGPOP: 50 Million Users in 50 Days
     [Chart: Daily Active Users (millions) over time]
  4. How do you take this growth?
     RDBMS is good for many things, but hard to scale
     • Application Scales Out: just add more commodity web servers
     • RDBMS Scales Up: get a bigger, more complex server
     [Charts: System Cost and Application Performance vs. Users, for the Web/App Server Tier and for the Relational Database; the relational database chart is annotated "Won't scale beyond this point"]
  5. Scaling out RDBMS
     [Diagram: Web/App Server Tier, Memcached Tier, MySQL Tier]
     • Run many SQL servers
     • Data could be sharded
       - Done by the application code
     • Caching for faster response time
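The pattern on this slide (sharding in application code, with a cache in front of the database) can be sketched roughly as follows. This is only an illustration: the in-memory dictionaries stand in for the MySQL servers and the Memcached tier, and the names `SHARDS`, `cache`, `shard_for`, `read`, and `write` are hypothetical, not part of any product API.

```python
import zlib

# Hypothetical stand-ins: each dict plays the role of one MySQL server,
# and `cache` plays the role of the Memcached tier.
SHARDS = [dict() for _ in range(4)]
cache = {}


def shard_for(key: str) -> dict:
    """Pick the owning shard from a stable hash of the key."""
    return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]


def write(key: str, value: str) -> None:
    """Write to the owning shard and drop any stale cached copy."""
    shard_for(key)[key] = value
    cache.pop(key, None)


def read(key: str):
    """Cache-aside read: try the cache first, then the owning shard."""
    if key in cache:
        return cache[key]
    value = shard_for(key).get(key)
    if value is not None:
        cache[key] = value  # populate the cache for later reads
    return value
```

Note that all of the routing and invalidation logic lives in the application, which is the burden the slide points at.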
  6. NoSQL Technology Scales Out
     Scaling out flattens the cost and performance curves
     • Application Scales Out: just add more commodity web servers
     • NoSQL Database Scales Out: cost and performance mirror the app tier
     [Charts: System Cost and Application Performance vs. Users, for the Web/App Server Tier and for the NoSQL Distributed Data Store]
  7. A New Technology?
     • Dynamo: October 2007
     • Cassandra: August 2008
     • Bigtable: November 2006
     • Voldemort: February 2009
     Very few organizations want to (fewer can) build and maintain database software technology.
     But every organization building interactive web applications needs this technology.
     Building a new database to answer the following requirements:
     • No schema required before inserting data
     • No schema change required to change data format
     • Auto-sharding without application participation
     • Distributed queries
     • Integrated main memory caching
     • Data synchronization (multi-datacenter)
  8. What Is Biggest Data Management Problem Driving Use of NoSQL in Coming Year?
     • Lack of flexibility/rigid schemas: 49%
     • Inability to scale out data: 35%
     • Performance challenges: 29%
     • Cost: 16%
     • All of these: 12%
     • Other: 11%
     Source: Couchbase Survey, December 2011, n = 1351.
  9. NoSQL Catalog
     [Matrix: data model (Key-Value, Data Structure, Document, Column, Graph) against storage type (Cache, memory only; Database, memory/disk)]
     Products shown: Memcached, Membase, Redis, MongoDB, Couchbase, Cassandra, Neo4j, HBase, InfiniteGraph, Coherence
  10. Operational vs. Analytic Databases
      • Analytic Databases (get insights from data): Cloudera, Hortonworks, MapR
      • Real-time, Interactive Databases (fast access to data), NoSQL: Couchbase, MongoDB, Cassandra, HBase
  11. Relational vs Document Data Model
      • Relational data model: highly-structured table organization with rigidly-defined data formats and record structure.
      • Document data model: collection of complex documents with arbitrary, nested data formats and varying "record" format.
      [Diagram: a table with columns C1 to C4 next to a set of JSON documents]
  12. Document Database
      o::1001
      {
        uid: "ji22jd",
        customer: "Ann",
        line_items: [
          { sku: 0321293533, qty: 2, price: 48.0 },
          { sku: 0321601912, qty: 1, price: 39.0 },
          { sku: 0131495054, qty: 1, price: 51.0 }
        ],
        payment: { type: "Amex", expiry: "04/2001", last5: 12345 }
      }
      Aggregate Oriented Database
      http://martinfowler.com/bliki/AggregateOrientedDatabase.html
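To make the aggregate idea concrete, here is a small sketch that builds the same order as a Python dictionary, round-trips it through JSON, and computes a total from the nested line items. The `order_total` helper is hypothetical, and the SKUs are written as strings because a JSON number cannot carry a leading zero.

```python
import json

# The order from the slide, stored as one self-contained aggregate.
order = {
    "uid": "ji22jd",
    "customer": "Ann",
    "line_items": [
        {"sku": "0321293533", "qty": 2, "price": 48.0},
        {"sku": "0321601912", "qty": 1, "price": 39.0},
        {"sku": "0131495054", "qty": 1, "price": 51.0},
    ],
    "payment": {"type": "Amex", "expiry": "04/2001", "last5": 12345},
}


def order_total(doc: dict) -> float:
    """Sum qty * price over the nested line items (hypothetical helper)."""
    return sum(item["qty"] * item["price"] for item in doc["line_items"])


# The whole aggregate is written and read as a single JSON value.
serialized = json.dumps(order)
restored = json.loads(serialized)
```

Everything needed to display or price the order travels with the document, so no join is required to answer these questions.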
  13. Schema Update: RDBMS

      Speaker
      ID       Name
      rbin     Robin Johnson
      tgrall   Tug Grall
      johnz    John Zablo
      fweigel  Frank Weigel

      Event Info
      ID  Title      Speaker_id
      1   Paris JUG  rbin
      2   QCON UK    tgrall
      3   NoSQL CLN  johnz
      4   Devoxx     fweigel

      A talk could be done by one or more speakers!
      1. Create a new table
      2. "Prepare" the data (move to new table, drop column)
      3. Change the constraints
      4. Change the app
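The four steps might look like this in practice. The sketch uses Python's built-in `sqlite3` module with an in-memory database; the table and column names (`speaker`, `event`, `event_speaker`) are assumptions chosen to mirror the slide, and the old column is dropped by rebuilding the table, which is one common way to do it in SQLite.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Original schema: exactly one speaker per event.
    CREATE TABLE speaker (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE event (
        id INTEGER PRIMARY KEY,
        title TEXT,
        speaker_id TEXT REFERENCES speaker(id)
    );
    INSERT INTO speaker VALUES ('rbin', 'Robin Johnson'), ('tgrall', 'Tug Grall');
    INSERT INTO event VALUES (1, 'Paris JUG', 'rbin'), (2, 'QCON UK', 'tgrall');

    -- Steps 1 and 3: create a new link table with its own constraints.
    CREATE TABLE event_speaker (
        event_id INTEGER REFERENCES event(id),
        speaker_id TEXT REFERENCES speaker(id),
        PRIMARY KEY (event_id, speaker_id)
    );

    -- Step 2: move the data, then drop the old column by rebuilding the table.
    INSERT INTO event_speaker SELECT id, speaker_id FROM event;
    CREATE TABLE event_new (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO event_new SELECT id, title FROM event;
    DROP TABLE event;
    ALTER TABLE event_new RENAME TO event;
""")

# Step 4: the application now reads and writes speakers through the link table.
db.execute("INSERT INTO event_speaker VALUES (1, 'tgrall')")
speakers = [
    row[0]
    for row in db.execute(
        "SELECT speaker_id FROM event_speaker WHERE event_id = 1 ORDER BY speaker_id"
    )
]
```

Every step touches the shared schema, so the migration and the application change have to be coordinated.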
  14. Schema Update: Document
      Before:
      event:1 { type: "event", title: "Paris JUG", speaker: "rbin" }
      user:rbin { type: "user", name: "Robin Johnson" }
      After:
      event:1 { type: "event", title: "Paris JUG", speaker: ["rbin", "tgrall"] }
      1. Change the application
      • Update the data when "you" want
      • Change the index to query the new format
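Changing the application first, and the data "when you want", usually means the code has to accept both document shapes for a while. A minimal sketch, assuming the two formats shown on the slide; `speakers_of` and `add_speaker` are hypothetical helper names.

```python
def speakers_of(event: dict) -> list:
    """Return the speakers as a list, whether the document uses the old
    single-string format or the new list format."""
    speaker = event.get("speaker", [])
    return [speaker] if isinstance(speaker, str) else list(speaker)


def add_speaker(event: dict, speaker_id: str) -> dict:
    """Return a copy of the event in the new format with one more speaker.
    Old documents are upgraded only when they are actually modified."""
    updated = dict(event)
    updated["speaker"] = speakers_of(event) + [speaker_id]
    return updated


old_event = {"type": "event", "title": "Paris JUG", "speaker": "rbin"}
new_event = add_speaker(old_event, "tgrall")
```

Documents that are never edited can stay in the old format indefinitely, since every read goes through `speakers_of`.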
  15. Couchbase Server Core Principles
      • Easy Scalability: grow cluster without application changes, without downtime, with a single click
      • Consistent High Performance: consistent sub-millisecond read and write response times with consistent high throughput
      • Always On 24x365: no downtime for software upgrades, hardware maintenance, etc.
      • Flexible Data Model: JSON document model with no fixed schema
  16. Couchbase Architecture
      Data Manager
      • Moxi
      • Memcached
        - 11211: Memcapable 1.0
        - 11210: Memcapable 2.0
      • Couchbase EP Engine
      • storage interface
      • New Persistence Layer
      • Query Engine
        - 8092: Query API
      Cluster Manager (Erlang/OTP)
      • HTTP (8091): REST management API/Web UI
      • Erlang port mapper (4369)
      • Distributed Erlang (21100 - 21199)
      • On each node: Heartbeat, Process monitor, Configuration manager
      • One per cluster: Global singleton supervisor, Rebalance orchestrator, Node health monitor, vBucket state and replication manager
  17. Couchbase Architecture
      RAM Cache, Indexing & Persistence Management (C & V8)
      • Moxi
      • Object-level Cache
        - 11211: Memcapable 1.0
        - 11210: Memcapable 2.0
      • Couchbase EP Engine
      • storage interface
      • Disk Persistence, New Persistence Layer
      • Query Engine
        - 8092: Query API
      Server/Cluster Management & Communication (Erlang)
      • Erlang/OTP
      • HTTP (8091): REST management API/Web UI
      • Erlang port mapper (4369)
      • Distributed Erlang (21100 - 21199)
      • On each node: Heartbeat, Process monitor, Configuration manager
      • One per cluster: Global singleton supervisor, Rebalance orchestrator, Node health monitor, vBucket state and replication manager
      The Unreasonable Effectiveness of C by Damien Katz
  18. Cluster-wide Basic Operation
      • Docs distributed evenly across servers
      • Each server stores both active and replica docs
        - Only one server active at a time (for a given doc)
      • Client library provides app with simple interface to database
      • Cluster map provides map to which server doc is on
        - App never needs to know
      • App reads, writes, updates docs
      • Multiple app servers can access same document at same time
      [Diagram: Couchbase Server Cluster with Servers 1 to 3, each holding active and replica docs; App Servers 1 and 2 each use the Couchbase client library and its cluster map for READ/WRITE/UPDATE; user-configured replica count = 1]
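The cluster map idea can be illustrated with a short sketch. It assumes that keys are hashed into a fixed number of partitions (the vBuckets named on the architecture slide) and that the map records an active and a replica server for each one. The partition count of 16, the CRC32 hash, the round-robin layout, and all names below are simplifications for illustration, not the exact scheme used by Couchbase client libraries.

```python
import zlib

NUM_VBUCKETS = 16  # assumption: kept small so the map is easy to read
SERVERS = ["server1", "server2", "server3"]

# Cluster map: vBucket id -> (active server, replica server), replica count = 1.
# The replica always lives on a different server than the active copy.
cluster_map = {
    vb: (SERVERS[vb % len(SERVERS)], SERVERS[(vb + 1) % len(SERVERS)])
    for vb in range(NUM_VBUCKETS)
}


def vbucket_for(key: str) -> int:
    """Hash a document key to a vBucket id."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def active_server(key: str, cmap: dict) -> str:
    """Look up which server currently holds the active copy of a key."""
    return cmap[vbucket_for(key)][0]
```

Because the lookup happens inside the client library, application code only ever passes a key, for example `active_server("o::1001", cluster_map)`.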
  19. Add Nodes to Cluster
      • Two servers added
        - One-click operation
      • Docs automatically rebalanced across cluster
        - Even distribution of docs
        - Minimum doc movement
      • Cluster map updated
      • App database calls now distributed over larger number of servers
      [Diagram: Couchbase Server Cluster grown from Servers 1 to 3 to Servers 1 to 5, with active and replica docs spread over all five; App Servers 1 and 2 each use the Couchbase client library and its cluster map for READ/WRITE/UPDATE; user-configured replica count = 1]
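One way to picture "even distribution" with "minimum movement" is the sketch below. It assumes data is tracked in partitions (vBuckets) and moves a partition only when its current owner holds more than its fair share. The `rebalance` function is a hypothetical illustration of that goal, not the orchestrator's actual algorithm.

```python
def rebalance(assignment: dict, new_servers: list) -> dict:
    """Return a new vBucket -> server mapping that includes `new_servers`,
    is as even as possible, and moves as few vBuckets as possible."""
    servers = sorted(set(assignment.values())) + list(new_servers)
    quota, extra = divmod(len(assignment), len(servers))
    # Existing servers come first, so they keep any leftover vBuckets.
    limit = {s: quota + (1 if i < extra else 0) for i, s in enumerate(servers)}

    result, load, surplus = {}, {s: 0 for s in servers}, []
    for vb in sorted(assignment):
        owner = assignment[vb]
        if load[owner] < limit[owner]:
            result[vb] = owner          # stays where it is
            load[owner] += 1
        else:
            surplus.append(vb)          # owner is over its fair share

    for vb in surplus:
        target = next(s for s in servers if load[s] < limit[s])
        result[vb] = target             # moved to a server with spare room
        load[target] += 1
    return result


before = {vb: f"server{vb % 3 + 1}" for vb in range(15)}
after = rebalance(before, ["server4", "server5"])
moved = sum(1 for vb in before if before[vb] != after[vb])
```

With 15 partitions on three servers, adding two servers leaves each of the five with three partitions, and only the six partitions that land on the new servers are moved.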
  20. Fail Over Node
      • App servers accessing docs
      • Requests to Server 3 fail
      • Cluster detects server failed
        - Promotes replicas of docs to active
        - Updates cluster map
      • Requests for docs now go to appropriate server
      • Typically rebalance would follow
      [Diagram: Couchbase Server Cluster with Servers 1 to 5, Server 3 failed and its docs promoted from replicas on the other servers; App Servers 1 and 2 each use the Couchbase client library and its cluster map; user-configured replica count = 1]
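The promotion step can be sketched as a pure function over a cluster map. The map layout (partition id mapped to an active and a replica server) and the `fail_over` name are assumptions made for illustration; the real cluster manager does considerably more.

```python
def fail_over(cluster_map: dict, failed: str) -> dict:
    """Return an updated map in which replicas replace any active copies
    that lived on the failed server."""
    new_map = {}
    for vb, (active, replica) in cluster_map.items():
        if active == failed:
            # Promote the replica; no replica exists until a rebalance runs.
            new_map[vb] = (replica, None)
        elif replica == failed:
            # Active copy is unaffected, but its replica is gone.
            new_map[vb] = (active, None)
        else:
            new_map[vb] = (active, replica)
    return new_map


cluster_map = {
    0: ("server1", "server2"),
    1: ("server2", "server3"),
    2: ("server3", "server1"),
}
after_failover = fail_over(cluster_map, "server3")
```

The missing replicas in the result are why the slide notes that a rebalance would typically follow.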
  21. Indexing and Querying
      • Indexing work is distributed amongst nodes
      • Large data set possible
      • Parallelize the effort
      • Each node has index for data stored on it
      • Queries combine the results from required nodes
      [Diagram: Couchbase Server Cluster with Servers 1 to 3, each holding active and replica docs; a query from an app server, through the Couchbase client library and cluster map, fans out to the servers]
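A query over per-node indexes is essentially a scatter-gather: each node answers from its own sorted index, and the partial results are merged. A minimal sketch, with made-up index entries and a hypothetical `query` function:

```python
import heapq

# Hypothetical per-node indexes: each node indexes only the docs it stores,
# as sorted (index key, doc id) pairs.
node_indexes = {
    "server1": [("Ann", "o::1001"), ("Dan", "o::1004")],
    "server2": [("Bob", "o::1002")],
    "server3": [("Cid", "o::1003"), ("Eve", "o::1005")],
}


def query(start: str, end: str) -> list:
    """Ask every node for entries in [start, end], then merge the sorted
    partial results into one sorted answer."""
    partials = [
        [entry for entry in index if start <= entry[0] <= end]
        for index in node_indexes.values()
    ]
    return list(heapq.merge(*partials))
```

For example, `query("Bob", "Dan")` gathers one entry from each of the three nodes and returns them in key order.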
  22. Elasticsearch Full Text Search
      • Elasticsearch is good for ad-hoc queries and faceted browsing
      • Our adapter is aware of changing Couchbase topology
      • Documents are indexed by Elasticsearch after being stored to disk in Couchbase
  23. • Contact me on Twitter: @tgrall
      • Contact me by Email: [email protected]
      • IRC: #couchbase, #libcouchbase
      • Couchbase Docs: www.couchbase.com/docs/
      • Couchbase Forums: www.couchbase.com/communities
      • Meetup: http://meetup.com/Couchbase-France/
      • Mobile (beta): http://mobile.couchbase.com/
      • N1QL - New Query Engine (DP): http://query.couchbase.com