
Introduction to Couchbase

Alex Objelean
September 28, 2015


Transcript

  1. Agenda
     • Why NoSQL?
     • Couchbase Features and Concepts
     • Architecture
     • Accessing Data
     • Performance
     • Monitoring
     • Real World Use Cases
     • Conclusions
  2. Why NoSQL?
     Aspects to consider
     • Nature of Data
     • Application Development
     • Operational issues (data volume & scalability)
     • Data warehouse and analytics
  3. Couchbase Server
     Couchbase Server is a NoSQL document database for interactive web applications. It has a flexible data model, is easily scalable, provides consistently high performance, and is capable of serving application data with 100% uptime.
     It is a merger of two popular NoSQL technologies:
     • Membase - adds persistence, replication, and sharding to the high-performance memcached technology
     • CouchDB - pioneered the document-oriented model based on JSON
  4. Couchbase Features
     • Flexible Data Model
     • Easy Scalability
     • Easy Development Integration
     • Consistent high performance
     • Reliable and secure
  5. Couchbase - Concepts
     • Document Store
     • Data Buckets
     • vBuckets
     • Keys and Metadata
     • Couchbase SDK
  6. Couchbase - Concepts (Keys and Metadata)
     Keys - unique identifier of a document (similar to a SQL primary key)
     Metadata
     • CAS (compare and swap) - a basic form of optimistic concurrency
     • TTL - controls document expiry
     • Flags - a variety of options for additional document processing
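The CAS idea on this slide can be sketched with a toy in-memory store. This is only an illustration under stated assumptions: `ToyBucket`, `CasMismatch`, and `increment_visits` are hypothetical names, not the Couchbase SDK, and a plain counter stands in for the server-generated CAS value.

```python
import itertools


class CasMismatch(Exception):
    """Raised when a document changed since the caller last read it."""


class ToyBucket:
    """In-memory stand-in for a bucket (hypothetical, not the Couchbase SDK)."""

    def __init__(self):
        self._docs = {}                      # key -> (cas, value)
        self._next_cas = itertools.count(1)  # assumed CAS source: a simple counter

    def get(self, key):
        cas, value = self._docs[key]
        return value, cas

    def upsert(self, key, value):
        # Every write stamps the document with a fresh CAS value.
        cas = next(self._next_cas)
        self._docs[key] = (cas, value)
        return cas

    def replace(self, key, value, cas):
        # Write only if nobody else modified the document in the meantime.
        current_cas, _ = self._docs[key]
        if current_cas != cas:
            raise CasMismatch(key)
        return self.upsert(key, value)


def increment_visits(bucket, key, retries=5):
    """Read-modify-write loop that retries when the CAS check fails."""
    for _ in range(retries):
        doc, cas = bucket.get(key)
        updated = dict(doc, visits=doc["visits"] + 1)
        try:
            return bucket.replace(key, updated, cas)
        except CasMismatch:
            continue  # someone else won the race; re-read and try again
    raise RuntimeError("too many concurrent updates")


bucket = ToyBucket()
bucket.upsert("user::42", {"name": "Ada", "visits": 0})
increment_visits(bucket, "user::42")
```

No lock is held between the read and the write; a conflicting writer is detected only at write time, which is what makes the scheme optimistic.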
  7. Architecture
     • Clusters involving multiple machines
     • Data sharded across machines in a cluster
     • Clients connect to the appropriate servers
  8. Architecture
     • Hash sharding
     • 1024 partitions
     • Each partition assigned to a node
     • Auto rebalancing on topology change
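A minimal sketch of hash sharding over 1024 partitions follows. It rests on assumptions: the hash shown (CRC32 modulo the partition count) and the round-robin assignment are simplifications chosen for illustration, and the function names are hypothetical; the exact mapping used by Couchbase clients may differ.

```python
import zlib

NUM_VBUCKETS = 1024  # partition count from the slide


def vbucket_for(key: str) -> int:
    """Map a document key to a partition (assumed scheme: CRC32 mod 1024)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_vbucket_map(nodes):
    """Assign every partition to a node, round-robin (illustrative only)."""
    return [nodes[i % len(nodes)] for i in range(NUM_VBUCKETS)]


def node_for(key, vbucket_map):
    """Find the node that owns a key: hash to a partition, then look it up."""
    return vbucket_map[vbucket_for(key)]


nodes = ["node-a", "node-b", "node-c"]
vbucket_map = build_vbucket_map(nodes)
print(node_for("user::42", vbucket_map))

# Topology change: a node joins, and the partition map is rebuilt.
vbucket_map = build_vbucket_map(nodes + ["node-d"])
print(node_for("user::42", vbucket_map))
```

The key-to-partition step never changes; only the partition-to-node table does. Rebuilding the table from scratch, as done here, moves far more partitions than necessary, so a real rebalance would aim to relocate as few as possible.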
  9. Architecture
     • Resilience enhanced by document replication
     • Cluster Manager coordinates communication of replicated data
     • Data Manager supervises replica data assignment
     • Replica data distributed across all nodes in the cluster
     • Documents placed into buckets
     • Configurable number of replicas per bucket
     • Clients in constant communication with the server
     • Clients participate in load balancing
     • Clients react to topology changes
     • Orchestrator - elected node responsible for cluster configuration arbitration
     • Topology changes communicated to all clients
     • When the orchestrator goes down, a new node is elected
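Replica placement and failover can be sketched as a partition map in which each entry lists an active node followed by its replicas. This is an assumption-laden toy: `build_map` and `fail_over` are hypothetical helpers, the placement rule is a simple rotation, and the election of an orchestrator is not modeled.

```python
NUM_VBUCKETS = 1024


def build_map(nodes, num_replicas=1):
    """Return partition -> [active, replica, ...], each copy on a distinct node."""
    if num_replicas >= len(nodes):
        raise ValueError("need more nodes than replicas")
    return [
        [nodes[(vb + r) % len(nodes)] for r in range(num_replicas + 1)]
        for vb in range(NUM_VBUCKETS)
    ]


def fail_over(vbucket_map, dead_node):
    """Drop the dead node from every chain; the next copy becomes active."""
    return [[n for n in chain if n != dead_node] for chain in vbucket_map]


cluster = build_map(["node-a", "node-b", "node-c"], num_replicas=1)
after_failure = fail_over(cluster, "node-a")

# Partition 0 was active on node-a with a replica on node-b,
# so node-b is promoted to active after the failover.
print(cluster[0], "->", after_failure[0])
```

Clients that hold a copy of this map only need the updated version to start routing to the promoted node, which is the sense in which they react to topology changes.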
  10. Accessing Data
      • Key pattern
        ◦ Equivalent to map.get(key), with O(1) complexity
        ◦ Multi-get for batch retrieval
      • Views (Indexes)
        ◦ Mechanism for querying data
        ◦ Map-reduce functions (JavaScript)
        ◦ Eventually consistent
        ◦ Incremental updates
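The two access paths can be sketched side by side. Everything here is a stand-in: a Python dict plays the bucket, the sample documents are made up, and `query_view` recomputes the index from scratch on each call, whereas real views are written in JavaScript and updated incrementally.

```python
from collections import defaultdict

# Made-up sample documents, keyed by document ID.
docs = {
    "beer::1": {"type": "beer", "brewery": "b1", "abv": 5.0},
    "beer::2": {"type": "beer", "brewery": "b1", "abv": 6.5},
    "beer::3": {"type": "beer", "brewery": "b2", "abv": 4.2},
    "brewery::b1": {"type": "brewery", "name": "First Brewery"},
}


def multi_get(keys):
    """Batch retrieval: one lookup per key, missing keys are skipped."""
    return {k: docs[k] for k in keys if k in docs}


def map_fn(doc_id, doc):
    """Map step: emit (brewery, 1) for every beer document."""
    if doc.get("type") == "beer":
        yield doc["brewery"], 1


def reduce_fn(values):
    """Reduce step: count the emitted rows per key."""
    return sum(values)


def query_view(documents, map_fn, reduce_fn):
    """Run map over all documents, group by emitted key, then reduce."""
    grouped = defaultdict(list)
    for doc_id, doc in documents.items():
        for key, value in map_fn(doc_id, doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


print(docs["beer::1"])                     # key pattern: direct lookup
print(multi_get(["beer::1", "beer::9"]))   # multi-get
print(query_view(docs, map_fn, reduce_fn)) # beers per brewery
```

The key lookup always reflects the latest write, while a view is an index built from the documents, which is why view results can lag behind recent writes.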
  11. Performance and consistency (1)
      Traditional approach
      Only one node is primary:
      • Cannot leverage secondary nodes
      • Reading from secondary nodes:
        ◦ Fast but inconsistent (async replication)
        ◦ Consistent but slow (sync replication)
      Consistency - ensured by interaction (read/write operations) with primary nodes only.
  12. Performance and consistency (2)
      Consistency - ensured by interaction (read/write operations) with primary nodes only.
      Couchbase approach
      Every node is a primary node (for a subset of the data):
      • Ensured consistency
      • All nodes are leveraged
  13. Conclusions
      • Choose carefully (SQL vs NoSQL)
        ◦ Aspects to consider: scalability, performance, flexible data model
      • Couchbase - an important NoSQL player
        ◦ Document store
        ◦ Scales (up & down)
        ◦ Provides consistency and performance
      • Couchbase is not a silver bullet
        ◦ but it is a good match for certain types of applications