Slide 1

Slide 1 text

NoSQL Night! Singapore Spring@Pivotal User Group

Slide 2

Slide 2 text

About the Speaker
•  Clarence J M Tauro
–  [email protected]
–  Senior Instructor, Couchbase
–  ~11 Years Professional Teaching and Consulting Experience
–  Worked at Pivotal: Instructor/Consultant for Spring, Spring Security, Spring Web, Enterprise Integration with Spring, Spring JMS, Spring Batch, Pivotal Hadoop, and Cloud Foundry
–  PhD in Computer Science from Christ University [thesis accepted]
–  Hard-core Dog lover

Slide 3

Slide 3 text

Disclaimer •  Disclaimer: The views expressed in this presentation are our own and do not necessarily reflect the views of Couchbase

Slide 4

Slide 4 text

Objectives •  Introduction to NoSQL •  Are ACID Properties always desirable? •  Basically available, Soft state, Eventually consistent (BASE) •  The CAP Theorem •  Introducing Couchbase •  Couchbase Operations

Slide 5

Slide 5 text

Introduction
•  RDBMS: the predominant technology for storing structured data in web and business applications
•  “One size fits all” thinking about data stores has been questioned
•  Apply NoSQL databases for the persistence layer (Polyglot Programming)

Slide 6

Slide 6 text

ACID Properties • ATOMICITY • CONSISTENCY • ISOLATION • DURABILITY

Slide 7

Slide 7 text

Are ACID Properties always desirable? •  … But what about: –  Latency –  Partition Tolerance –  High Availability –  Scalability

Slide 8

Slide 8 text

BASE
•  Basically available: the system is available, but not necessarily all items in it at any given point in time
•  Soft state: information (state) the user put into the system will go away if the user doesn't maintain it
•  Eventually consistent: after a certain time all nodes are consistent, but at any given time this might not be the case
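The "soft state" idea can be illustrated with a minimal Python sketch: entries expire unless the caller keeps refreshing them. The class name `SoftStateStore`, its methods, and the 30-unit time-to-live are all hypothetical, and time is passed in explicitly as a logical clock so the example is deterministic. It is not how any particular database implements expiry.

```python
class SoftStateStore:
    """Toy store whose entries vanish unless they are refreshed (soft state)."""

    def __init__(self, ttl):
        self.ttl = ttl          # lifetime of an entry, in logical time units
        self.entries = {}       # key -> (value, expires_at)

    def put(self, key, value, now):
        self.entries[key] = (value, now + self.ttl)

    def get(self, key, now):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:   # nobody maintained the entry, so it goes away
            del self.entries[key]
            return None
        return value

    def refresh(self, key, now):
        """Extend the lifetime of a live entry; returns False if it already expired."""
        value = self.get(key, now)
        if value is None:
            return False
        self.entries[key] = (value, now + self.ttl)
        return True


sessions = SoftStateStore(ttl=30)
sessions.put("session::1234", "active", now=0)
at_10 = sessions.get("session::1234", now=10)       # still inside the first lifetime
refreshed = sessions.refresh("session::1234", now=20)  # lifetime now ends at 50
at_45 = sessions.get("session::1234", now=45)       # alive only because of the refresh
at_60 = sessions.get("session::1234", now=60)       # expired: state was not maintained
```

Without the refresh at time 20, the read at time 45 would already return nothing.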

Slide 9

Slide 9 text

NoSQL Common Traits •  Non-relational •  Schema-free/Schema-on-read •  Eventual consistency •  Open source •  Distributed •  “web-scale”

Slide 10

Slide 10 text

The CAP Theorem
•  Consistency – can all nodes see identical data, at all times?
•  Availability – can all nodes be read from and written to, at all times?
•  Partition Tolerance – will nodes function normally, even when the cluster breaks?
Consistency, Availability, Partition Tolerance: CHOOSE ANY TWO

Slide 11

Slide 11 text

The CAP Theorem
•  CP: Consistency and Partition Tolerance
-  Immediately consistent data across a horizontally scaled cluster, even with network problems
-  Couchbase
•  AP: Availability and Partition Tolerance
-  Always services requests, across multiple data centers, even with network problems; data is eventually consistent
-  Apache HBase or Cassandra, Couchbase (XDCR)
•  CA: Consistency and Availability
-  Always services requests with immediately consistent data, in a vertically scaled system
-  MySQL, Oracle, Microsoft SQL Server

Slide 12

Slide 12 text

What do you do with the Data?
Operational Use
•  Real-time intelligence
•  Focus on data flows and processes
•  Extremely fast (in-memory) reads
•  Extremely fast (log append) writes
•  Improve the current outcome
Analytical Use
•  Batched workloads
•  Vast data aggregations
•  Retrospective analyses
•  Focus on data pools
•  Improve future outcomes

Slide 13

Slide 13 text

Hadoop vs. NoSQL
•  NoSQL (Operational, Velocity): real-time operational database systems improve current outcomes
•  Hadoop (Analytical, Volume): batch-oriented analytical database systems improve future outcomes

Slide 14

Slide 14 text

Types of NoSQL •  Key-value stores •  Wide Column stores •  Document stores •  Graph databases

Slide 15

Slide 15 text

Key-Value Stores
•  The most common; not necessarily the most popular
•  Key and a simple value
-  Speed
-  Scale
-  Simplicity
•  Find simple values by key extremely fast
Example:
user::1234  Clarence
user::1235  Melisa
user::1236  Michael
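A minimal Python sketch of the key-value model, using the keys and values from the slide. The helper names `put`, `get`, and `delete` are illustrative only; an in-memory dictionary stands in for a real store.

```python
# Toy in-memory key-value store: every operation is addressed by key only.
store = {}

def put(key, value):
    store[key] = value

def get(key, default=None):
    return store.get(key, default)

def delete(key):
    """Remove a key; returns True if it existed."""
    if key in store:
        del store[key]
        return True
    return False


put("user::1234", "Clarence")
put("user::1235", "Melisa")
put("user::1236", "Michael")

name = get("user::1235")        # direct lookup by key
missing = get("user::9999")     # unknown keys yield the default
removed = delete("user::1236")
```

The store never inspects the value, which is what keeps the model simple and fast: there is nothing to query except the key.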

Slide 16

Slide 16 text

Document Stores
•  Key and a structured value (document)
-  Speed
-  Scale
-  Flexibility
•  Read/write ever-changing data about people, places, and things, at cloud-scale
Example:
user::1234  { name: 'Frank', age: 37, kids: ['Sue', 'Ann', 'Bob'] }
user::1235  { name: 'Carolyn', age: 56, kids: ['Tina'] }
user::1236  { name: 'Tessa', age: 24 }
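The difference from a plain key-value store is that the value has structure the application can look inside. A small Python sketch using the three documents from the slide follows; the `find` helper and the added `city` field are hypothetical, and a real document database would serve such lookups through its own query or index facilities rather than a scan.

```python
# Documents keyed by ID; each value is a structured, schema-free document.
documents = {
    "user::1234": {"name": "Frank", "age": 37, "kids": ["Sue", "Ann", "Bob"]},
    "user::1235": {"name": "Carolyn", "age": 56, "kids": ["Tina"]},
    "user::1236": {"name": "Tessa", "age": 24},   # no "kids" field at all
}

def find(predicate):
    """Return the sorted keys of all documents matching the predicate (full scan)."""
    return sorted(key for key, doc in documents.items() if predicate(doc))


parents = find(lambda doc: len(doc.get("kids", [])) > 0)
under_30 = find(lambda doc: doc["age"] < 30)

# Flexibility: one document can gain a field without touching the others.
documents["user::1236"]["city"] = "Singapore"   # hypothetical new field
```

Note that `user::1236` has no `kids` field and the lookup still works, which is the schema-on-read behaviour in miniature.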

Slide 17

Slide 17 text

Wide Column Stores
•  Key and nested set of tuples
-  Write vast volumes of data, with eventually consistent read access
Example:
user::1234  name: text = Frank  |  age: number = 37  |  kid: text = Sue, Ann, Bob
user::1235  name: text = Carolyn  |  age: number = 56  |  kid: text = Tina

Slide 18

Slide 18 text

Graph Databases
•  Linked list of keyed objects
-  Relationships
•  Monitor complex, dynamically networked connections
Example nodes:
user::1234  Frank, 37; linked to Sue, Ann, Bob
user::1235  Carolyn, 56; linked to Tina
user::1236  Tessa, 24

Slide 19

Slide 19 text

Polyglot Programming
•  Enterprises will have a variety of data storage technologies for different kinds of data
•  We need to ask how we want to manipulate the data. This will help us figure out which persistence technologies are appropriate
-  User Sessions: Couchbase (Memcached)/Redis
-  Financial Data: RDBMS
-  Shopping Cart: Riak/Couchbase (Memcached)
-  Recommendation Systems: Neo4J
-  Product Catalog: Couchbase/MongoDB
-  Reporting: RDBMS/Couchbase Views
-  Analytics: Couchbase/Cassandra

Slide 20

Slide 20 text

History of Couchbase
•  NorthScale developed a key-value storage engine
•  Apache CouchDB database project
•  Membase and CouchOne joined forces in February 2011 to create Couchbase, the first and only provider of a comprehensive, end-to-end family of NoSQL database products

Slide 21

Slide 21 text

What is Couchbase Server?
•  Couchbase Server
-  Is a “document” database solution
-  Has a key/value-based orientation
-  Is geared for JSON
-  Has no tables and no fixed schema
-  Runs on a networked cluster of nodes
-  Is highly scalable
-  Offers lightning-fast reads and writes
-  Has caching and persistence layers
-  Fails over automatically
•  Couchbase Server is best suited for fast-changing data items of relatively small size

Slide 22

Slide 22 text

JavaScript Object Notation
    {
        "firstName": "Clarence",
        "lastName": "Tauro",
        "age": 25,
        "address": {
            "streetAddress": "21 2nd Street",
            "city": "Bangalore",
            "state": "KA",
            "postalCode": "560059"
        },
        "phoneNumber": [
            {
                "type": "home",
                "number": "988 621-7674"
            }
        ]
    }
JSON is a lightweight data-interchange format that is easy for humans to read and write
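As a quick illustration of how an application consumes such a document, here is a Python sketch that parses the slide's JSON with the standard `json` module, reads nested fields, and serializes it back. The variable names are illustrative.

```python
import json

raw = """
{
    "firstName": "Clarence",
    "lastName": "Tauro",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "Bangalore",
        "state": "KA",
        "postalCode": "560059"
    },
    "phoneNumber": [
        {"type": "home", "number": "988 621-7674"}
    ]
}
"""

person = json.loads(raw)                     # JSON text -> dicts, lists, str, int

city = person["address"]["city"]             # nested object access
home_number = person["phoneNumber"][0]["number"]   # array element access

# Serializing and parsing again yields an equal structure.
round_trip = json.loads(json.dumps(person))
```

JSON objects map to dictionaries and JSON arrays to lists, so no schema or mapping layer is needed to read the document.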

Slide 23

Slide 23 text

What is a Couchbase Document?
Document Content (Most recent in RAM and persisted to disk)
    {
        "visibility": "PRIVATE",
        "name": "Eclectic Summer Mix",
        "userName": "suzyqrocks",
        "type": "org.couchmusic.domain.Playlist",
        "created": 1422138028037,
        "updated": 1422138028072,
        "tracks": []
    }
Document Metadata (All keys unique and kept in RAM)
    {
        "id": "playlist:12345",
        "rev": "1-0004ebc0000000000",
        "flags": 0,
        "expiration": 0,
        "type": "json"
    }

Slide 24

Slide 24 text

Couchbase Server Architecture

Slide 25

Slide 25 text

Couchbase Server Architecture
•  Technology Stack for Data Manager:
-  Couchbase Client SDK (“Smart Client”)
-  Client Query API and Query Engine (Views)
-  Cache Layer: RAM Cache
-  Persistence Layer: Couchbase

Slide 26

Slide 26 text

Couchbase Server Architecture
•  Technology Stack for Cluster Manager:
-  Node Level – multiple vBuckets
   •  Default 1024 vBuckets/number of nodes
-  Cluster Level – multiple nodes (with 1..* buckets)
-  Datacenter Level – multiple clusters (optional XDCR)
-  Erlang (cluster management and process supervision)
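The "1024 vBuckets/number of nodes" line can be made concrete with a short Python sketch: each key hashes to one of 1024 vBuckets, and a cluster map assigns each vBucket to a node. The details here are assumptions for illustration: the CRC32-modulo hash is a simplification, the round-robin assignment is made up, and the node names and function names are hypothetical. A real cluster publishes its own map.

```python
import zlib

NUM_VBUCKETS = 1024   # default from the slide

def vbucket_for(key: str) -> int:
    """Map a document key to a vBucket (simplified: CRC32 modulo the vBucket count)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS

def build_cluster_map(nodes):
    """Assign vBuckets to nodes round-robin (assumed placement, for illustration)."""
    return [nodes[vb % len(nodes)] for vb in range(NUM_VBUCKETS)]

def node_for(key, cluster_map):
    """Pick the node that owns the key's vBucket, without asking any server."""
    return cluster_map[vbucket_for(key)]


nodes = ["node-a", "node-b", "node-c", "node-d"]   # hypothetical node names
cluster_map = build_cluster_map(nodes)

owner = node_for("playlist:12345", cluster_map)
per_node = {n: cluster_map.count(n) for n in nodes}   # 1024 / 4 vBuckets each
```

Because the key-to-vBucket step never changes, adding or removing a node only requires a new map; keys do not need to be rehashed.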

Slide 27

Slide 27 text

Anatomy of a Couchbase Application
[Diagram: Couchbase Client Software holding a {Server List} and a Cluster Map, connected to three nodes that each run NS Server and EP Engine. Labeled steps: 1. REST request (8091); 5. Create, Read, Update and Delete Documents. With the Cluster Map the client becomes a Smart Client.]

Slide 28

Slide 28 text

Single Node – Couchbase Write Operation
[Diagram: Couchbase Server Node. The App Server sends Doc 1 to the Managed Cache (step 2); Doc 1 also appears in the Disk Queue, leading to Disk, and in the Replication Queue, leading to the other node (steps 3).]
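The write path in this diagram can be sketched in Python as a toy node with a managed cache and two queues that are drained later. The class and method names are hypothetical, and the queues are drained by explicit calls rather than background threads. It models only the ordering shown on the slide, not the actual server internals.

```python
from collections import deque

class Node:
    """Toy node: writes land in RAM first, then drain to disk and to a replica."""

    def __init__(self):
        self.managed_cache = {}            # RAM copy served to clients
        self.disk = {}                     # persisted copy
        self.disk_queue = deque()          # writes waiting to be persisted
        self.replication_queue = deque()   # writes waiting to go to another node

    def write(self, key, doc):
        self.managed_cache[key] = doc
        self.disk_queue.append((key, doc))
        self.replication_queue.append((key, doc))

    def flush_to_disk(self):
        while self.disk_queue:
            key, doc = self.disk_queue.popleft()
            self.disk[key] = doc

    def replicate_to(self, other):
        while self.replication_queue:
            key, doc = self.replication_queue.popleft()
            other.managed_cache[key] = doc


node, replica = Node(), Node()

node.write("doc1", {"title": "Doc 1"})
in_ram_only = "doc1" in node.managed_cache and "doc1" not in node.disk

node.flush_to_disk()
node.replicate_to(replica)

# An update follows the same path: RAM changes first, disk catches up later.
node.write("doc1", {"title": "Doc 1'"})
disk_before_flush = node.disk["doc1"]["title"]
node.flush_to_disk()
disk_after_flush = node.disk["doc1"]["title"]
```

The window in which `in_ram_only` is true is the trade-off this design makes: the write is fast because it does not wait for disk or for the replica.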

Slide 29

Slide 29 text

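Single Node – Couchbase Update Operation
[Diagram: Couchbase Server Node. The App Server sends the updated Doc 1’ to the Managed Cache (step 2), replacing Doc 1; Doc 1’ then passes through the Disk Queue to Disk and through the Replication Queue to the other node (steps 3).]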

Slide 30

Slide 30 text

Single Node – Couchbase Read Operation
[Diagram: Couchbase Server Node. The App Server issues GET Doc 1 and Doc 1 is returned from the Managed Cache; the Disk Queue, Disk, and Replication Queue (to other node) are shown but not on the read path.]

Slide 31

Slide 31 text

Single Node – Couchbase Cache Eviction
[Diagram: Couchbase Server Node with App Server, Managed Cache, Disk Queue, Disk, and Replication Queue (to other node). Doc 1 through Doc 6 are persisted on Disk; the Managed Cache holds Doc 2 through Doc 6 as Doc 1 is evicted.]

Slide 32

Slide 32 text

Single Node – Couchbase Cache Miss
[Diagram: Couchbase Server Node with App Server, Managed Cache, Disk Queue, Disk, and Replication Queue (to other node). The App Server issues GET Doc 1; Doc 1 is not in the Managed Cache, so it is fetched from Disk, placed back in the cache, and returned.]
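Eviction and the cache miss that follows it can be sketched together in Python. This toy node keeps every key in RAM but only a limited number of document bodies; reading an evicted document falls back to disk and re-caches it. The least-recently-used policy, the capacity of five, the immediate persistence on write, and all names are assumptions chosen for brevity rather than a description of the server's actual eviction algorithm.

```python
from collections import OrderedDict

class CachingNode:
    """Toy node: all keys stay in RAM, document bodies may be evicted to disk only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()   # key -> document, least recently used first
        self.disk = {}               # every document is persisted here
        self.keys_in_ram = set()     # metadata: known keys are never evicted
        self.cache_misses = 0

    def _cache_put(self, key, doc):
        self.cache[key] = doc
        self.cache.move_to_end(key)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used body

    def write(self, key, doc):
        self.disk[key] = doc         # persisted immediately (simplification)
        self.keys_in_ram.add(key)
        self._cache_put(key, doc)

    def read(self, key):
        if key not in self.keys_in_ram:
            return None              # unknown key: no disk access needed
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key]
        self.cache_misses += 1       # cache miss: fetch from disk and re-cache
        doc = self.disk[key]
        self._cache_put(key, doc)
        return doc


node = CachingNode(capacity=5)
for i in range(1, 7):
    node.write(f"doc{i}", {"n": i})

doc1_evicted = "doc1" not in node.cache   # six documents, room for five bodies
doc1 = node.read("doc1")                  # miss: served from disk
misses_after_first_read = node.cache_misses
node.read("doc1")                         # hit: now back in the cache
misses_after_second_read = node.cache_misses
```

Keeping the key set in RAM means a lookup for a key that does not exist can be answered without touching disk at all.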

Slide 33

Slide 33 text

Other Features of Couchbase 4.0 •  Multi-dimensional Scaling •  N1QL •  XDCR

Slide 34

Slide 34 text

Training
•  Get Started with Couchbase Server 4.0: www.couchbase.com/beta
•  Get Trained on Couchbase: http://training.couchbase.com
-  CD220: Developing Couchbase NoSQL Applications, Oct 20 – Oct 23 2015
-  CS300: Couchbase NoSQL Server Administration, Nov 17 – Nov 20
Enroll Today!

Slide 35

Slide 35 text

Questions?