company: > 4x growth in 2013; widely recognized as industry leader
• Worldwide operations with > 150 employees
  > 75 in engineering; more than doubling every year
  Support offices in US (multiple), UK, India, Japan, China (soon)
• Leader in scalability & high performance
  Easily & reliably scale your app; get consistent low latency & high throughput
• Only NoSQL vendor with mobile database + sync
  Recognized as most innovative NoSQL vendor
• Provide KV & document database for web & mobile apps
  Flexible, schemaless JSON data model; 100% open source
• Mission critical deployments at large enterprises & internet companies
  > 400 customers; > 10,000 paid production nodes deployed; worldwide customer base
Couchbase, Inc. Confidential
Overview
Couchbase offers a full range of data management solutions:
• High Availability Cache
• Key Value
• Document
• Mobile device
NoSQL Database Considerations
• Easy Scalability
  Grow cluster without application changes, without downtime when needed
• Consistent High Performance
  Always awesome experience for your application users
• Always On 24x7x365
  The sun never sets on the Internet; your application needs the database to always serve data
• Flexible Data Model
  Keep developers productive and allow fast and easy addition of new features
Couchbase Server Is The Complete Solution
• ✔ Easy Scalability
  One click scalability and no app changes.
• ✔ Consistent High Performance
  Sub millisecond latency with high throughput for reads and writes.
• ✔ Always On 24x7x365
  Maintenance, upgrades and cluster resizing all online without application downtime.
• ✔ Flexible Data Model
  JSON document model with no fixed schema.
FEATURES
• Auto Sharding: no manual sharding
  Database manages data movement to scale out, not the user
• Database handles propagation of updates to scale across clusters and geos
  Provides disaster recovery / data locality
• Single Node Type
  Hugely simplifies management of clusters
  Easy to scale clusters by adding any number of nodes
PERFORMANCE FEATURES
• Fine Grained Locking
  Allows high concurrency and in turn high throughput via highly granular latches
• Built-in Cache
  No need of separate cache layer
  Database manages actively used data
• Hash Partitioning
  Uniform data distribution
  Uniform load distribution: NO hotspots
• Massive Concurrent Connections
  Support a large number of users needed for interactive apps
FEATURES
• Online administrative operations
  All admin operations online: Compaction, Indexing, Rebalance, Backup & Restore
• HA via Replication, DR via XDCR
  High availability using in-memory replication
  Auto or manual failover
  XDCR for disaster recovery
• Online DB upgrades and maintenance
  Online DB upgrades and HW maintenance
  Optimized swap operation to replace nodes
FEATURES
• Maintains native object representation
  Represent data as objects instead of shredding into rows and columns
  Create indexes on any attribute of the document
• Handles constantly changing data
  Each document can have a different structure
  Easy to change data without database changes and downtime
• Schema-less for structured / un- / semi-structured data
  Data with mixed structure is better managed via JSON in a document DB than in an RDBMS
• JSON on the device
  Developers increasingly prefer NoSQL databases
• JSON on the wire
  No need for data transformation
• JSON in the cloud
  Flexible data model, high performance, easy scalability
[Diagram: Couchbase Lite, Sync Gateway, Couchbase Server]
Top Tier Consumer Electronics Company
#1 mission-critical centralized service: cloud, music, and app services; 800M+ profiles
• Replacement of Oracle Streams
• Evaluated Cassandra and MongoDB
• Chose Couchbase for ability to scale to multiple DCs, performance, and replication latency
• Deployed Couchbase in production since September 2013
• 240 server nodes, 3 datacenters, 200K reads/sec, 20K writes/sec
• DBA core team deemed Couchbase reliable and easy to maintain
device mgmt store
#1 over-the-top communication platform; 200M+ users worldwide
• Wholesale replacement of MongoDB
• Chose Couchbase for scalability and performance
• Deployed Couchbase in production since August
• 70+ server nodes
• Trillions of requests per day with peaks of 120k operations per second, 80/20 reads to writes
• Reference customer
Analytics
• Company
  Uses Big Data to analyze online consumer behavior
• Scalability and Performance Requirements
  62TB of network traffic/day
  11 billion database transactions per day
  6.1m user connections
• Existing Database Infrastructure
  SQL Server and custom code
• Pain
  Scaleout: complex and time-intensive
  Performance: queries take 12 hours
  Hardware cost: not affordable, 1.2TB Fusion I/O card per server
• Couchbase Benefits
  Scalability: simple scaleout in minutes not hours, up to 1.5m operations/sec
  Performance: queries from 12 hours to 5 minutes
  Dramatic cost savings: no requirement for Fusion I/O, cheap commodity server hardware
  Availability: five nines
Targeting & Real-Time Analytics
• Company
  Global leader in online payments
  132m active accounts, 193 markets, 25 currencies
• Scalability and Performance Requirements
  300m to 1bn documents with 3TB to 10TB
  Billions of requests with sub-200ms response times for access to JSON documents
  Read/write mix 50/50 with 5ms latency
• Existing Database Infrastructure
  Multiple tiers: separate caching and durable store
  MySQL, Oracle, Terracotta, Coherence
• Pain
  Real-time access to identity mapping: eBay ID, PayPal ID, Social ID, 3rd Party ID, Email
  Performance: ad needs to be served in 200ms
  Cost: multiple tiers for caching and durability
  Highly available: across large clusters and across data centers
• Couchbase Benefits
  Performance: reduced latency with 5ms access times
  Cost: consolidation of database and cache layers
  Cross data center availability
• Company
  Leading cloud company: allows enterprises to connect in real time with their customers via chat, voice, and content delivery
• Scalability and Performance Requirements
  13TB/month
  20m engagements/month
  1.8bn sessions/month
• Existing Database Infrastructure
  MySQL
• Pain
  Scalability
  Performance: batch analytics and real-time access to customer profiles
  Cross data center replication: 4 data centers
• Couchbase Benefits
  Scalability
  Performance: mixed read/write with very high throughput
  Document store: ease of development
Analytics
• Company
  McGraw-Hill Education Labs: a self-adapting, interactive learning portal
• Application Requirements
  An interactive learning environment that scales to millions of learners
  Serves MHE as well as third party content
  Self-adapts via usage data
• Experimented with other types of database infrastructure
  XML databases
  SQL/MR engines
  In-memory data grids
  Enterprise search servers
• Pain
  None allowed for elastic scaling under spike periods
  Couldn't catalog & deliver content from many sources
  Needed consistent low latency for metadata and stats access
  Needed full-text search support for content discovery
• Couchbase Benefits
  Scalability: simple scaleout to support 11 million users producing 4 petabytes of digital content data
  Persistence
  Low latency: sub millisecond response times
[Diagram: Front end, Middleware, Back-end]
Couchbase Development
Simple and Flexible: Document Based
1. Retrieve the document that represents the user, with all user information aggregated together (Python example shown)
2. Update the document
3. Store the document back to the cluster
No need to shard data at the application level!
Built-in concurrency controls for easily scaling the app to many users!
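The retrieve, update, store cycle above can be sketched in Python. The slide's own example did not survive extraction, so this is not that code: it uses a small in-memory stand-in (`FakeBucket`, hypothetical) instead of the Couchbase SDK. Its `insert`/`get`/`replace` methods, the `CasMismatch` exception, and the `logins` field are all assumptions, chosen only to illustrate how a compare-and-swap (CAS) check can let many app servers update the same document safely.

```python
import itertools


class CasMismatch(Exception):
    """Raised when a document changed since it was read."""


class FakeBucket:
    """Hypothetical in-memory stand-in for a cluster bucket (not the real SDK)."""

    def __init__(self):
        self._docs = {}                      # key -> (cas, document)
        self._next_cas = itertools.count(1)  # increasing version stamps

    def insert(self, key, doc):
        cas = next(self._next_cas)
        self._docs[key] = (cas, dict(doc))
        return cas

    def get(self, key):
        cas, doc = self._docs[key]
        return cas, dict(doc)                # copy, like a freshly decoded JSON doc

    def replace(self, key, doc, cas):
        current_cas, _ = self._docs[key]
        if current_cas != cas:               # someone stored a newer version
            raise CasMismatch(key)
        new_cas = next(self._next_cas)
        self._docs[key] = (new_cas, dict(doc))
        return new_cas


def record_login(bucket, user_key, max_retries=5):
    """Retrieve the user document, update it, and store it back."""
    for _ in range(max_retries):
        cas, user = bucket.get(user_key)                # 1. retrieve
        user["logins"] = user.get("logins", 0) + 1      # 2. update
        try:
            bucket.replace(user_key, user, cas)         # 3. store back
            return user
        except CasMismatch:
            continue                                    # re-read and retry
    raise RuntimeError("too many concurrent updates")


bucket = FakeBucket()
bucket.insert("user::alice", {"name": "Alice", "logins": 0})
updated = record_login(bucket, "user::alice")
```

The application code never decides which server holds `user::alice`; in this sketch, as on the slide, that is left to the data layer.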
Auto Sharding and Cluster Map
[Diagram: Hash function (KEY) maps to logical partitions vB1 to vB6; the Cluster Map assigns logical partitions to physical servers A, B, C. When more scalability is required, add a node: a New Cluster Map is produced.]
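The hash function, logical partitions (vBuckets), and cluster map named on this slide can be illustrated with a minimal Python sketch. It makes several assumptions: the six partitions mirror vB1 to vB6 in the diagram, plain CRC32 modulo the partition count stands in for the hash function, and partitions are assigned to servers round-robin, which is a simplification rather than the server's actual rebalancing strategy. All function names are hypothetical.

```python
import zlib

NUM_VBUCKETS = 6  # mirrors vB1..vB6 on the slide; an assumption for illustration


def vbucket_for(key: str) -> int:
    """Hash a key to a logical partition; depends only on the key."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_cluster_map(servers):
    """Assign each vBucket to a server round-robin (a simplification)."""
    return {vb: servers[vb % len(servers)] for vb in range(NUM_VBUCKETS)}


def server_for(key, cluster_map):
    """Two-step lookup: key -> vBucket -> physical server."""
    return cluster_map[vbucket_for(key)]


old_map = build_cluster_map(["A", "B", "C"])
new_map = build_cluster_map(["A", "B", "C", "D"])   # "add node"

# vBuckets whose owner changed are the ones whose data has to move.
moved_vbuckets = [vb for vb in range(NUM_VBUCKETS) if old_map[vb] != new_map[vb]]
```

Because the key-to-partition step never changes, adding a node only changes the map from partitions to servers; clients pick up the new cluster map and the application code stays the same.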
Single node: Couchbase Write Operation
[Diagram: the App Server sends Doc 1 to a Couchbase Server node; labeled components are Managed Cache, Disk Queue, Disk, and Replication Queue (to other node)]
Single node: Couchbase Update Operation
[Diagram: the App Server sends an updated Doc 1 to a Couchbase Server node; labeled components are Managed Cache, Disk Queue, Disk, and Replication Queue (to other node)]
Single node: Couchbase Read Operation
[Diagram: the App Server issues Get Doc 1 to a Couchbase Server node; labeled components are Managed Cache, Disk Queue, Disk, and Replication Queue (to other node)]
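The components labeled on the three single-node slides (managed cache, disk queue, replication queue, disk) can be tied together in a toy Python model. This is a sketch under stated assumptions, not Couchbase internals: the class `NodeSketch` and its methods are hypothetical, queues are drained on demand rather than by background threads, and the replica simply stores the document in its own cache.

```python
from collections import deque


class NodeSketch:
    """Toy model of the components named on the slides (hypothetical)."""

    def __init__(self):
        self.managed_cache = {}
        self.disk_queue = deque()
        self.replication_queue = deque()
        self.disk = {}

    def write(self, key, doc):
        """Writes and updates: land in the cache, then queue the slower work."""
        self.managed_cache[key] = doc
        self.disk_queue.append((key, doc))
        self.replication_queue.append((key, doc))

    def read(self, key):
        """Serve from the managed cache; fall back to disk on a miss."""
        if key in self.managed_cache:
            return self.managed_cache[key]
        doc = self.disk[key]
        self.managed_cache[key] = doc
        return doc

    def flush_to_disk(self):
        """Drain the disk queue in order, so the last write wins."""
        while self.disk_queue:
            key, doc = self.disk_queue.popleft()
            self.disk[key] = doc

    def replicate_to(self, other):
        """Drain the replication queue to another node."""
        while self.replication_queue:
            key, doc = self.replication_queue.popleft()
            other.managed_cache[key] = doc


node, replica = NodeSketch(), NodeSketch()
node.write("doc1", {"v": 1})
node.write("doc1", {"v": 2})           # update replaces the cached copy
value_before_flush = node.read("doc1") # served from memory
node.flush_to_disk()
node.replicate_to(replica)
```

The point of the model is that reads and writes touch memory first, while persistence and replication are decoupled through queues.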
XDCR: Cross Data Center Replication
[Map: US data center, Europe data center, Asia data center]
Image: http://blog.groosy.com/wp-content/uploads/2011/10/internet-map.jpg
Cross Data Center Replication (XDCR)
[Diagram: two Couchbase Server clusters, one in the NYC data center and one in the SF data center; each cluster has three active servers (Active Server 1, 2, 3), each holding documents in RAM and on disk]
Indexing and Querying Features
• Index and Query
  Distributed indexing and querying
  Secondary indexes of JSON document content
  Flexible querying of indexes
• Incremental Map-Reduce
  Distributed simple real-time analytics
  Only considers changes due to updated data
• Full Text Search
  Robust integration with ElasticSearch cluster
  Flexible full text search and faceted search
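The "only considers changes due to updated data" property of incremental map-reduce can be illustrated with a small Python sketch. It is not Couchbase's view engine: `map_doc`, `IncrementalCount`, and the `city` field are hypothetical, and the reduce step is a simple sum. The idea shown is that when a document changes, its previous emissions are retracted and its new ones applied, so untouched documents are never re-mapped.

```python
from collections import Counter


def map_doc(doc):
    """Map step: emit (key, value) pairs; here one count per city."""
    if "city" in doc:
        yield doc["city"], 1


class IncrementalCount:
    """Keeps a reduced result current by reprocessing only changed documents."""

    def __init__(self):
        self.emitted = {}        # doc_id -> rows from that document's last map run
        self.totals = Counter()  # reduced result: key -> sum of values

    def upsert(self, doc_id, doc):
        for key, value in self.emitted.get(doc_id, []):  # retract old contribution
            self.totals[key] -= value
        rows = list(map_doc(doc))
        for key, value in rows:                          # apply new contribution
            self.totals[key] += value
        self.emitted[doc_id] = rows


view = IncrementalCount()
view.upsert("u1", {"city": "NYC"})
view.upsert("u2", {"city": "SF"})
view.upsert("u1", {"city": "SF"})   # only u1 is re-mapped; u2 is untouched
result = {k: v for k, v in view.totals.items() if v}
```

The cost of an update is proportional to the rows emitted by the changed document, not to the size of the data set.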
Couchbase: The Complete NoSQL Solution
• Easy Scalability
  Grow cluster without application changes, without downtime when needed
• Consistent High Performance
  Always awesome experience for your application users
• Always On 24x7x365
  The sun never sets on the Internet; your application needs the database to always serve data
• Flexible Data Model
  Keep developers productive and allow fast and easy addition of new features
Consistent High Performance
• Consistent, predictable sub millisecond latency
  Apps need fast, predictable access to data; it's not good enough to be fast some of the time
• Consistent, predictable throughput
  Throughput capacity of your data layer should be independent of the mix of reads and writes
• Linear throughput scalability
  Double the number of servers, get twice the maximum throughput and double the data capacity
YCSB Benchmark Details
• Web application simulation
  Simulates changing set of users using the app/accessing data
  Document size of 1.5-2K with 15 million active documents
• Data doesn't entirely fit into RAM
  Workload simulating realistic mix of reads and writes (60/40)
• System details
  4 node cluster with separate node to run client workload
  AWS Extra Large instances with striped EBS
  1 replica setup (for MongoDB: no write concern, no journaling)
  Each test run 3 times with varying throughputs, with 95th percentile latency measured
• YCSB test workload source code
  https://github.com/Altoros/YCSB
Couchbase Server vs. MongoDB
• Easy Scalability
  Couchbase: ✔ With 1-click, horizontally grow cluster, even scale across datacenters
  MongoDB: ✖ Complex multi-step scaling, no write scaling across data centers
• Consistent, High Performance
  Couchbase: ✔ Consistent sub millisecond reads/writes; consistent high throughput
  MongoDB: ✖ High & inconsistent latency; lower throughput
• Flexible Data Model
  Couchbase: ✔ Schemaless data model for rapid development
  MongoDB: ✔ Schemaless data model for rapid development
• Always On 24x7x365
  Couchbase: ✔ No downtime for software upgrades, hardware maintenance, etc.
  MongoDB: ✖ Difficult online upgrade; not all maintenance is online
Consistent Lower Latencies and Higher Throughput
• Couchbase
  High read and write throughput and consistent low latencies
  Write performance advantage due to the fine granularity of the Couchbase memory locking mechanism and minimal contention
  Read operations happen concurrently with, and independently of, writes
• MongoDB
  Severely limited write throughput due to very coarse write locks that limit concurrency at the node level
  Inconsistent latencies and throughput, as all reads need to wait for a write to finish
• Cost Considerations
Couchbase Server vs. Cassandra
• Easy Scalability
  Couchbase: ✔ With 1-click, horizontally grow cluster, even scale across datacenters
  Cassandra: ✖ Complex multi-step scaling, coarse grain growth recommended
• Consistent, High Performance
  Couchbase: ✔ Consistent sub-millisecond reads/writes and high throughput
  Cassandra: ✖ High and inconsistent latency; medium throughput
• Flexible Data Model
  Couchbase: ✔ Schemaless data model for rapid development
  Cassandra: ✖ Very complex columnar data model
• Always On 24x7x365
  Couchbase: ✔ No downtime for software upgrades, hardware maintenance, etc.
  Cassandra: ✔ Online upgrades and online maintenance
Challenges with a Memcached Tier
• Problem: Cold Cache
  Symptoms: Slowdown or collapse of the data service layer due to a heavily overloaded RDBMS when memcached nodes go down (on failure or for maintenance)
  Couchbase Solution: Data is automatically replicated across the Couchbase cluster, providing high availability of data even on failures
• Problem: Heavy RDBMS Contention
  Symptoms: Multiple requests for data items that do not exist in the cache result in a sudden shifting of load to the relational database, causing heavy contention
  Couchbase Solution: By replicating data across the cluster, Couchbase Server provides consistent performance without shifting load to the RDBMS layer
• Problem: Lack of Scalability
  Symptoms: Adding or removing memcached nodes is complicated and causes unpredictable application performance degradation
  Couchbase Solution: Auto-sharding and online rebalancing in Couchbase Server provide easy, non-disruptive expansion of the cluster
• Problem: Complex Monitoring
  Symptoms: Management of individual memcached nodes increases the complexity of operations and lacks a single consistent view of the caching layer
  Couchbase Solution: Couchbase Server provides a built-in admin console for cluster-wide management and monitoring, as well as RESTful APIs for easy automation and third-party integration
Memcached Tier Replacement: How it Works
• Fully memcached protocol compatible
• Easy to replace a tier of individual memcached servers with a Couchbase Server cluster
• The cluster receives reads and writes, keeps frequently accessed items in memory, and persists, shards, and replicates the data amongst the cluster
• Reads and writes are still as low latency and high throughput as memcached
• User gets all the scalability and high-availability advantages of a Couchbase Server cluster