A Comparative Analysis of Different NoSQL Databases on Data Model, Query Model and Replication Model

International Conference on Emerging Research in Computing, Information, Communication and
Applications, ERCICA 2013

A COMPARATIVE ANALYSIS OF DIFFERENT NOSQL DATABASES ON DATA MODEL, QUERY MODEL AND REPLICATION MODEL
By
BASAWANTH RAO PRASHANTH K R
Centre for Research, Christ University, Hosur Road, Bangalore
&
CLARENCE J M TAURO
Centre for Research, Christ University, Hosur Road, Bangalore

Objectives
• Introduction to NoSQL
• Need for the study
• Are ACID Properties always desirable?
• Basically available, Soft state, Eventually consistent (BASE)
• The CAP Theorem
• Motives of NoSQL Practitioner
• Aim of the study
• Validation procedure
• Findings
• Conclusion

Introduction
• RDBMS: the predominant technology for storing structured data in web and business applications
• The “one size fits all” thinking concerning data stores has been questioned
• Apply NoSQL databases for the persistence layer of a collaborative web application

Need for the Study
• What is the problem with traditional databases?

Need for the Study - ACID Properties
• ATOMICITY: All or nothing
• CONSISTENCY: Any transaction takes the database from one consistent state to another, with no broken constraints (referential integrity)
• ISOLATION: Other operations cannot access data modified by a transaction that has not yet completed
• DURABILITY: Ability to recover committed transaction updates after any kind of system failure
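The "all or nothing" behaviour of atomicity can be sketched with Python's built-in `sqlite3` module. This is only an illustration and is not taken from the study: the `accounts` table, the `transfer` helper and the `balances` helper are hypothetical names introduced here. The credit is applied first so that a failing debit has something to roll back.

```python
import sqlite3

# Hypothetical schema: the CHECK constraint makes an overdraft fail.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; both updates apply or neither does."""
    try:
        # The connection context manager commits on success and rolls
        # back the whole transaction if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone.
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

Calling `transfer(conn, "alice", "bob", 500)` fails on the debit, and `balances(conn)` afterwards shows that the earlier credit to `bob` was rolled back as well.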

Are ACID Properties always desirable?
• … But what about:
– Latency
– Partition Tolerance
– High Availability

Basically Available, Soft State, Eventually Consistent (BASE)
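A toy model may help make "eventually consistent" concrete. The sketch below is an assumption-laden illustration, not how any of the examined databases actually replicates: the class names `Replica` and `EventuallyConsistentStore` are hypothetical, and conflict resolution, ordering and failures are deliberately ignored. A write is applied to one replica immediately and only reaches the others when `sync` runs.

```python
from collections import deque


class Replica:
    """One copy of the data set."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Toy model: a write hits one replica now and the others later."""

    def __init__(self, n_replicas=3):
        self.replicas = [Replica() for _ in range(n_replicas)]
        # Updates not yet delivered: (replica_index, key, value).
        self.pending = deque()

    def write(self, key, value, at=0):
        # The write is acknowledged after a single replica has it.
        self.replicas[at].data[key] = value
        for i in range(len(self.replicas)):
            if i != at:
                self.pending.append((i, key, value))

    def read(self, key, at=0):
        # A read may return stale data if this replica is behind.
        return self.replicas[at].data.get(key)

    def sync(self):
        # Deliver all queued updates; afterwards the replicas agree.
        while self.pending:
            i, key, value = self.pending.popleft()
            self.replicas[i].data[key] = value
```

Between `write` and `sync`, a read from another replica can miss the new value (the "soft state"); after `sync`, all replicas return the same value.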

The CAP Theorem

Choose Any TWO

Motives of NoSQL Practitioner
• Avoidance of Unneeded Complexity
• High Throughput
• Horizontal Scalability and Running on Commodity Hardware
• Avoidance of Expensive Object-Relational Mapping
• Complexity and Cost of Setting up Database Clusters
• Compromising Reliability for Better Performance
• The Current “One Size Fits All” Database Thinking Was and Is Wrong

Aim of the Study
“To study and apply the available systems of non-relational databases to persist objects in order to obtain a more specific knowledge about the broad range of existing technologies”

Operational Definitions Used
• NoSQL
• Object Persistence
• The CAP Theorem
• Multi-Version Concurrency Control (MVCC)

Objectives of the Study [Other]
• Analyze various non-relational databases
• Analyze a few selected databases, benchmark them, and categorize their performance
• Develop a framework to assist the creation and execution of the benchmarks
• Develop prototypes that make use of NoSQL technologies
• Apply the developed prototypes to compare the NoSQL databases with traditional relational database solutions

Literature Survey
• Survey on various NoSQL databases and their capabilities
• Analysis of data model, query model, replication model and consistency model
• Persistence Layer, Model Driven Development, Object Notations (not discussed in the presentation)

Factors/Variables of the Study
• The following are the factors considered for the study of various NoSQL Databases
– Data Model
– Query Model
– Replication Model
– Consistency Model
– Sharding

Factors of the Study for Benchmarking
• The following are the factors considered while benchmarking NoSQL Databases
– Raw Performance
– Scalability
– Elasticity
– Read/Write Operations
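Raw performance figures such as total time, average time per operation and the five slowest operations can be collected with a small timing harness. The following is a minimal sketch under stated assumptions, not the framework developed in the study: `time_operations` and `summarize` are hypothetical helper names, and `op` stands for any callable that performs one database operation.

```python
import time
from statistics import mean


def time_operations(op, inputs):
    """Run op(item) for each item and return per-call durations in ms."""
    durations_ms = []
    for item in inputs:
        start = time.perf_counter()
        op(item)
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


def summarize(durations_ms, top=5):
    """Reduce raw durations to total, average and the `top` slowest calls."""
    return {
        "total_ms": sum(durations_ms),
        "avg_ms": mean(durations_ms),
        "slowest_ms": sorted(durations_ms, reverse=True)[:top],
    }
```

For example, `summarize(time_operations(write_one, objects))` would report these three figures for a hypothetical `write_one` function that stores a single object.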

NoSQL Databases Used for the Study

Validation Procedure
• Comparison of the sorting capabilities of the examined NoSQL databases
• Comparison of the range querying capabilities of the examined NoSQL databases
• Comparison of the aggregation functionalities
• Comparison of the durability properties
• The performance of the MongoDB store is compared against the stores for MySQL and the in-memory version of HSQL

CASE STUDY

Findings - 1 (MySQL / HSQL / MongoDB)
• Total time spent for DB operation: MySQL 30 min; HSQL 51 s; MongoDB 41 s
• Slowest operation with avg. time:
– MySQL: Writing one object into wiki page in 202 ms
– HSQL: Getting one object from job in 3 ms
– MongoDB: Getting into an object from a job in 11 ms
• Avg. time for writing one object into wiki page: MySQL 102 ms; HSQL 2 ms; MongoDB 1 ms
• Avg. time for getting one object from the wiki page: MySQL 2 ms; HSQL 1 ms; MongoDB 2 ms
• Avg. time for slowest count operation: MySQL 2 ms; HSQL 1 ms; MongoDB 3 ms
• Avg. time of the five slowest queries in milliseconds:
– MySQL: 13, 5, 5, 4, 2
– HSQL: 3, 3, 1, 1, 1
– MongoDB: 2, 1, 1, 1, 1

Findings - 2 (MySQL / HSQL / MongoDB)
• Time spent for DB operations: MySQL 140.625M; HSQL 96.458333M; MongoDB 39.5M
• Average times of the five slowest operations in seconds:
– MySQL: 2.85, 2.225, 1.9375, 1.9075, 1.445
– HSQL: 3.6525, 1.9425, 1.620, 1.3675, 1.7522
– MongoDB: 3.265, 0.52, 0.355, 0.3475, 0.2725
• The five greatest times in minutes:
– MySQL: 47.375, 20.5833, 16.8333, 12.375, 5.6666
– HSQL: 40.4166, 11.25, 8.70833, 7.75, 5.0416
– MongoDB: 18.0416, 6.25, 4.375, 3.75, 2.3333

Conclusion
• The knowledge acquired and developed by this study can contribute to the development of a systematic approach for solving problems of data persistence with an alternative non-relational database
• The careful examination of NoSQL databases and their application creates a common set of design patterns that may be reused when modeling data and designing a database

Future Work
• This project focused mainly on the most used NoSQL databases.
– Still, a large number of NoSQL databases exist, each building on different aspects that could also be studied
• The benchmarks could also be extended to include different workload scenarios
– by varying the percentage of read and write operations, and by using distributions other than the uniform distribution for selecting objects
• Repeat the benchmarks using an infrastructure similar to that used in a production environment
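The workload variations suggested above (a different read/write mix and a non-uniform choice of objects) could be generated along the following lines. This is a hedged sketch, not part of the study's benchmark framework: `make_workload` and its parameters are hypothetical, and the "zipf" option uses a simple 1/rank weighting as one possible skewed distribution.

```python
import random


def make_workload(n_ops, n_keys, read_fraction=0.5,
                  distribution="uniform", seed=0):
    """Return a list of ("read" | "write", key) operations.

    read_fraction sets the share of reads; distribution selects how
    keys are drawn: "uniform", or "zipf" for a skew towards low keys.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    keys = list(range(n_keys))
    if distribution == "uniform":
        weights = [1.0] * n_keys
    elif distribution == "zipf":
        # Assumed skew: the key at rank r is drawn with weight 1 / (r + 1).
        weights = [1.0 / (rank + 1) for rank in range(n_keys)]
    else:
        raise ValueError(f"unknown distribution: {distribution!r}")
    chosen = rng.choices(keys, weights=weights, k=n_ops)
    return [
        ("read" if rng.random() < read_fraction else "write", key)
        for key in chosen
    ]
```

A read-heavy, skewed scenario would then be `make_workload(10000, 100, read_fraction=0.95, distribution="zipf")`, and each generated operation can be replayed against the store under test.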

Questions?


Clarence J M Tauro (Couchbase)
