• NoSQL does not mean "No SQL"; rather, it means "Not Only SQL".
• It is also not an RDBMS replacement.
• CAP [Consistency, Availability, Partition Tolerance] Theorem
• BASE [Basically Available, Soft state, Eventual consistency] vs. ACID
Characteristics of a NoSQL Database
• Flexible schema / schemaless
• Non-relational
• Often distributed (partitioned)
• Often replicated
• Horizontally scalable
• Eventually consistent
• Cheaper than big-name RDBMS products
• Simpler API than SQL (but not standardized across products or even versions)
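To make "eventually consistent" concrete, here is a minimal sketch of two replicas that converge after a sync pass. The `Replica` class, the `sync` function, and the integer version numbers are hypothetical names chosen for illustration; real systems use more elaborate mechanisms (vector clocks, read repair, gossip), so treat this as a toy model only.

```python
class Replica:
    """One node holding its own copy of the data (hypothetical toy class)."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        # Keep the write only if it is newer than what this replica holds.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries in both directions so the replicas converge."""
    for src, dst in ((a, b), (b, a)):
        for key, (version, value) in list(src.data.items()):
            dst.write(key, value, version)


r1, r2 = Replica("r1"), Replica("r2")
r1.write("cart:42", ["book"], version=1)

stale = r2.read("cart:42")   # r2 has not seen the write yet
sync(r1, r2)
fresh = r2.read("cart:42")   # after the sync both replicas agree
```

Between the write and the sync, a client reading from `r2` sees no value at all. That window is the price paid for not coordinating every write across all replicas.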
?
• When the traditional RDBMS model is too restrictive (a flexible schema is needed)
• When ACID support is not "really" needed
• Object-to-Relational (O/R) impedance mismatch
• Because an RDBMS is neither distributed nor scalable by nature
• Logging data from distributed sources
• Storing events / temporal data
• Temporary data (shopping carts / wish lists / session data)
• Data which requires a flexible schema
• Polyglot persistence, i.e. choosing the best data store for the nature of the data
WHEN NOT ?
• Financial data
• Data requiring strict ACID compliance
• Business-critical data
• Highly parallel
• Linearly scalable
• Super fast reads and writes
• Distributed and replicated storage
• Cheap ($$$)
Cons
• No standards
• Requires a paradigm shift
• Poor SQL support
• Normally not ACID compliant
• Eventually consistent
can be in memory only, or backed by disk persistence.
• supports versioning
• e.g. Voldemort (LinkedIn), Amazon SimpleDB, Memcached, BerkeleyDB, Oracle NoSQL
Document
• similar to KV, except that the value is a document.
• documents are JSON/BSON-encoded data.
• e.g. Couchbase, MongoDB, RavenDB, ArangoDB, MarkLogic, OrientDB, Redis, RethinkDB
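The key-value and document ideas above can be sketched in a few lines. `VersionedKVStore` is a hypothetical in-memory class, not the API of any product listed; it keeps every value ever written under a key, and the document case is shown simply by storing JSON text as the value.

```python
import json


class VersionedKVStore:
    """Toy in-memory key-value store that keeps every version of a value."""

    def __init__(self):
        self._versions = {}  # key -> list of values; version n is at index n - 1

    def put(self, key, value):
        history = self._versions.setdefault(key, [])
        history.append(value)
        return len(history)  # version number of the value just written

    def get(self, key, version=None):
        history = self._versions.get(key)
        if not history:
            return None
        if version is None:
            return history[-1]  # latest version
        if 1 <= version <= len(history):
            return history[version - 1]
        return None


store = VersionedKVStore()
v1 = store.put("session:abc", "opaque-bytes-1")
v2 = store.put("session:abc", "opaque-bytes-2")

# Document-style usage: the value is JSON text the application can look inside.
store.put("user:1", json.dumps({"name": "Ada", "tags": ["admin"]}))
loaded = json.loads(store.get("user:1"))
```

A plain key-value store treats the value as opaque; a document store additionally understands the structure of the value, which is what allows it to index and query on fields such as `tags`.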
columns (values) per key.
• e.g. Cassandra, HBase, Amazon Redshift, HP Vertica, Teradata
Graph
• For modeling the structure of data
• Uses the Property Graph data model (nodes, relationships, properties)
• e.g. Neo4j, InfiniteGraph, OrientDB, Titan GraphDB
Other Types / Special Purpose
• Search DBs: Solr, Elasticsearch
• Object databases
• XML databases
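The property graph model named under Graph (nodes, relationships, properties) can be illustrated with a small sketch. `PropertyGraph` and its methods are hypothetical names for this example and do not correspond to the API of Neo4j or any other product above.

```python
class PropertyGraph:
    """Toy property graph: nodes and relationships both carry properties."""

    def __init__(self):
        self.nodes = {}          # node id -> properties dict
        self.relationships = []  # (source id, type, target id, properties)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_relationship(self, source, rel_type, target, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.relationships.append((source, rel_type, target, properties))

    def neighbors(self, source, rel_type):
        # Follow outgoing relationships of one type from the given node.
        return [t for s, r, t, _ in self.relationships
                if s == source and r == rel_type]


graph = PropertyGraph()
graph.add_node("alice", label="Person", city="Pune")
graph.add_node("bob", label="Person")
graph.add_node("carol", label="Person")
graph.add_relationship("alice", "FOLLOWS", "bob", since=2014)
graph.add_relationship("bob", "FOLLOWS", "carol", since=2015)

# Two-hop traversal: who do the people Alice follows follow?
two_hops = [far
            for near in graph.neighbors("alice", "FOLLOWS")
            for far in graph.neighbors(near, "FOLLOWS")]
```

Traversals like the two-hop query are where graph stores tend to be most convenient, since the same question in a relational model needs a self-join per hop.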
Embeddable in applications (C/C++/Java)
• Supports transactions and replication
• Maintained and licensed by Oracle Corp.
• Used as a backing store for many applications.
• http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html
Structure Server
• Supports storing strings, hashes, lists, sets, sorted sets, bitmaps and HyperLogLogs.
• Data is kept in memory
• Extremely popular for short-lived data (sessions, caches)
• Can be used as a push/pull message queue
• http://redis.io/documentation
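The push/pull queue usage can be sketched without a running server. The class below is a hypothetical in-memory stand-in whose two methods mirror the semantics of the Redis list commands `LPUSH` (push at the head) and `RPOP` (pop from the tail); it is not a Redis client, and the key name is an assumption.

```python
from collections import deque


class ListQueue:
    """In-memory stand-in that mimics LPUSH / RPOP on named lists."""

    def __init__(self):
        self._lists = {}  # key -> deque of values

    def lpush(self, key, value):
        # Push at the head of the list and return the new length.
        items = self._lists.setdefault(key, deque())
        items.appendleft(value)
        return len(items)

    def rpop(self, key):
        # Pop from the tail; None when the list is empty or missing.
        items = self._lists.get(key)
        if not items:
            return None
        return items.pop()


queue = ListQueue()
queue.lpush("jobs", "resize-image-1")   # producer side
queue.lpush("jobs", "resize-image-2")

first = queue.rpop("jobs")              # consumer side
second = queue.rpop("jobs")
empty = queue.rpop("jobs")
```

Pushing on one end and popping from the other gives first-in, first-out ordering, which is why a list works as a simple work queue.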
is stored on disk but cached in memory for speed
• Supports replication and partitioning (sharding)
• Very popular in web applications
• Data is stored internally as BSON and exchanged with applications as JSON.
• Very easy to set up and get started with.
• Not open-source, but free to use (even commercially), with an optional support license.
• http://docs.mongodb.org/
opposed to row-wise
• Supports partitioning (sharding) and replication, even across data centers.
• Can store petabytes of data or more.
• Supports CQL, an SQL-like query interface.
• Open-source, with commercial support from DataStax.
• https://cassandra.apache.org/
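The row-wise versus column-wise distinction can be shown with plain Python data. This is only a toy picture of the two layouts, with made-up sample records; it does not describe the on-disk format of Cassandra or any other product, and it assumes every row has the same fields.

```python
# Row-wise layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Pune", "amount": 120},
    {"id": 2, "city": "Delhi", "amount": 80},
    {"id": 3, "city": "Pune", "amount": 50},
]


def to_columns(records):
    """Pivot row-wise records into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-wise layout: all values of one field sit next to each other.
columns = to_columns(rows)

# An aggregate over one field touches a single list instead of every record.
total = sum(columns["amount"])
```

Keeping one field's values together is what makes scans and aggregates over a few columns cheap, at the cost of more work when a whole record has to be reassembled.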
the Apache Lucene full text search library.
• NoSQL data store for structured/unstructured data
• Open source, with commercial support from http://elastic.co/, and part of the ELK (Elasticsearch-Logstash-Kibana) product stack.
• Full-text search as well as structured query support.
• All interaction is via REST APIs (API bindings are available for all major languages)
• Supports fault-tolerant operation and automatic failover, as well as data replication out of the box.
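Full-text search engines of this kind are built around an inverted index, which maps each term to the documents containing it. The sketch below is a deliberately simplified illustration with hypothetical function names and sample documents; it omits everything a real engine adds, such as stemming, relevance scoring, and distribution across nodes.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "NoSQL stores scale horizontally",
    2: "Relational stores favour ACID",
    3: "NoSQL and ACID can coexist",
}
index = build_index(docs)

both_terms = search(index, "nosql stores")
one_term = search(index, "acid")
no_match = search(index, "graph")
```

Lookups go from term to documents, so query cost depends on how many documents match the terms rather than on the total size of the collection.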