Introduction to NoSQL Databases

A brief introduction to NoSQL databases: what they are, how they differ from traditional databases, and the different types of NoSQL databases.

Bhaskar V. Karambelkar

June 28, 2015

Transcript

  1. What is NoSQL?
     • NoSQL is not a standard.
     • NoSQL does not mean "No SQL", but rather "Not Only SQL".
     • It is also not an RDBMS replacement.
     • CAP theorem [Consistency, Availability, Partition tolerance]
     • BASE [Basically Available, Soft state, Eventual consistency] vs. ACID
     Characteristics of a NoSQL Database
     • Flexible schema / schemaless
     • Non-relational
     • Often distributed (partitioned)
     • Often replicated
     • Horizontally scalable
     • Eventually consistent
     • Cheaper than big-name RDBMS products
     • Simpler API than SQL (but not standard across products or even versions)
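To make "eventually consistent" and "replicated" concrete, here is a minimal Python sketch. It is a toy model, not how any particular product works: the class names `Replica` and `EventuallyConsistentStore` and the explicit `sync()` step (standing in for background replication) are assumptions made for illustration.

```python
from collections import deque


class Replica:
    """One copy of the data (hypothetical helper for this sketch)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class EventuallyConsistentStore:
    """Writes land on the first replica at once and reach the others later."""

    def __init__(self, replica_names):
        self.replicas = [Replica(n) for n in replica_names]
        self.pending = deque()  # updates not yet delivered to other replicas

    def write(self, key, value):
        primary, *others = self.replicas
        primary.apply(key, value)
        for replica in others:
            self.pending.append((replica, key, value))

    def read(self, replica_index, key):
        return self.replicas[replica_index].data.get(key)

    def sync(self):
        """Deliver all queued updates (stands in for background replication)."""
        while self.pending:
            replica, key, value = self.pending.popleft()
            replica.apply(key, value)


store = EventuallyConsistentStore(["a", "b"])
store.write("cart:42", ["book"])
stale = store.read(1, "cart:42")   # replica "b" has not seen the write yet
store.sync()
fresh = store.read(1, "cart:42")   # after replication the replicas agree
```

Between `write` and `sync` a reader of the second replica sees old data; an ACID system would not allow that window, which is the trade-off BASE accepts.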
  2. When and when not to use it?
     WHEN / WHY?
     • When the traditional RDBMS model is too restrictive (flexible schema)
     • When ACID support is not "really" needed
     • Object-relational (O/R) impedance mismatch
     • Because an RDBMS is neither distributed nor scalable by nature
     • Logging data from distributed sources
     • Storing events / temporal data
     • Temporary data (shopping carts / wish lists / session data)
     • Data which requires a flexible schema
     • Polyglot persistence, i.e. choosing the best data store for the nature of the data
     WHEN NOT?
     • Financial data
     • Data requiring strict ACID compliance
     • Business-critical data
  3. Compared to RDBMS
     Pros
     • Flexible schema
     • Simple API
     • Highly parallel
     • Linearly scalable
     • Super fast reads and writes
     • Distributed and replicated storage
     • Cheap ($$$)
     Cons
     • No standards
     • Requires a paradigm shift
     • Poor SQL support
     • Normally not ACID compliant
     • Eventually consistent
  4. Different types of NoSQL Databases
     Key/value
     • Key/value store
     • Can be in-memory only, or backed by disk persistence
     • Supports versioning (in some products)
     • e.g. Voldemort (LinkedIn), Amazon SimpleDB, Memcached, Berkeley DB, Oracle NoSQL, Redis
     Document
     • Similar to key/value, except that the value is a document
     • Documents are JSON/BSON-encoded data
     • e.g. Couchbase, MongoDB, RavenDB, ArangoDB, MarkLogic, OrientDB, RethinkDB
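The key/value and document ideas can be sketched in a few lines of Python. This is an in-memory toy, not the API of any product named above: `VersionedKVStore`, `put`, and `get` are hypothetical names, and versioning here simply means keeping every value ever written for a key.

```python
import json


class VersionedKVStore:
    """In-memory key/value store that keeps every version of each value."""

    def __init__(self):
        self._versions = {}  # key -> list of values, oldest first

    def put(self, key, value):
        history = self._versions.setdefault(key, [])
        history.append(value)
        return len(history)  # version number, starting at 1

    def get(self, key, version=None):
        history = self._versions[key]  # raises KeyError for unknown keys
        if version is None:
            return history[-1]         # latest version
        return history[version - 1]


kv = VersionedKVStore()
kv.put("session:1", "guest")
kv.put("session:1", "alice")
latest = kv.get("session:1")
first = kv.get("session:1", version=1)

# A document store works the same way, except the value is a structured
# document (JSON here) rather than an opaque string.
kv.put("user:1", json.dumps({"name": "Alice", "tags": ["admin"]}))
doc = json.loads(kv.get("user:1"))
```

The store never looks inside a plain value; a document database differs mainly in that it understands the document's fields and can query on them.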
  5. Different types of NoSQL Databases (cont.)
     Column
     • Multiple columns (values) per key
     • e.g. Cassandra, HBase, Amazon Redshift, HP Vertica, Teradata
     Graph
     • For modeling the structure of data
     • Uses the property graph data model (nodes, relationships, properties)
     • e.g. Neo4j, InfiniteGraph, OrientDB, Titan GraphDB
     Other types / special purpose
     • Search databases: Solr, Elasticsearch
     • Object databases
     • XML databases
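The property graph data model (nodes, relationships, properties) can be illustrated with a small Python sketch. The `PropertyGraph` class and its methods are hypothetical and written for this example; real graph databases expose their own query languages and APIs.

```python
class PropertyGraph:
    """Toy property graph: nodes and relationships both carry properties."""

    def __init__(self):
        self.nodes = {}           # node id -> properties dict
        self.relationships = []   # (start id, type, end id, properties)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def relate(self, start, rel_type, end, **properties):
        self.relationships.append((start, rel_type, end, properties))

    def neighbours(self, node_id, rel_type):
        """Follow outgoing relationships of one type from a node."""
        return [end for start, kind, end, _ in self.relationships
                if start == node_id and kind == rel_type]


g = PropertyGraph()
g.add_node("alice", label="Person", age=30)
g.add_node("bob", label="Person")
g.add_node("neo4j", label="Database")
g.relate("alice", "KNOWS", "bob", since=2012)
g.relate("bob", "USES", "neo4j")

# Two-hop traversal: which databases do the people Alice knows use?
tools = [tool
         for friend in g.neighbours("alice", "KNOWS")
         for tool in g.neighbours(friend, "USES")]
```

Traversals like the two-hop query at the end are what graph databases are built to do quickly; in a relational schema the same question needs a join per hop.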
  6. Berkeley DB
     KV Database
     • Perhaps the oldest NoSQL database
     • Embeddable in applications (C/C++/Java)
     • Supports transactions and replication
     • Maintained and licensed by Oracle Corp.
     • Used as a backing store for many applications
     • http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html
  7. Redis
     Key/Value Database
     • Often referred to as a data structure server
     • Supports storing strings, hashes, lists, sets, sorted sets, bitmaps and HyperLogLogs
     • Data is kept in memory (with optional persistence to disk)
     • Extremely popular for short-lived data (sessions, caches)
     • Can be used as a push/pull message queue
     • http://redis.io/documentation
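The push/pull queue usage can be sketched without a server. The Python class below does not talk to Redis; it only mimics, in memory, the behaviour of the Redis list commands LPUSH (push onto the head) and RPOP (pop from the tail), which together give first-in, first-out delivery. The `ListQueue` name is an assumption made for this example.

```python
from collections import deque


class ListQueue:
    """In-memory imitation of a Redis list used as a FIFO queue."""

    def __init__(self):
        self._items = deque()

    def lpush(self, item):
        """Push onto the head; returns the new length, as LPUSH does."""
        self._items.appendleft(item)
        return len(self._items)

    def rpop(self):
        """Pop from the tail; returns None when empty (Redis returns nil)."""
        return self._items.pop() if self._items else None


queue = ListQueue()
queue.lpush("job-1")   # producer side
queue.lpush("job-2")

first = queue.rpop()   # consumer side: oldest job comes out first
second = queue.rpop()
empty = queue.rpop()   # nothing left
```

With a real Redis server the producer and consumer would be separate processes sharing one list key, which is what makes the pattern useful as a lightweight message queue.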
  8. MongoDB
     Document Database
     • A popular document database
     • Data is stored on disk but cached in memory for speed
     • Supports replication and partitioning (sharding)
     • Very popular in web applications
     • Data is stored internally as BSON and exchanged with applications as JSON
     • Very easy to set up and get started
     • Server source code is publicly available; free to use (even commercially), with a paid support license option
     • http://docs.mongodb.org/
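A flexible schema means documents in one collection need not share the same fields. The Python sketch below shows the idea with plain JSON and an in-memory list. It is not the MongoDB driver API: `Collection`, `insert`, and `find` are hypothetical names, and only the `_id` field name follows MongoDB's convention.

```python
import json


class Collection:
    """Toy document collection: JSON in, JSON out, no fixed schema."""

    def __init__(self):
        self._docs = []

    def insert(self, json_text):
        doc = json.loads(json_text)
        doc.setdefault("_id", len(self._docs) + 1)  # assign an id if missing
        self._docs.append(doc)
        return doc["_id"]

    def find(self, **criteria):
        """Return documents whose fields equal all given criteria."""
        return [json.dumps(doc, sort_keys=True)
                for doc in self._docs
                if all(doc.get(k) == v for k, v in criteria.items())]


users = Collection()
users.insert('{"name": "Alice", "city": "Pune"}')
users.insert('{"name": "Bob", "city": "Austin", "languages": ["R", "Java"]}')

in_pune = [json.loads(text) for text in users.find(city="Pune")]
everyone = users.find()
```

The second document carries a `languages` field the first one lacks, and no schema change was needed to store it.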
  9. Cassandra
     Column Database
     • Wide-column model: each row key maps to a flexible set of columns, grouped into column families (tables)
     • Supports partitioning (sharding) and replication, even across data centers
     • Can be used to store petabytes of data
     • Supports CQL, an SQL-like query interface
     • Open source, with commercial support from DataStax
     • https://cassandra.apache.org/
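Partitioning plus replication is often explained with a token ring: each node owns a position on the ring, a key is hashed to a position, and the next few nodes clockwise hold its replicas. The Python sketch below is a simplified illustration of that idea only. The `Ring` class, the MD5-based `token` function, and the node names are assumptions for this example and do not reproduce Cassandra's actual partitioner or replication strategies.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (MD5 chosen for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy token ring with a fixed replication factor."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas_for(self, key):
        """Return the nodes that store `key`, walking clockwise on the ring."""
        size = len(self._ring)
        start = bisect.bisect_right(self._tokens, token(key)) % size
        count = min(self.replication_factor, size)
        return [self._ring[(start + i) % size][1] for i in range(count)]


ring = Ring(["dc1-n1", "dc1-n2", "dc2-n1", "dc2-n2"])
owners = ring.replicas_for("user:42")
```

Because placement depends only on the hash, any client can compute where a key lives without asking a central coordinator, which is what lets this design scale horizontally.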
  10. Elasticsearch
      Search Database
      • Full-text search engine based on the Apache Lucene full-text search library
      • NoSQL data store for structured and unstructured data
      • Open source, with commercial support from http://elastic.co/, and part of the ELK (Elasticsearch-Logstash-Kibana) product stack
      • Full-text search as well as structured query support
      • All interaction is via REST APIs (API bindings are available for all major languages)
      • Supports fault tolerance and automatic failover, as well as data replication, out of the box
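Full-text search engines are built around an inverted index, which maps each term to the documents containing it. The Python sketch below shows the core idea only. It is not the Elasticsearch or Lucene API: `InvertedIndex`, `add`, and `search` are hypothetical names, and real engines add analysis, ranking, and distribution on top.

```python
import re
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: term -> set of document ids."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _terms(text):
        return re.findall(r"\w+", text.lower())

    def add(self, doc_id, text):
        for term in self._terms(text):
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every term in the query."""
        terms = self._terms(query)
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result


index = InvertedIndex()
index.add(1, "Cassandra is a NoSQL column database")
index.add(2, "Elasticsearch is a search database")
index.add(3, "MongoDB is a NoSQL document database")

nosql_dbs = index.search("nosql database")
search_dbs = index.search("search")
none_found = index.search("graph")
```

Looking up a term is a dictionary access rather than a scan of every document, which is why this structure suits full-text queries.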
  11. END.
      Contact Info.
      Email: bhaskarvk AT gmail DOT com
      LinkedIn: https://www.linkedin.com/in/bhaskarvk