Data simplicity with NoSQL Databases

At the first forLoop Abuja developer event, Shuaib presented a talk based on his experience using NoSQL databases.

forLoop

June 28, 2016

Transcript

  1. Shuaib Afegbua. Recovering software developer.
    Day Job: Information Systems Specialist @ Cambridge Education.
    Night Job: Chief Hacker, Code Kraft.
    Twitter: @afegbuas
  2. Content
    • What is NoSQL?
    • Why NoSQL?
    • Types of NoSQL database
    • Example use cases
    • Data modelling
    • Explanation - sharding, replication, aggregation, performance, security
    • The SQL vs NoSQL myth
    • When to use NoSQL
    • Resources and references
  3. OK! What is NoSQL?
    - Not Only SQL
    - Generally refers to ‘anything’ outside the relational database paradigm
    - Gained traction around 2006 with the increase in data volume; think Amazon, Google, Facebook
  4. SO! Now compare to before? Imagine: the first commercial release of an RDBMS was Oracle, in 1979. WTF!
    • Shift towards a digital economy powered by the internet
    • Large numbers of concurrent users
    • Increase in volume of data
    • IoT – the internet is connecting everything
    • Cloud-based systems
    • Everything is going mobile
  5. SQL DATABASES
    • Tabular and structured storage
    • Predefined schema
    • Use joins to retrieve related data
    • Difficult and expensive to scale - scale vertically/up
    • Atomic transactions
    • Encourage normalisation of data
    • Enforce data integrity rules
  6. NoSQL DATABASES
    • JSON-like storage, graphs, key-value pairs
    • Flexible/dynamic schema
    • Embedded documents over relationships/normalised data
    • Easy to scale - scale horizontally/out
    • No guarantee of atomic transactions
    • Distributed
    • Mostly open source
  7. SO! Why NoSQL?
    ★ Handling huge volumes of changing data
    ★ Need for agile development sprints and multiple iterations
    ★ Operational issues and scaling - scale-out architecture instead of expensive, monolithic architecture
    ★ Speed and the need for 24x7 availability
  8. 3 Operational scaling
    Scaling out gives these benefits:
    (a) deploying no more hardware than is required to meet the current load;
    (b) leveraging less expensive hardware and/or cloud infrastructure; and
    (c) scaling on demand and without downtime.
    Source: Couchbase
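
Scaling out usually means spreading keys across several inexpensive servers (sharding, listed in the talk's contents). The sketch below is only an illustration of that idea, not something shown in the deck: `shard_for` and `ShardedStore` are hypothetical names, and each "server" is simulated with a plain in-process dictionary.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (illustrative choice)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy scale-out store: each dict stands in for one inexpensive server."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        # Only the shard that owns the key is touched.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = ShardedStore(num_shards=4)
for i in range(100):
    store.put(f"user:{i}", {"id": i})

# Each shard holds only part of the data set.
sizes = [len(shard) for shard in store.shards]
```

Plain modulo hashing like this moves most keys when the number of shards changes, so real systems tend to use schemes such as consistent hashing to add servers without downtime.
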
  9. 4 Handling huge volume of changing data
    • 2.5 quintillion bytes of data daily (2.5 × 10^18, or about 2.5 million trillion) => 10 million Blu-ray discs
    • The past 3 years have accounted for over 90% of the data
    • Mostly unstructured and semi-structured
  10. 5 24x7 availability needs
    single server/or as a cluster and the shared storage failure
    Photo credits: Couchbase
  12. Who uses NoSQL
    • ME - CouchDB, MongoDB, and falling in love with OrientDB
    • Amazon - DynamoDB
    • Google - Bigtable
    • eBay/US National Archives - MongoDB
    • Facebook/Instagram/Apple/Netflix - Cassandra
    • PayPal/Verizon/Tesco
    • Airbus - Oracle NoSQL
    • Twitter - FlockDB
  13. Categories of NoSQL DBs
    • Key-value store: Redis, Memcached, RocksDB
    • Document oriented: CouchDB, Couchbase, MongoDB, OrientDB, RethinkDB, RavenDB
    • Graph database: Neo4j, OrientDB, BrightstarDB, FlockDB
    • Wide column: Cassandra, HBase, Amazon SimpleDB, DynamoDB
    - Multi-model databases: OrientDB, ArangoDB, etc.
    - Special cases: embedded (PouchDB & LokiJS) and object databases (ObjectDB)
    Source: http://nosql-database.org
  14. How do you model and query your data?
    Schema-less? Idiomatic? Thrift/REST API? Relationships? References? SQL-like?
  15. Key-Value Stores
    • Probably the simplest and most basic type
    • Store pairs of keys and values, and retrieve a value when its key is known
    • Speed
    • Most are in-memory
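
The key-value model can be sketched in a few lines. This is a toy in-memory store, not Redis or any real product; the class name `KeyValueStore`, the optional `ttl` argument and the injectable `clock` are assumptions added here to show the caching use case.

```python
import time


class KeyValueStore:
    """Toy in-memory key-value store with optional expiry (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock  # injectable so expiry can be simulated

    def set(self, key, value, ttl=None):
        """Store a value; ttl is a lifetime in seconds, None means no expiry."""
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        """Return the value for a known key, or default if missing or expired."""
        item = self._data.get(key)
        if item is None:
            return default
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # drop expired entries lazily on read
            return default
        return value

    def delete(self, key):
        """Remove a key; returns True if it was present."""
        return self._data.pop(key, None) is not None


cache = KeyValueStore()
cache.set("session:42", {"user": "shuaib"}, ttl=30)
session = cache.get("session:42")
```

Lookups go straight from key to value with no query planning or joins, which is where the speed of this category comes from.
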
  16. Redis - Key-Value Store
    “Redis is an open source, in-memory Data Structure Store, used as a database, a caching layer or a message broker.” - Redis website
    • Extremely fast
    • Strings or list storage
    • Optional persistence to disk, not just memory
    • Replication
    • Multiple language clients
    • Supports transactions but no rollback
    Use cases: cache systems, queues, recent listings, etc.
  17. Redis - resources
    • https://redis.io
    • Redis Labs - redislabs.com
    • Building NoSQL Apps With Redis - https://www.pluralsight.com/courses/building-nosql-apps-redis
    • Internet search/YouTube
  18. SQL to Mongo Mapping Chart
    SQL                          | MongoDB
    database                     | database
    table                        | collection
    row                          | document
    column                       | field
    index                        | index
    joins                        | embedded/linking
    aggregation (e.g. group by)  | aggregation pipeline
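
To make the "joins become embedded documents" row concrete, here is a small sketch using plain Python lists and dictionaries instead of a real database driver. The customer and order data and the function names are made up for illustration.

```python
# Relational style: two "tables" related by a foreign key.
customers = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Bola"},
]
orders = [
    {"id": 10, "customer_id": 1, "total": 250},
    {"id": 11, "customer_id": 1, "total": 100},
    {"id": 12, "customer_id": 2, "total": 75},
]


def join_orders(customers, orders):
    """Emulate a join: look up each order's customer by foreign key."""
    by_id = {c["id"]: c for c in customers}
    return [
        {"name": by_id[o["customer_id"]]["name"], "total": o["total"]}
        for o in orders
    ]


# Document style: orders are embedded inside each customer document.
customer_docs = [
    {"_id": 1, "name": "Ada", "orders": [{"total": 250}, {"total": 100}]},
    {"_id": 2, "name": "Bola", "orders": [{"total": 75}]},
]


def total_per_customer(docs):
    """Emulate a group-by: sum the embedded order totals per document."""
    return {d["name"]: sum(o["total"] for o in d["orders"]) for d in docs}


joined = join_orders(customers, orders)
totals = total_per_customer(customer_docs)
```

With embedding, everything about one customer is read in a single lookup. The trade-off is that data shared between documents may be duplicated rather than normalised.
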
  19. CouchDB - brief overview
    • REST API
    • Single document tant
    • Simple and easy to use
    • Super replication
    • Low latency
  20. OrientDB - Graph databases
    • Both document oriented and graph
    • Dynamic schema
    • Extended SQL syntax
    • Inheritance and polymorphism
    • Language API and REST
  21. OrientDB - Graph databases
    • Stores data as nodes/vertices connected with links
    • Both document oriented and graph
    • Dynamic schema and extended SQL syntax
    • Inheritance and polymorphism
    • Language API and REST
    • Multi-master replication
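
The "nodes/vertices connected with links" idea can be sketched with a tiny in-memory graph. This is not OrientDB's API; the `Graph` class, its method names and the sample people are hypothetical and only show how relationships are followed directly instead of being joined.

```python
from collections import deque


class Graph:
    """Toy property graph: vertices carry properties, edges carry a label."""

    def __init__(self):
        self.vertices = {}  # vertex id -> properties (a small document)
        self.edges = {}     # vertex id -> list of (label, target id)

    def add_vertex(self, vid, **props):
        self.vertices[vid] = props
        self.edges.setdefault(vid, [])

    def add_edge(self, src, label, dst):
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both vertices must exist before linking them")
        self.edges[src].append((label, dst))

    def neighbours(self, vid, label):
        """Vertices directly linked from vid by edges with this label."""
        return [dst for lbl, dst in self.edges[vid] if lbl == label]

    def reachable(self, start, label):
        """Breadth-first walk along edges with this label, in visit order."""
        seen, order, queue = {start}, [], deque([start])
        while queue:
            current = queue.popleft()
            for nxt in self.neighbours(current, label):
                if nxt not in seen:
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append(nxt)
        return order


g = Graph()
for person in ("ada", "bola", "chidi"):
    g.add_vertex(person, kind="person")
g.add_edge("ada", "follows", "bola")
g.add_edge("bola", "follows", "chidi")
followed = g.reachable("ada", "follows")
```

Questions like "who can Ada reach through follows links" become a traversal over stored links, which is the kind of query that suits social network use cases.
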
  22. Wide Column - Cassandra
    • Columns are created for each row rather than being predefined by the table structure
    • Column family => RDBMS table; keyspace => database
  23. Wide Column oriented
    1. Extremely powerful
    2. Highly scalable
    3. More of an advanced key-value store
    4. Handles huge amounts of information
    5. Increased granularity – being able to update an individual column
    6. Cassandra uses an SQL-like language -> CQL
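
A rough way to picture "columns are created for each row" and "update an individual column" is a dictionary of rows where each row keeps its own dictionary of columns. The `ColumnFamily` class below is a hypothetical toy model and says nothing about how Cassandra or HBase store data internally.

```python
class ColumnFamily:
    """Toy wide-column model: every row owns its own set of columns."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # row key -> {column name: value}

    def put(self, row_key, column, value):
        """Write one column of one row without touching the others."""
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column, default=None):
        return self.rows.get(row_key, {}).get(column, default)

    def columns(self, row_key):
        """Column names that exist for this particular row."""
        return sorted(self.rows.get(row_key, {}))


users = ColumnFamily("users")
users.put("u1", "name", "Ada")
users.put("u1", "email", "ada@example.com")
users.put("u2", "name", "Bola")
users.put("u2", "last_login", "2016-06-28")

# Update a single column; the row's other columns are left alone.
users.put("u1", "email", "ada@example.org")
```

Rows `u1` and `u2` end up with different columns, which is the sense in which this category behaves like an advanced key-value store.
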
  24. Column oriented - Resources
    • Cassandra - planetcassandra.com
    • http://cassandra.apache.org/
    • Internet search/YouTube -> wide column database
    • https://hbase.apache.org/
    • https://academy.datastax.com/
  25. The SQL vs NoSQL Myth
    • Supersedes and is better
    • They are opposite sides of the same coin
    • The language/framework determines the database most times
  26. Choosing NoSQL
    • Mobile backends
    • Internet of Things
    • Single view/360 customer view situation
    • Content management
    • Fraud detection
  27. Choosing NoSQL
    • Profile management
    • Catalog
    • Caching and queueing
    • Real-time analytics
    • Social networks
  28. NoSQL databases will evolve and get better. But they are suitable for ‘specific cases’.
    NoSQL != Replace(SQL DB)
  29. Well! POSTGRES is the greatest.
    This is a highly opinionated view from an aspiring SENIOR PROGRAMMER. NoSQL databases have their place and specific use cases.