
Data simplicity with NoSQL Databases

At the first forLoop Abuja developer event, Shuaib presented a talk based on his experience using NoSQL databases.

forLoop

June 28, 2016

Transcript

  1. Shuaib Afegbua. Recovering software developer. Day job: Information Systems Specialist @ Cambridge Education. Night job: Chief Hacker, Code Kraft. Twitter: @afegbuas
  2. Content • What is NoSQL? • Why NoSQL? • Types of NoSQL database • Example use cases • Data modelling • Explanation: sharding, replication, aggregation, performance, security • The SQL vs NoSQL myth • When to use NoSQL • Resources and references
  3. OK! What is NoSQL? - Not Only SQL - Generally refers to ‘anything’ outside the relational database paradigm - Gained traction around 2006 as data volumes grew; think Amazon, Google, Facebook
  4. SO! Now compared to before? Imagine: the first commercial release of an RDBMS was Oracle, in 1979. WTF! • Shift towards a digital economy powered by the internet • Large numbers of concurrent users • Increase in volume of data • IoT: the internet is connecting everything • Cloud-based systems • Everything is going mobile
  5. SQL DATABASES • Tabular and structured storage • Predefined schema • Use joins to retrieve related data • Difficult and expensive to scale: scale vertically (up) • Atomic transactions • Encourage normalisation of data • Enforce data integrity rules
  6. NoSQL DATABASES • JSON-like storage, graphs, key-value pairs • Flexible/dynamic schema • Embedded documents over relationships/normalised data • Easy to scale horizontally (out) • No guarantee of atomic transactions • Distributed • Mostly open source
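The contrast between joins on normalised data and embedded documents can be sketched in plain Python. This is only an illustration: the `authors`, `posts` and `author_doc` records and both helper functions are hypothetical, and no real database is involved.

```python
# Hypothetical data: names and fields are illustrative only.

# Relational style: two "tables" linked by a foreign key (author_id).
authors = [{"id": 1, "name": "Ada"}]
posts = [
    {"id": 10, "author_id": 1, "title": "Hello NoSQL"},
    {"id": 11, "author_id": 1, "title": "Scaling out"},
]


def posts_with_author(authors, posts):
    """Emulate a join: resolve each post's author through the foreign key."""
    by_id = {author["id"]: author for author in authors}
    return [
        {"title": post["title"], "author": by_id[post["author_id"]]["name"]}
        for post in posts
    ]


# Document style: the posts are embedded inside the author document,
# so related data is read in one lookup with no join.
author_doc = {
    "_id": 1,
    "name": "Ada",
    "posts": [{"title": "Hello NoSQL"}, {"title": "Scaling out"}],
}


def titles_from_doc(doc):
    """Read the embedded posts straight out of the document."""
    return [post["title"] for post in doc["posts"]]


joined = posts_with_author(authors, posts)
embedded = titles_from_doc(author_doc)
```

Both approaches return the same titles; the difference is where the relationship lives (a key resolved at query time versus nesting decided at write time).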
  7. SO! Why NoSQL? ★ Handling huge volumes of changing data ★ Need for agile development sprints and multiple iterations ★ Operational issues and scaling: a scale-out architecture instead of an expensive, monolithic one ★ Speed and the need for 24x7 availability
  8. 3 Operational scaling. Scaling out gives these benefits: (a) deploying no more hardware than is required to meet the current load; (b) leveraging less expensive hardware and/or cloud infrastructure; and (c) scaling on demand and without downtime. Source: Couchbase
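One common way to scale out is to shard keys across several inexpensive nodes. The sketch below assumes simple hash-modulo placement; the node names and the `shard_for` helper are hypothetical and do not reflect how any particular product assigns data.

```python
import hashlib


def shard_for(key, nodes):
    """Pick a node for a key by hashing the key (hash-modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical three-node cluster.
nodes = ["node-a", "node-b", "node-c"]
placement = {
    key: shard_for(key, nodes)
    for key in ("user:1", "user:2", "user:3", "user:4")
}
```

The same key always lands on the same node, so reads know where to look. Plain modulo placement moves many keys when a node is added or removed, which is why real systems tend to use schemes such as consistent hashing.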
  9. 4 Handling huge volume of changing data • 2.5 quintillion bytes of data daily (2.5 × 10^18, or 2.5 million trillion) => 10 million Blu-ray discs • The past 3 years have accounted for over 90% • Mostly unstructured and semi-structured
  10. 5 24x7 availability needs single server/or as a cluster and the shared storage failure. Photo credits: Couchbase
  12. Who uses NoSQL • ME - CouchDB, MongoDB and falling in love with OrientDB • Amazon - DynamoDB • Google - Bigtable • eBay/US National Archives - MongoDB • Facebook/Instagram/Apple/Netflix - Cassandra • PayPal/Verizon/Tesco • Airbus - Oracle NoSQL • Twitter - FlockDB
  13. Categories of NoSQL DBs • Key-value store: Redis, Memcached, RocksDB • Document oriented: CouchDB, Couchbase, MongoDB, OrientDB, RethinkDB, RavenDB • Graph database: Neo4J, OrientDB, BrightstarDB, FlockDB • Wide column: Cassandra, HBase, Amazon SimpleDB, DynamoDB - Multi-model databases: OrientDB, ArangoDB etc. - Special cases: embedded (PouchDB & LokiJS) and object databases (ObjectDB) Source: http://nosql-database.org
  14. How do you model and query your data? Schema-less? Idiomatic? Thrift/REST API? Relationships? References? SQL-like?
  15. Key-Value Stores • Probably the simplest and most basic type • Store pairs of keys and values, and retrieve a value when its key is known • Speed • Most are in-memory
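The store-and-retrieve-by-key idea can be sketched as a tiny in-memory class. `KeyValueStore` and its method names are hypothetical; real key-value stores add networking, persistence and expiry on top of the same basic contract.

```python
class KeyValueStore:
    """A minimal in-memory key-value store backed by a dict."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        """Store a value under a key, overwriting any previous value."""
        self._data[key] = value

    def get(self, key, default=None):
        """Return the value for a known key, or the default if absent."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove a key; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.set("session:42", {"user": "shuaib"})
session = store.get("session:42")
```

Lookups go straight to the key with no query planning, which is where the speed on the slide comes from.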
  16. Redis - Key-Value Store “Redis is an open source, in-memory Data Structure Store, used as a database, a caching layer or a message broker.” - Redis website • Extremely fast • Strings or list storage • Optional persistence to disk, not only memory • Replication • Multiple language clients • Supports transactions but no rollback Use cases: cache systems, queues, recent listings etc.
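The "recent listing" use case keeps only the newest N items in a list. With Redis this is typically done by pushing onto a list and trimming it (the `LPUSH` and `LTRIM` commands). The sketch below is a pure-Python stand-in that needs no server; `RecentList` is a hypothetical name, not a Redis client API.

```python
from collections import deque


class RecentList:
    """Keep only the newest max_items entries, newest first."""

    def __init__(self, max_items):
        # A bounded deque drops the oldest entry once it is full,
        # similar in effect to pushing and then trimming a list.
        self._items = deque(maxlen=max_items)

    def push(self, item):
        """Add an item at the front (the newest position)."""
        self._items.appendleft(item)

    def latest(self, count):
        """Return up to count items, newest first."""
        return list(self._items)[:count]


recent = RecentList(max_items=3)
for post_id in ["p1", "p2", "p3", "p4"]:
    recent.push(post_id)
newest = recent.latest(3)
```

After four pushes into a list capped at three, the oldest entry (`p1`) has been dropped.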
  17. Redis - resources • https://redis.io • Redis Labs - redislabs.com • Building NoSQL Apps With Redis - https://www.pluralsight.com/courses/building-nosql-apps-redis • Internet search/YouTube
  18. SQL to Mongo Mapping Chart (SQL => MongoDB) • database => database • table => collection • row => document • column => field • index => index • joins => embedded/linking • aggregation (e.g. group by) => aggregation pipeline
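The last row of the chart, `GROUP BY` versus an aggregation pipeline, can be illustrated without a running server. In this sketch the `orders` documents are hypothetical, `pipeline` is written in MongoDB's `$group` stage syntax purely for comparison, and `group_sum` is a plain-Python stand-in that computes what such a stage would produce.

```python
from collections import defaultdict

# Hypothetical documents in an "orders" collection.
orders = [
    {"status": "paid", "amount": 30},
    {"status": "paid", "amount": 20},
    {"status": "pending", "amount": 5},
]

# SQL equivalent:
#   SELECT status, SUM(amount) AS total FROM orders GROUP BY status
# The same grouping expressed as a MongoDB-style pipeline stage
# (shown for comparison only; it is not executed here):
pipeline = [{"$group": {"_id": "$status", "total": {"$sum": "$amount"}}}]


def group_sum(docs, key, field):
    """Group documents by one field and sum another, in plain Python."""
    totals = defaultdict(int)
    for doc in docs:
        totals[doc[key]] += doc[field]
    # Sorted by group key so the output order is deterministic.
    return [{"_id": k, "total": v} for k, v in sorted(totals.items())]


result = group_sum(orders, "status", "amount")
```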
  19. CouchDB - brief overview • REST API • Single document tant • Simple and easy to use • Super replication • Low latency
  20. OrientDB - Graph databases • Both document oriented and graph • Dynamic schema • Extended SQL syntax • Inheritance and polymorphism • Language API and REST
  21. OrientDB - Graph databases • Stores data as nodes/vertices connected with links • Both document oriented and graph • Dynamic schema and extended SQL syntax • Inheritance and polymorphism • Language API and REST • Multi-master replication
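Storing data as vertices connected by links makes relationship queries a matter of following edges. The sketch below uses a plain adjacency dict and a breadth-first walk; the people in `edges` and the `within_hops` helper are hypothetical and are not OrientDB's API or query syntax.

```python
from collections import deque

# Hypothetical "follows" graph: vertex -> list of linked vertices.
edges = {
    "ada": ["bola", "chidi"],
    "bola": ["dayo"],
    "chidi": ["dayo", "emeka"],
    "dayo": [],
    "emeka": [],
}


def within_hops(graph, start, max_hops):
    """Return vertices reachable from start in at most max_hops links."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        vertex, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(vertex, []):
            if neighbour not in seen:
                seen.add(neighbour)
                found.append(neighbour)
                queue.append((neighbour, hops + 1))
    return found


direct = within_hops(edges, "ada", 1)
two_hops = within_hops(edges, "ada", 2)
```

A "friends of friends" query is just a traversal with a hop limit of two, with no join tables involved.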
  22. Wide Column - Cassandra • Columns are created for each row rather than being predefined by the table structure • Column family => RDBMS table; keyspace => database
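The idea that each row carries its own columns can be modelled as a dict of dicts. `ColumnFamily` below is a hypothetical in-memory stand-in, not Cassandra's storage engine or driver API. It only shows that rows in one column family need not share columns and that a single column can be updated on its own.

```python
from collections import defaultdict


class ColumnFamily:
    """Rows keyed by row key; each row holds its own set of columns."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        """Write one column of one row; other columns are untouched."""
        self._rows[row_key][column] = value

    def get_row(self, row_key):
        """Return a copy of the row's columns (empty if the row is absent)."""
        return dict(self._rows.get(row_key, {}))


users = ColumnFamily()
users.put("u1", "name", "Ada")
users.put("u1", "email", "ada@example.com")
users.put("u2", "name", "Bola")
users.put("u2", "twitter", "@bola")              # a column only u2 has
users.put("u1", "email", "ada@new.example.com")  # update a single column
```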
  23. Wide Column oriented 1. Extremely powerful 2. Highly scalable 3. More of an advanced key-value store 4. Handles huge amounts of information 5. Increased granularity: being able to update an individual column 6. Cassandra uses an SQL-like language, CQL
  24. Column oriented - Resources • Cassandra - planetcassandra.com • http://cassandra.apache.org/ • Internet search/YouTube -> wide column database • https://hbase.apache.org/ • https://academy.datastax.com/
  25. The SQL vs NoSQL Myth • The myth: NoSQL supersedes SQL and is better • They are opposite sides of the same coin • Most times, the language/framework determines the database
  26. Choosing NoSQL • Mobile backends • Internet of things • Single view/360 customer view situation • Content management • Fraud detection
  27. Choosing NoSQL • Profile management • Catalog • Caching and queueing • Real-time analytics • Social networks
  28. NoSQL databases will evolve and get better. But they are suitable for ‘specific cases’. NoSQL != Replace(SQL DB)
  29. Well! POSTGRES is the greatest. This is a highly opinionated view from an aspiring SENIOR PROGRAMMER. NoSQL databases have their places and specific use cases.