Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Databases: Beyond the SQL/NoSQL Debate

Geoffrey Litt
September 28, 2013

Databases: Beyond the SQL/NoSQL Debate

Seth Thompson and I gave this talk at the first HackNY Masters mini-conference, held at the HackNY Fall 2013 hackathon.

Geoffrey Litt

September 28, 2013
Tweet

More Decks by Geoffrey Litt

Other Decks in Programming

Transcript

  1. Is there a trendy new database that everyone’s talking about

    on Hacker News? Use the trendy new database! Yes Use whatever I used last time, I guess. No
  2. What are different ways to organize data? SQL Relational Database

    Management System (RDBMS) PostgreSQL MySQL Oracle SQL Server DB2 NoSQL Document Store MongoDB CouchDB Key-Value Store Redis Cassandra Graph Database Distributed Analytics Etc.
  3. Fundamental Concepts of Databases Transaction a unit of work performed

    within a database management system Table / Collection a grouping of records Tuple / Row / Document a single record of related data Schema a definition of attributes and relations, imposed on records ACID a set of properties that guarantee that transactions are processed reliably
  4. Atomicity (ACID) Every transaction is all or nothing If one

    part of the transaction fails, the entire transaction fails account_1 $10 account_2 $0 Transaction to transfer $10 between two accounts
  5. Atomicity (ACID) Every transaction is all or nothing If one

    part of the transaction fails, the entire transaction fails account_1 $0 account_2 $0 Transaction to transfer $10 between two accounts
  6. Consistency (ACID) Every transaction brings the database from one valid

    state to another phone_numbers 0123456789 9876543210 01234567890 Inconsistent insertion in column with custom data type checking
  7. Isolation (ACID) No transaction can interfere with another In other

    words, the effects of concurrent transactions should be the same as if the transactions were run serially in arbitrary order account_1 $10 Process 1 val = account_1 val += $10 account_1 = val Process 2 val = account_1 val += $10 account_1 = val
  8. Durability (ACID) Once a transaction has been committed it will

    remain so, even in the event of power loss or system failure
  9. PostgreSQL: Atomicity & Durability first_name last_name phone_number xnum Seth Thompson

    1111111111 2384 Geoffrey Litt 2222222222 3859 contacts table transactions table xnum status
  10. PostgreSQL: Atomicity & Durability contacts table transactions table xnum status

    9001 in_progress first_name last_name phone_number xnum Seth Thompson 1111111111 2384 Geoffrey Litt 2222222222 3859
  11. PostgreSQL: Atomicity & Durability contacts table transactions table xnum status

    9001 in_progress first_name last_name phone_number xnum Seth Thompson 1111111111 2384 Geoffrey Litt 2222222222 3859 Ramsey Nasser 3333333333 9001
  12. PostgreSQL: Atomicity & Durability contacts table transactions table xnum status

    9001 in_progress first_name last_name phone_number xnum Seth Thompson 1111111111 2384 Geoffrey Litt 2222222222 3859 Ramsey Nasser 3333333333 9001 Alex Qin 4444444444 9001
  13. PostgreSQL: Atomicity & Durability contacts table transactions table xnum status

    9001 in_progress first_name last_name phone_number xnum Seth Thompson 1111111111 2384 Geoffrey Litt 2222222222 3859 Ramsey Nasser 3333333333 9001 Alex Qin 4444444444 9001 Omar Omran 5555555555 9001
  14. PostgreSQL: Atomicity & Durability contacts table transactions table xnum status

    9001 in_progress first_name last_name phone_number xnum Seth Thompson 1111111111 2384 Geoffrey Litt 2222222222 3859 Ramsey Nasser 3333333333 9001 Alex Qin 4444444444 9001 Omar Omran 5555555555 9001 All writes persisted to disk!
  15. PostgreSQL: Atomicity & Durability contacts table transactions table xnum status

    9001 committed first_name last_name phone_number xnum Seth Thompson 1111111111 2384 Geoffrey Litt 2222222222 3859 Ramsey Nasser 3333333333 9001 Alex Qin 4444444444 9001 Omar Omran 5555555555 9001
  16. NoSQL: Schema Flexibility Pros Plan for uncertain requirements and change

    schemas on the fly Store data records with different attributes Avoid slow JOINs Cons Schema enforces data validity, type checking Many complex data structures can be nicely modeled using a schema
  17. MongoDB: Durability MongoDB write benchmark results (version 2.4.1) Source: https://blog.serverdensity.com/mongodb-benchmarks/

    “Write concern” level Average write time Unacknowledged 0.07ms Acknowledged 0.13ms Journaled to HDD 100x slowdown when you enforce durability! 34.19ms
  18. Customers should be able to view and add items to

    their shopping cart even if… data centers are being destroyed by tornadoes. Source: “Dynamo: Amazon’s Highly Available Key-value “
  19. DynamoDB: Consistency Amazon avoids downtime at ALL costs. Consistency problems

    are rare and can be dealt with by customer service. Plus, good to keep items in shopping carts :)
  20. Things to think about What components of ACID do you

    need? Does schema flexibility make sense for your application? Do you need massive scale? (Probably not) Codebase maturity: old isn’t bad!
  21. The future Grey lines between types of databases PostgreSQL has

    a JSON column type id (INT) first_name (VARCHAR) last_name (VARCHAR) settings (JSON) 1 Seth Thompson {“setting1”: true, “setting2”: [“a”, “b”... 2 Geoffrey Litt {“setting1”: false, “setting2”: [“d”, “c”... 3 Ramsey Nasser {“setting1”: false, “setting2”: [“e”, “f”... 4 Alex Qin {“setting1”: true, “setting2”: [“g”, “h”... 5 Omar Omran {“setting1”: true, “setting2”: [“q”, “s”...
  22. The future 90% of data that exists in the world

    has been generated in the past two years More database choices; choice is good!