NoSQL – Back to the Future or Yet Another DB Feature?

A deconstruction of NoSQL – all carried out by an arrogant guy.

Martin Scholl

May 30, 2012

Transcript

  1. “For those of you who think we are engaged in some sort of darwinian processes that make things better for us, it’s actually quite the opposite.” – Alan Kay, 2011 http://bit.ly/AlanKay2011
  2. infinipool: NoSQL – Back to the Future or Yet Another DB Feature? A deconstruction of NoSQL – all carried out by an arrogant guy.
  5. On Database History and NoSQL

    Martin Scholl, infinipool GmbH, martin@infinipool.com, @zeit_geist

    Disclaimer: What follows are opinion statements by an otherwise unimportant guy. Pictures are copyrighted by their respective owners.

    Studied all the different *SQL-Systems out there. Still having data issues. (Dr. Faustus)

    I am the spirit that denies NoSQL. (Mephisto. That’s me.)
  6. NoSQL vs Reality

    • Data is scattered all over NoSQL land!
    • No (simple) way to ensure various quality domains of data:
      • timeliness and appropriateness
      • correctness and consistency
    • Data Integration and Data Quality assurance become a full-stack concern!
  8. Calvin: Fast Distributed Transactions for Partitioned Database Systems [1]

    No Excuses!

    [1] http://cs-www.cs.yale.edu/homes/dna/papers/calvin-sigmod12.pdf
  9. MySQL Cluster 7.2 (preview)

    • 30-node MySQL cluster
    • sporting 19.5m update transactions per second [1]
    • yet, it’s Oracle, and we all know it’s a benchmark business.

    [1] http://mikaelronstrom.blogspot.co.uk/2012/05/mysql-cluster-727-achieves-1bn-update.html
  10. Good old Pre-SQL Times: The IBM 704 Filesystems and Databases. (c) Lawrence Livermore National Laboratory
  11. Pre-SQL Databases: Files

    • “Data is stored in files with interface between programs and files”
    • Separation and Isolation: Every program has its own files and formats
    • Duplication, Synchronization, Consistency: Programs share data. Data is not necessarily synchronized or in a consistent state.
    • Weak Security, High Maintenance Costs

    http://www.comphist.org/computing_history/new_page_9.htm
  12. Databases over Files

    | | Databases over Files (1960’s) | NoSQL DBs (Cassandra, HBase, Riak, etc.) |
    |---|---|---|
    | Separation & Isolation | Every program has its own files and formats | Every Data Store has its own APIs and Data Models |
    | Duplication, Synchronization, Consistency | Programs share data. Data not necessarily synchronized or consistent | Content transferred into Hadoop. Limited consistency by data model |
    | Security, Maintenance Costs | Almost no security; manual data processes | Almost no security; specialized personnel required |
  13. Edgar Frank ‘Ted’ Codd

    • Landmark Paper: “A Relational Model of Data for Large Shared Data Banks”
    • Father of Relational Database Management Systems
    • Basically invented what Twitter and FB run on
    • Now a $12B+ business
    • We owe him more than applause.
  14. Relational Database Model: The Good Parts

    • Key Insight: Separate Logical Data Model from Physical Data Storage
    • Radical Simplification of Data Access
    • A phenomenal tool was introduced: Joins
    • great for “single data insert, multiple views of data”
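
The “single data insert, multiple views of data” point can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module and an in-memory database; the `users` and `posts` tables and the helper names are hypothetical and not taken from the talk. Each fact is inserted once, and two different joins derive two different views of it.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE posts (
            id      INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id),
            title   TEXT NOT NULL
        );
        -- Every fact is stored exactly once.
        INSERT INTO users (id, name) VALUES (1, 'ada'), (2, 'bob');
        INSERT INTO posts (id, user_id, title) VALUES
            (1, 1, 'On joins'), (2, 1, 'On views'), (3, 2, 'On files');
        """
    )
    return conn


def posts_with_authors(conn: sqlite3.Connection) -> list:
    """View 1: each post together with its author's name."""
    return conn.execute(
        "SELECT p.title, u.name FROM posts p "
        "JOIN users u ON u.id = p.user_id ORDER BY p.id"
    ).fetchall()


def post_counts(conn: sqlite3.Connection) -> list:
    """View 2: number of posts per user, derived from the same rows."""
    return conn.execute(
        "SELECT u.name, COUNT(p.id) FROM users u "
        "LEFT JOIN posts p ON p.user_id = u.id "
        "GROUP BY u.id, u.name ORDER BY u.id"
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(posts_with_authors(db))
    print(post_counts(db))
```

Neither query knows how the rows are laid out on disk, which is the logical/physical separation the slide refers to.
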
  15. Relational Databases and NoSQL

    | | Relational Databases | NoSQL DBs (Cassandra, HBase, Riak, etc.) |
    |---|---|---|
    | Logical & Physical Data Model | Separated | Complected |
    | Duplication, Synchronization, Consistency | Normalization; Constraints for improved data quality | Denormalization; Data Quality an Application-level Problem |
    | Downsides | Scalability Issues; some DBMSs quite expensive | Almost no security; Specialized personnel required |
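
As a rough illustration of “constraints for improved data quality”, the following sketch declares the rules once in the schema and lets the database reject bad rows. It again uses Python's standard `sqlite3` module; the `customers` and `orders` tables and the `try_insert` helper are hypothetical names chosen for this example. Without such constraints, each of these checks would have to be re-implemented in every application that writes the data.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """In-memory database with declarative constraints (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        PRAGMA foreign_keys = ON;  -- SQLite enforces foreign keys only when enabled
        CREATE TABLE customers (
            id    INTEGER PRIMARY KEY,
            email TEXT NOT NULL UNIQUE
        );
        CREATE TABLE orders (
            id           INTEGER PRIMARY KEY,
            customer_id  INTEGER NOT NULL REFERENCES customers(id),
            amount_cents INTEGER NOT NULL CHECK (amount_cents > 0)
        );
        """
    )
    return conn


def try_insert(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = open_store()
    add_customer = "INSERT INTO customers (id, email) VALUES (?, ?)"
    add_order = "INSERT INTO orders (id, customer_id, amount_cents) VALUES (?, ?, ?)"
    print(try_insert(db, add_customer, (1, "a@example.com")))  # valid row
    print(try_insert(db, add_customer, (2, "a@example.com")))  # duplicate email
    print(try_insert(db, add_order, (1, 99, 500)))             # unknown customer
    print(try_insert(db, add_order, (2, 1, -5)))               # non-positive amount
```
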
  20. So far: Every HW iteration a new DB Technology

    • 1964: It’s Mainframes all over. Software is not a product. Databases over Files.
    • 1980: It’s Minicomputers all over. Software becomes a product. Relational DBMS + SQL.
    • 2012: It’s Cloud Computing all over. Software becomes a Service. NoSQL?

    Is Cloud Computing a backlash? Will NoSQL prevail?
  21. Changing Issues in Data Management

    • Scalability of data storage and transactional access is solved. Everybody can (soon) rent the perfect data storage system in the cloud.
    • Issue #1: Data-Integration an open task
    • Issue #2: Data-Quality an open task
    • Issue #3: Push-based execution model: where art thou?
  22. Changing Issues in Data Management

    • Scalability of data storage and transactional access is solved. Everybody can rent the perfect data storage system in the cloud.
    • Issue #1: Data-Integration an open task
    • Issue #2: Data-Quality an open task
    • Issue #3: Push-based execution model: where art thou?
    • The new competitive frontier: Timeliness, Data Integration and Quality
  24. Claim #1: NoSQL will become yet another DB Feature and/or Cloud Computing Service.

    Claim #2: NoSQL Technology is a step back.

    Claim #3: PostSQL Databases will be indistinguishable from Data Communication Services.