NoSQL Distilled - Pramod Sadalage - Agile SG 2016

Presented at the Agile Singapore 2016 conference


Agile Singapore

October 06, 2016


  1. NoSQL Distilled: It's not a free lunch. #NoSQLDistilled @pramodsadalage pramod@thoughtworks.com

  2. Why RDBMS?

  3. ACID Transactions: Atomicity, Consistency, Isolation, Durability
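
To make the atomicity point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `account` table, the `transfer` function and the amounts are illustrative assumptions, not part of the talk. The second transfer credits one account and then fails a `CHECK` constraint on the debit, so the whole unit of work is rolled back.

```python
import sqlite3

# Hypothetical schema for illustration: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src, as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the overdraft; the credit is undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


first_ok = transfer(conn, 1, 2, 30)    # within the available balance
second_ok = transfer(conn, 1, 2, 500)  # would overdraw account 1
print(first_ok, second_ok, balances(conn))
```

After both calls only the first transfer is visible; the failed one leaves no partial credit behind.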

  4. Standard Query Interface (SQL)

    SELECT name, age, startdate FROM student

    SELECT startdate, count(*) FROM student GROUP BY startdate
  5. Interact with many languages

    Ruby:

    def exec_sql_return_rows(sql)
      rows = Array.new
      $db_connection.exec sql do |row|
        rows << row[0]
      end
      rows
    end

    Java:

    try {
      statement = connection.createStatement();
      resultSet = statement.executeQuery(sql);
      process(resultSet);
    } catch (SQLException e) {
      // deal with exception
    }
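
As one more language for the same idea, the sketch below runs the two queries from slide 4 through Python's standard `sqlite3` module. The in-memory database and the three sample students are assumptions made for illustration. Unlike the Ruby helper, this version returns whole rows rather than only the first column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, age INTEGER, startdate TEXT)")
# Sample rows are invented for this example.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Ann", 20, "2016-01-10"), ("Bob", 22, "2016-01-10"), ("Cy", 21, "2016-08-15")],
)


def exec_sql_return_rows(conn, sql):
    """Run a query and return every row as a tuple."""
    return conn.execute(sql).fetchall()


students = exec_sql_return_rows(conn, "SELECT name, age, startdate FROM student")
per_start = exec_sql_return_rows(
    conn,
    "SELECT startdate, count(*) FROM student GROUP BY startdate ORDER BY startdate",
)
print(students)
print(per_start)
```
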
  6. Everyone knows SQL

  7. Limitless indexing

  8. Handles many data models

  9. Why NoSQL

  10. Schema changes are hard

  11. Impedance mismatch

  12. Application vs Integration databases
    • Integration Database: common for all use cases (E-Commerce, Recommendation)
    • Application Database: specialized storage for each use case (E-Commerce, Recommendation), with service integration
  13. Running on clusters

  14. Unstructured data

  15. Uneven rate of data growth

  16. Domain Models

  17. Domain driven data models

  18. RDBMS data

  19. Aggregate model (Embedding objects)

  20. Representing Aggregate Data (diagram label: Key)

    // in customers
    {
      "id": 1,
      "name": "Martin",
      "billingAddress": [{"city": "Chicago"}],
      "orders": [
        {
          "id": 99,
          "orderItems": [
            {
              "productId": 27,
              "price": 32.45,
              "productName": "NoSQL Distilled"
            }
          ],
          "shippingAddress": [{"city": "Chicago"}],
          "orderPayment": [
            {
              "ccinfo": "1000-1000-1000-1000",
              "txnId": "abelif879rft",
              "billingAddress": {"city": "Chicago"}
            }
          ]
        }
      ]
    }
  21. Aggregate model (Referencing Objects)

  22. Representing Aggregate Data (diagram labels: Key, Key, Reference Key)

    // in Customers
    {
      "id": 1,
      "name": "Martin",
      "billingAddress": [{"city": "Chicago"}]
    }

    // in Orders
    {
      "id": 99,
      "customerId": 1,
      "orderItems": [
        {
          "productId": 27,
          "price": 32.45,
          "productName": "NoSQL Distilled"
        }
      ],
      "shippingAddress": [{"city": "Chicago"}],
      "orderPayment": [
        {
          "ccinfo": "1000-1000-1000-1000",
          "txnId": "abelif879rft",
          "billingAddress": {"city": "Chicago"}
        }
      ]
    }
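
The two aggregate shapes on slides 20 and 22 lead to different access code. The sketch below contrasts them, with plain dictionaries standing in for the stores and trimmed copies of the slide documents. The function names are hypothetical.

```python
# Embedded: one key fetches the whole aggregate, orders included.
customers_embedded = {
    1: {
        "id": 1,
        "name": "Martin",
        "orders": [
            {"id": 99, "orderItems": [{"productId": 27, "price": 32.45}]},
        ],
    }
}

# Referenced: customers and orders are separate aggregates linked by customerId.
customers = {1: {"id": 1, "name": "Martin"}}
orders = {
    99: {"id": 99, "customerId": 1, "orderItems": [{"productId": 27, "price": 32.45}]},
}


def order_total(order):
    return sum(item["price"] for item in order["orderItems"])


def totals_embedded(customer_id):
    """A single lookup by key; the orders arrive with the customer."""
    customer = customers_embedded[customer_id]
    return {o["id"]: order_total(o) for o in customer["orders"]}


def totals_referenced(customer_id):
    """Orders live elsewhere, so the application follows the reference key."""
    return {
        o["id"]: order_total(o)
        for o in orders.values()
        if o["customerId"] == customer_id
    }


print(totals_embedded(1), totals_referenced(1))
```

Embedding favours reads that always want the customer with its orders; referencing lets orders be read and written on their own, at the cost of a second lookup done by the application.
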
  23. Aggregate Orientation

  24. RDBMSs have no concept of aggregates

  25. Aggregates reduce the need for ACID

  26. Better for clusters, can be distributed easily

  27. Key-Value, Document, Column-Family

  28. Key Value Databases

  29. Key-Value Database
    • One key, one value
    • Value is opaque to the database
    • Like a hash
    • Some are distributed

    Oracle      Riak
    instance    cluster
    table       bucket
    row         key-value
    row-id      key
  30. “key” (VIN): JTTDR…
    “value” (car facts): … make#Ford model#Mustang year#2011
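
A minimal sketch of what "value is opaque to the database" means in practice. The `KeyValueStore` class, the bucket name and the placeholder key `VIN-EXAMPLE` are assumptions for illustration and are not Riak's API. The store only supports put, get and delete by key; parsing the `name#value` pairs is left to the application.

```python
class KeyValueStore:
    """Buckets of key -> opaque bytes; the store never looks inside a value."""

    def __init__(self):
        self._buckets = {}

    def put(self, bucket, key, value):
        if not isinstance(value, bytes):
            raise TypeError("value must be bytes; the store does not interpret it")
        self._buckets.setdefault(bucket, {})[key] = value

    def get(self, bucket, key):
        return self._buckets[bucket][key]

    def delete(self, bucket, key):
        del self._buckets[bucket][key]


def decode_car_facts(raw):
    """Application-side parsing of the 'name#value' pairs shown on the slide."""
    return dict(pair.split("#", 1) for pair in raw.decode("utf-8").split())


store = KeyValueStore()
# "VIN-EXAMPLE" is a placeholder key, not the VIN elided on the slide.
store.put("cars", "VIN-EXAMPLE", b"make#Ford model#Mustang year#2011")
facts = decode_car_facts(store.get("cars", "VIN-EXAMPLE"))
print(facts)
```

There is no way to ask this store for "all Fords": any query beyond lookup by key has to be built by the application.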

  31. Session Storage

  32. User Profiles/Preferences

  33. Shopping Cart

  34. Single user analytics

  35. Document Databases

  36. Document Database
    • One key, one value
    • Value is visible to the database
    • Value can be queried
    • JSON/XML documents

    Oracle      MongoDB
    instance    mongod
    schema      database
    table       collection
    row         document
    row_id      _id
  37. “id” (VIN): JTTDR…
    “document” (car facts): {… “make”: “Ford”, “model”: “Mustang”, “year”: 2011, … }
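
Because the database can see inside the value, it can filter on fields. The sketch below is a toy in-memory `Collection` with a hypothetical `find` method and invented sample documents; it is not MongoDB's API, only an illustration of querying by document content.

```python
class Collection:
    """Documents stored by _id, with equality queries on their fields."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc):
        self._docs[doc["_id"]] = doc

    def get(self, doc_id):
        return self._docs[doc_id]

    def find(self, **criteria):
        """Return documents whose fields equal every given criterion."""
        return [
            doc
            for doc in self._docs.values()
            if all(doc.get(field) == value for field, value in criteria.items())
        ]


cars = Collection()
# Placeholder ids and sample data, invented for this example.
cars.insert({"_id": "VIN-A", "make": "Ford", "model": "Mustang", "year": 2011})
cars.insert({"_id": "VIN-B", "make": "Ford", "model": "Focus", "year": 2014})
cars.insert({"_id": "VIN-C", "make": "Honda", "model": "Civic", "year": 2011})

fords = [doc["_id"] for doc in cars.find(make="Ford")]
ford_2011 = [doc["_id"] for doc in cars.find(make="Ford", year=2011)]
print(fords, ford_2011)
```
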
  38. Event Logging

  39. Prototype development

  40. eCommerce Application

  41. Content Management Applications

  42. Column-Family Databases

  43. Column-Family Database
    • Data organized as columns
    • Each row has a row key
    • Columns have versioned data
    • Row data is sorted by column name

    Oracle                        Cassandra
    instance                      cluster
    database                      keyspace
    table                         column-family
    row                           row
    columns same for every row    columns can be different for each row
  44. “id” (VIN): JTTDR…
    “column families” (car facts): {… “car”: {“make”: “Ford”, “model”: “focus”…} “service”: {…} }
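
The bullets on slide 43 (row key, versioned columns, sorting by column name, different columns per row) can be sketched as a nested mapping. The `ColumnFamilyStore` class, its counter-based versions and the sample values are assumptions for illustration, not Cassandra's API or storage format.

```python
import itertools
from collections import defaultdict


class ColumnFamilyStore:
    def __init__(self):
        # row key -> column family -> column name -> [(version, value), ...]
        self._rows = defaultdict(lambda: defaultdict(dict))
        self._clock = itertools.count(1)  # stand-in for write timestamps

    def put(self, row_key, family, column, value):
        versions = self._rows[row_key][family].setdefault(column, [])
        versions.append((next(self._clock), value))

    def get(self, row_key, family):
        """Latest value of every column in the family, sorted by column name."""
        columns = self._rows[row_key][family]
        return {name: columns[name][-1][1] for name in sorted(columns)}

    def history(self, row_key, family, column):
        """All stored versions of one column, oldest first."""
        return list(self._rows[row_key][family].get(column, []))


store = ColumnFamilyStore()
# "VIN-EXAMPLE" and the service entry are invented sample data.
store.put("VIN-EXAMPLE", "car", "model", "focus")
store.put("VIN-EXAMPLE", "car", "make", "Ford")
store.put("VIN-EXAMPLE", "service", "2016-10-06", "oil change")
store.put("VIN-EXAMPLE", "car", "model", "Focus ST")

car = store.get("VIN-EXAMPLE", "car")
print(car, store.history("VIN-EXAMPLE", "car", "model"))
```
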
  45. Large write volume

  46. Content Management

  47. eCommerce Application

  48. Graph Databases

  50. Graph Databases
    • Is a multi-relational graph
    • Relationships are first-class citizens
    • Traversal algorithms
    • Nodes and edges can have data (key-value pairs)
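
A small sketch of the ideas on slide 50: nodes and edges carry key-value properties, relationships are typed, and queries are traversals. The `Graph` class, the relationship names and the people are hypothetical; a real graph database would index relationships instead of scanning a list.

```python
from collections import deque


class Graph:
    def __init__(self):
        self.nodes = {}  # node id -> properties
        self.edges = []  # (source, relationship type, target, properties)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, rel, dst, **props):
        self.edges.append((src, rel, dst, props))

    def neighbours(self, node_id, rel):
        return [dst for src, r, dst, _ in self.edges if src == node_id and r == rel]

    def traverse(self, start, rel, max_depth):
        """Breadth-first walk along one relationship type, up to max_depth hops."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue
            for nxt in self.neighbours(node, rel):
                if nxt not in seen:
                    seen.add(nxt)
                    found.append((nxt, depth + 1))
                    queue.append((nxt, depth + 1))
        return found


g = Graph()
for person in ("anna", "bob", "carl", "dina"):
    g.add_node(person, kind="person")
g.add_node("book", title="NoSQL Distilled")
g.add_edge("anna", "FRIEND", "bob", since=2014)
g.add_edge("bob", "FRIEND", "carl", since=2015)
g.add_edge("carl", "FRIEND", "dina", since=2016)
g.add_edge("anna", "LIKES", "book")

within_two_hops = g.traverse("anna", "FRIEND", 2)
print(within_two_hops)
```
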
  51. Graph databases work best for data with complex relations

  52. Connected Data

  53. Routing things/money

  54. Location Services

  55. Recommendation engines

  56. Schema-less, really?

  57. Schema-free does not mean no schema/data-migration

  58. Schema is implicit in code

  59. Data must be migrated when the schema in code changes

  60. All data need not be migrated at the same time (lazy migration)
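
One way lazy migration can look in code: each document records the schema version it was written with, and a document is upgraded only when it is next read. The `schemaVersion` field, the v1-to-v2 change (splitting `name`) and the sample customers are assumptions made for this sketch.

```python
CURRENT_VERSION = 2


def upgrade_v1_to_v2(doc):
    """v1 stored a single 'name'; v2 splits it into firstName / lastName."""
    first, _, last = doc["name"].partition(" ")
    upgraded = {k: v for k, v in doc.items() if k != "name"}
    upgraded.update(firstName=first, lastName=last, schemaVersion=2)
    return upgraded


MIGRATIONS = {1: upgrade_v1_to_v2}  # from-version -> upgrade function


def load_customer(store, key):
    """Read a document, upgrading and writing it back only if it is stale."""
    doc = store[key]
    migrated = False
    while doc.get("schemaVersion", 1) < CURRENT_VERSION:
        doc = MIGRATIONS[doc.get("schemaVersion", 1)](doc)
        migrated = True
    if migrated:
        store[key] = doc
    return doc


# A dict stands in for the document store; unversioned documents count as v1.
store = {
    1: {"id": 1, "name": "Ada Lovelace"},
    2: {"id": 2, "firstName": "Alan", "lastName": "Turing", "schemaVersion": 2},
}
ada = load_customer(store, 1)
print(ada)
```

Documents that are never read stay in the old shape, so the application code has to keep understanding every version that may still be in the store.
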
  61. Polyglot Persistence

  62. Use different data storage technology for varying needs

  63. Can be across the enterprise or in single application

  64. Encapsulate data access through services

  65. e-commerce platform

    Service                         Store                Data
    Session/Cart storage service    Key-Value store      Shopping cart and session data
    Order persistence service       Document store       Completed Orders
    Inventory and Price service     RDBMS (Legacy DB)    Inventory and Item Price
    Nodes and Relations service     Graph store          Customer social graph
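
A rough sketch of "encapsulate data access through services" for three of the services on slide 65. Each class hides its store behind a few methods, so the checkout logic never touches a store directly. The dictionaries stand in for the real key-value store, RDBMS and document store, and all class, method and parameter names are hypothetical.

```python
class CartService:
    """Session/cart storage; a dict stands in for the key-value store."""

    def __init__(self):
        self._kv = {}

    def add_item(self, session_id, product_id, quantity):
        cart = self._kv.setdefault(session_id, {})
        cart[product_id] = cart.get(product_id, 0) + quantity

    def take_cart(self, session_id):
        """Return the cart and remove it from session storage."""
        return self._kv.pop(session_id, {})


class PriceService:
    """Inventory and price lookups; a dict stands in for the legacy RDBMS."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_of(self, product_id):
        return self._prices[product_id]


class OrderService:
    """Completed orders; a dict stands in for the document store."""

    def __init__(self):
        self._docs = {}

    def save(self, order):
        self._docs[order["id"]] = order

    def get(self, order_id):
        return self._docs[order_id]


def checkout(session_id, order_id, carts, prices, orders):
    """Turn a session cart into a completed order using only service calls."""
    cart = carts.take_cart(session_id)
    items = [
        {"productId": pid, "quantity": qty, "price": prices.price_of(pid)}
        for pid, qty in cart.items()
    ]
    order = {"id": order_id, "orderItems": items}
    orders.save(order)
    return order


carts, prices, orders = CartService(), PriceService({27: 32.45}), OrderService()
carts.add_item("session-1", 27, 2)
order = checkout("session-1", 99, carts, prices, orders)
print(order)
```

Swapping one store for another then only changes the inside of one service class, not the callers.
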
  66. e-commerce platform
    • RDBMS: Shopping cart data, Completed Orders, Session data
    • SOLR: Search requests, Update Indexed Data
    • Update indexed data, batch or realtime
  67. Did I not say something about free lunch?

  68. Given all the choices, the decision to choose the right database is yours.
  69. Choose for programmer productivity

  70. Choose for data access performance

  71. Choose to stick with the default

  72. Choose by testing your expectations

  73. Try the databases; they are all open source

  74. Thanks #NoSQLDistilled @pramodsadalage sadalage.com