Slide 1

Slide 1 text

NoSQL Distilled: It's not a free lunch #NoSQLDistilled @pramodsadalage

Slide 2

Slide 2 text

Why RDBMS?

Slide 3

Slide 3 text

ACID Transactions
•Atomicity
•Consistency
•Isolation
•Durability
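As a small illustration of atomicity, the sketch below uses Python's built-in sqlite3 module. The account table, its CHECK constraint, and the transfer function are assumptions made up for this example, not something from the talk.

```python
import sqlite3

# Hypothetical schema: an in-memory account table, used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account (id, balance) VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # both updates are applied
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK, credit is undone
balances = dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

The second transfer credits account 2 before the debit fails, yet the rollback leaves both balances as they were after the first transfer.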

Slide 4

Slide 4 text

Standard Query Interface (SQL)
    SELECT name, age, startdate FROM student
    SELECT startdate, count(*) FROM student GROUP BY startdate
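The two queries on this slide can be tried against an in-memory SQLite database. The column types and the three sample rows are assumptions for illustration, and ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column types; the slide only names the columns.
conn.execute("CREATE TABLE student (name TEXT, age INTEGER, startdate TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        ("Ann", 20, "2012-09-01"),  # hypothetical sample rows
        ("Bob", 22, "2012-09-01"),
        ("Cy", 21, "2013-01-15"),
    ],
)

# First query from the slide.
students = conn.execute(
    "SELECT name, age, startdate FROM student ORDER BY name"
).fetchall()

# Second query from the slide: number of students per start date.
per_start = conn.execute(
    "SELECT startdate, count(*) FROM student "
    "GROUP BY startdate ORDER BY startdate"
).fetchall()
```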

Slide 5

Slide 5 text

Interact with many languages

Ruby:
    def exec_sql_return_rows(sql)
      rows = Array.new
      $db_connection.exec sql do |row|
        rows << row[0]
      end
      rows
    end

Java:
    try {
        statement = connection.createStatement();
        resultSet = statement.executeQuery(sql);
        process(resultSet);
    } catch (SQLException e) {
        // deal with exception
    }

Slide 6

Slide 6 text

Everyone knows SQL

Slide 7

Slide 7 text

Limitless indexing

Slide 8

Slide 8 text

Handles many data models

Slide 9

Slide 9 text

Why NoSQL?

Slide 10

Slide 10 text

Schema changes are hard

Slide 11

Slide 11 text

Impedance mismatch
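One way to picture the impedance mismatch: a single in-memory object graph has to be taken apart into rows for several tables before a relational database can store it. The classes and table names below are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class OrderItem:
    product_id: int
    price: float


@dataclass
class Order:
    id: int
    customer_id: int
    items: list[OrderItem] = field(default_factory=list)


def to_rows(order):
    """Flatten one in-memory order into rows for two hypothetical tables."""
    order_row = (order.id, order.customer_id)
    item_rows = [(order.id, item.product_id, item.price) for item in order.items]
    return {"orders": [order_row], "order_items": item_rows}


# One object in the application becomes three rows across two tables.
rows = to_rows(Order(99, 1, [OrderItem(27, 32.45), OrderItem(28, 10.0)]))
```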

Slide 12

Slide 12 text

Application vs Integration databases
•Integration Database: common for all use cases (E-Commerce, Recommendation)
•Application Database: specialized storage for each use case (E-Commerce, Recommendation), with Service Integration between them

Slide 13

Slide 13 text

Running on clusters

Slide 14

Slide 14 text

Unstructured data

Slide 15

Slide 15 text

Uneven rate of data growth

Slide 16

Slide 16 text

Domain Models

Slide 17

Slide 17 text

Domain driven data models

Slide 18

Slide 18 text

RDBMS data

Slide 19

Slide 19 text

Aggregate model (Embedding objects)

Slide 20

Slide 20 text

Representing Aggregate Data (diagram label: Key)

// in customers
{
  "id": 1,
  "name": "Martin",
  "billingAddress": [{"city": "Chicago"}],
  "orders": [
    {
      "id": 99,
      "orderItems": [
        {
          "productId": 27,
          "price": 32.45,
          "productName": "NoSQL Distilled"
        }
      ],
      "shippingAddress": [{"city": "Chicago"}],
      "orderPayment": [
        {
          "ccinfo": "1000-1000-1000-1000",
          "txnId": "abelif879rft",
          "billingAddress": {"city": "Chicago"}
        }
      ]
    }
  ]
}
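With orders embedded in the customer, one lookup by customer key returns everything needed to work with an order. The sketch below assumes a plain dict standing in for the customers collection and uses a trimmed copy of the slide's document; order_total is a hypothetical helper.

```python
import json

# Trimmed copy of the slide's customer aggregate (payment details omitted).
customer_json = """
{
  "id": 1,
  "name": "Martin",
  "billingAddress": [{"city": "Chicago"}],
  "orders": [
    {
      "id": 99,
      "orderItems": [
        {"productId": 27, "price": 32.45, "productName": "NoSQL Distilled"}
      ],
      "shippingAddress": [{"city": "Chicago"}]
    }
  ]
}
"""

# A dict keyed by customer id stands in for the customers collection.
customers = {1: json.loads(customer_json)}


def order_total(customer_id, order_id):
    """Read one aggregate by key, then work entirely inside it."""
    customer = customers[customer_id]
    order = next(o for o in customer["orders"] if o["id"] == order_id)
    return sum(item["price"] for item in order["orderItems"])


total = order_total(1, 99)
```

The trade-off is that finding an order without knowing its customer means scanning every customer aggregate.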

Slide 21

Slide 21 text

Aggregate model (Referencing Objects)

Slide 22

Slide 22 text

Representing Aggregate data (diagram labels: Key, Key, Reference, Key)

// in Customers
{
  "id": 1,
  "name": "Martin",
  "billingAddress": [{"city": "Chicago"}]
}

// in Orders
{
  "id": 99,
  "customerId": 1,
  "orderItems": [
    {
      "productId": 27,
      "price": 32.45,
      "productName": "NoSQL Distilled"
    }
  ],
  "shippingAddress": [{"city": "Chicago"}],
  "orderPayment": [
    {
      "ccinfo": "1000-1000-1000-1000",
      "txnId": "abelif879rft",
      "billingAddress": {"city": "Chicago"}
    }
  ]
}
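When orders reference the customer instead of living inside it, the application follows the customerId itself. A minimal sketch, assuming two dicts stand in for the Customers and Orders collections; both helper functions are hypothetical.

```python
# Two separate collections, keyed by their own ids (trimmed from the slide).
customers = {
    1: {"id": 1, "name": "Martin", "billingAddress": [{"city": "Chicago"}]},
}
orders = {
    99: {
        "id": 99,
        "customerId": 1,  # reference to the customer aggregate
        "orderItems": [
            {"productId": 27, "price": 32.45, "productName": "NoSQL Distilled"}
        ],
    },
}


def customer_for_order(order_id):
    """Follow the reference: two key lookups instead of one."""
    order = orders[order_id]
    return customers[order["customerId"]]


def orders_for_customer(customer_id):
    """The reverse direction needs a scan (or an index) over orders."""
    return [o for o in orders.values() if o["customerId"] == customer_id]


owner = customer_for_order(99)
martins_orders = orders_for_customer(1)
```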

Slide 23

Slide 23 text

Aggregate Orientation

Slide 24

Slide 24 text

RDBMSs have no concept of aggregates

Slide 25

Slide 25 text

Aggregates reduce the need for ACID

Slide 26

Slide 26 text

Better for clusters, can be distributed easily
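Because an aggregate is read and written as a unit under one key, that key is enough to decide which node holds it. The sketch below uses simple hash-modulo placement; the node names are hypothetical, and real stores typically use more elaborate schemes so that adding a node moves less data.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster members


def node_for(key):
    """Pick the node for an aggregate from its key alone."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # A stable hash (not Python's per-process hash()) keeps placement
    # the same across restarts and across machines.
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


placement = {key: node_for(key) for key in ("customer:1", "customer:2", "order:99")}
```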

Slide 27

Slide 27 text

Key-Value, Document, Column-Family

Slide 28

Slide 28 text

Key Value Databases

Slide 29

Slide 29 text

Key-Value Database
•One Key-One Value
•Value is opaque to database
•Like a Hash
•Some are distributed

Oracle      Riak
instance    cluster
table       bucket
row         key-value
row-id      key
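The "like a hash" and "opaque value" points can be sketched with a tiny in-memory store. KeyValueStore and its method names are made up for this example and do not correspond to any product's API.

```python
import json


class KeyValueStore:
    """In-memory stand-in: the store sees only keys and opaque bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key, value: bytes):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()

# The application, not the store, decides that the value is JSON.
store.put("session:42", json.dumps({"user": "martin", "cart": [27]}).encode("utf-8"))
session = json.loads(store.get("session:42"))

store.delete("session:42")
gone = store.get("session:42")
```

Since the store cannot look inside the value, the only access path is the key.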

Slide 30

Slide 30 text

“key” (VIN)    “value” (car facts)
JTTDR…         … make#Ford model#Mustang year#2011 ….

Slide 31

Slide 31 text

Session Storage

Slide 32

Slide 32 text

User Profiles/Preferences

Slide 33

Slide 33 text

Shopping Cart

Slide 34

Slide 34 text

Single user analytics

Slide 35

Slide 35 text

Document Databases

Slide 36

Slide 36 text

Document Database
•One Key-One Value
•Value is visible to database
•Value can be queried
•JSON/XML documents

Oracle      MongoDB
instance    mongod
schema      database
table       collection
row         document
row_id      _id
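Because the database can see inside the value, documents can be selected by their fields and not only by key. The sketch below imitates that over a Python list; the find function and the VIN placeholders are hypothetical and are not a real driver API.

```python
# A list of dicts stands in for a collection; _id values are placeholders.
cars = [
    {"_id": "VIN-1", "make": "Ford", "model": "Mustang", "year": 2011},
    {"_id": "VIN-2", "make": "Ford", "model": "Focus", "year": 2009},
    {"_id": "VIN-3", "make": "Honda", "model": "Civic", "year": 2011},
]


def find(collection, **criteria):
    """Return documents whose fields equal every given criterion."""
    return [
        doc
        for doc in collection
        if all(doc.get(name) == value for name, value in criteria.items())
    ]


fords_2011 = find(cars, make="Ford", year=2011)
from_2011 = find(cars, year=2011)
```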

Slide 37

Slide 37 text

“id” (VIN)    “document” (car facts)
JTTDR…        {… “make”: “Ford”, “model”: “Mustang”, “year”: 2011, … }

Slide 38

Slide 38 text

Event Logging

Slide 39

Slide 39 text

Prototype development

Slide 40

Slide 40 text

eCommerce Application

Slide 41

Slide 41 text

Content Management Applications

Slide 42

Slide 42 text

Column-Family Databases

Slide 43

Slide 43 text

Column-Family Database
•Data organized as columns
•Each row has a row key
•Columns have versioned data
•Row data is sorted by column name

Oracle                        Cassandra
instance                      cluster
database                      keyspace
table                         column-family
row                           row
columns same for every row    columns can be different for each row
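The row key, column family, and column layout can be pictured as nested maps. The sketch below is an in-memory toy with invented names; it leaves out versioning and only shows per-row columns returned in column-name order.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy model: row key -> column family -> column name -> value."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        self._rows[row_key][family][column] = value

    def get_row(self, row_key, family):
        """Return one row's columns in a family, sorted by column name."""
        columns = self._rows.get(row_key, {}).get(family, {})
        return sorted(columns.items())


store = ColumnFamilyStore()
store.put("VIN-1", "car", "model", "Focus")
store.put("VIN-1", "car", "make", "Ford")
store.put("VIN-1", "service", "2012-03", "oil change")
store.put("VIN-2", "car", "make", "Honda")  # this row has fewer columns

car_1 = store.get_row("VIN-1", "car")
car_2 = store.get_row("VIN-2", "car")
```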

Slide 44

Slide 44 text

“id” (VIN)    “column families” (car facts)
JTTDR…        {… “car”:{“make”: “Ford”, “model”: “focus”…} “service”:{…} }

Slide 45

Slide 45 text

Large write volume

Slide 46

Slide 46 text

Content Management

Slide 47

Slide 47 text

eCommerce Application

Slide 48

Slide 48 text

Graph Databases

Slide 50

Slide 50 text

Graph Databases
•Is a multi-relational graph
•Relationships are first-class citizens
•Traversal algorithms
•Nodes and Edges can have data (key-value pairs)
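A rough picture of typed relationships with properties and a traversal over them, using plain Python structures. The people, relationship types, and function names are invented for this sketch, and edges are treated as directed.

```python
from collections import deque

# (source, relationship type, target, edge properties); all data is made up.
edges = [
    ("Anna", "FRIEND", "Barbara", {"since": 2010}),
    ("Barbara", "FRIEND", "Carol", {"since": 2012}),
    ("Anna", "LIKES", "NoSQL Distilled", {}),
    ("Carol", "LIKES", "Refactoring", {}),
]


def neighbours(node, rel):
    """Nodes reached from node over one edge of the given type."""
    return [dst for src, r, dst, _ in edges if src == node and r == rel]


def reachable(start, rel, max_depth):
    """Breadth-first traversal along one relationship type."""
    seen, queue, found = {start}, deque([(start, 0)]), []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in neighbours(node, rel):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, depth + 1))
    return found


friends = reachable("Anna", "FRIEND", 1)
friends_of_friends = reachable("Anna", "FRIEND", 2)
```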

Slide 51

Slide 51 text

Graph databases work best for data with complex relations

Slide 52

Slide 52 text

Connected Data

Slide 53

Slide 53 text

Routing things/money

Slide 54

Slide 54 text

Location Services

Slide 55

Slide 55 text

Recommendation engines

Slide 56

Slide 56 text

Schema-less, really?

Slide 57

Slide 57 text

Schema-free does not mean no schema/data-migration

Slide 58

Slide 58 text

Schema is implicit in code

Slide 59

Slide 59 text

Data must be migrated when the schema in code changes

Slide 60

Slide 60 text

All data need not be migrated at the same time (lazy migration)
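Lazy migration can be sketched as "upgrade on read": old documents stay untouched until the application next loads them. The schemaVersion field, the name split, and the sample records below are all assumptions for illustration.

```python
CURRENT_VERSION = 2

# A dict stands in for the database; records are hypothetical.
store = {
    "cust:1": {"schemaVersion": 1, "name": "Martin Doe"},  # old shape
    "cust:2": {"schemaVersion": 2, "firstName": "Jane", "lastName": "Roe"},
}


def upgrade(doc):
    """Bring a version 1 document to version 2 (name split in two fields)."""
    if doc.get("schemaVersion", 1) < 2:
        first, _, last = doc.pop("name").partition(" ")
        doc.update(firstName=first, lastName=last, schemaVersion=2)
    return doc


def load(key):
    """Read a document, migrating and writing it back only if it is stale."""
    doc = store[key]
    if doc.get("schemaVersion", 1) < CURRENT_VERSION:
        store[key] = upgrade(dict(doc))
    return store[key]


migrated = load("cust:1")
untouched = load("cust:2")
```

The cost is that the code must keep understanding every old shape until the last stale document has been read or swept up by a background job.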

Slide 61

Slide 61 text

Polyglot Persistence

Slide 62

Slide 62 text

Use different data storage technology for varying needs

Slide 63

Slide 63 text

Can be across the enterprise or in a single application

Slide 64

Slide 64 text

Encapsulate data access through services

Slide 65

Slide 65 text

e-commerce platform
•Session/Cart storage service (Key-Value store): Shopping cart and session data
•Order persistence service (Document store): Completed Orders
•Inventory and Price service (RDBMS, Legacy DB): Inventory and Item Price
•Nodes and Relations service (Graph store): Customer social graph
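Putting each store behind a service interface lets the platform code stay unaware of which database sits underneath. In the sketch below, CartService, InMemoryCartService, and checkout are hypothetical names, and the in-memory class merely stands in for a key-value-backed implementation.

```python
from typing import Protocol


class CartService(Protocol):
    """What the platform needs from cart storage, whatever backs it."""

    def save_cart(self, session_id: str, items: list[int]) -> None: ...

    def load_cart(self, session_id: str) -> list[int]: ...


class InMemoryCartService:
    """Stand-in for a service backed by a key-value store."""

    def __init__(self):
        self._carts = {}

    def save_cart(self, session_id, items):
        self._carts[session_id] = list(items)

    def load_cart(self, session_id):
        return list(self._carts.get(session_id, []))


def checkout(carts: CartService, session_id: str) -> int:
    """Platform code depends on the interface, not on the store."""
    return len(carts.load_cart(session_id))


carts = InMemoryCartService()
carts.save_cart("session:42", [27, 31])
item_count = checkout(carts, "session:42")
```

Swapping the backing store then means providing another class with the same two methods.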

Slide 66

Slide 66 text

e-commerce platform
•RDBMS: Shopping cart data, Completed Orders, Session data
•SOLR: Search requests
•Update indexed data, batch or realtime

Slide 67

Slide 67 text

Did I not say something about a free lunch?

Slide 68

Slide 68 text

Given all the choices, the decision to pick the right database is yours.

Slide 69

Slide 69 text

Choose for programmer productivity

Slide 70

Slide 70 text

Choose for data access performance

Slide 71

Slide 71 text

Choose to stick with the default

Slide 72

Slide 72 text

Choose by testing your expectations

Slide 73

Slide 73 text

Try the databases; they are all open source

Slide 74

Slide 74 text

Thanks #NoSQLDistilled @pramodsadalage sadalage.com