Slide 1

Slide 1 text

MongoDB 3.2 FOR GIANT IDEAS Philipp Krenn @xeraa

Slide 2

Slide 2 text

ViennaDB Papers We Love Vienna

Slide 3

Slide 3 text

Electronic Data Interchange (EDI)

Slide 4

Slide 4 text

Changelog https://docs.mongodb.org/manual/release-notes/3.2/ MongoDB 3.0: 2015/03 MongoDB 3.2: 2015/11

Slide 5

Slide 5 text

+0.2 == Major release ! changes allowed . development . production

Slide 6

Slide 6 text

Fancy Document Validation Lookup aka "JOIN" Business Intelligence Storage Engines

Slide 7

Slide 7 text

Internal Config Servers as Replica Sets readConcern Partial Indexes

Slide 8

Slide 8 text

Document Validation

Slide 9

Slide 9 text

Schemaless for the win

Slide 10

Slide 10 text

Schema in your app No duplication

Slide 11

Slide 11 text

Schemaless is a lie

Slide 12

Slide 12 text

Indexes

Slide 13

Slide 13 text

Multiple apps Schema duplication

Slide 14

Slide 14 text

Flexibility + Safety = Document Validation

Slide 15

Slide 15 text

Demo
$ mkdir test
$ mongod --dbpath test/ --port 27001 --logpath test.log
$ mongo --port 27001
> db.version()
> db.contact.insert({ phone: "1234", email: "[email protected]", status: "ok" })

Slide 16

Slide 16 text

Validation
> db.createCollection("contact", {
    validator: { $and: [
      { phone: { $type: "string" } },
      { email: { $regex: /@test\.com$/ } },
      { status: { $in: [ "ok", "incomplete" ] } }
    ] }
  })

Slide 17

Slide 17 text

Change to contacts and insert validation rule
> db.getCollectionInfos()
> db.contacts.insert({ phone: "1234", email: "[email protected]", status: "ok" })
> db.contacts.insert({ phone: "1234" })
> db.contacts.find()

Slide 18

Slide 18 text

Update
> db.runCommand({
    collMod: "contacts",
    validator: { $or: [
      { phone: { $type: "string" } },
      { email: { $regex: /@test\.com$/ } },
      { status: { $in: [ "ok", "incomplete" ] } }
    ] },
    validationLevel: "strict",
    validationAction: "error"
  })

Slide 19

Slide 19 text

Test
> db.getCollectionInfos()
> db.contacts.insert({ status: "ok" })
> db.contacts.insert({ status: "foobar" })
> db.runCommand({
    insert: "contacts",
    documents: [ { status: "foobar" } ],
    bypassDocumentValidation: true
  })
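The difference between the $and rule and the $or rule in the demo can be sketched without a server. The matcher below is a hypothetical, heavily simplified stand-in for MongoDB's query engine (the names `matches` and `matchesField` are made up for this sketch) and only models the operators that appear on these slides.

```javascript
// Minimal, hypothetical matcher. NOT MongoDB's query engine: it only models
// $and, $or, $type: "string", $regex, $in, $exists and plain equality.
function matchesField(value, condition) {
  if (condition instanceof RegExp) {
    return typeof value === "string" && condition.test(value);
  }
  if (condition === null || typeof condition !== "object") {
    return value === condition; // plain equality, e.g. { status: "ok" }
  }
  return Object.entries(condition).every(([op, arg]) => {
    switch (op) {
      case "$type": return arg === "string" && typeof value === "string";
      case "$regex": return typeof value === "string" && arg.test(value);
      case "$in": return arg.includes(value);
      case "$exists": return (value !== undefined) === arg;
      default: throw new Error("operator not modelled: " + op);
    }
  });
}

function matches(doc, query) {
  return Object.entries(query).every(([key, condition]) => {
    if (key === "$and") return condition.every((q) => matches(doc, q));
    if (key === "$or") return condition.some((q) => matches(doc, q));
    return matchesField(doc[key], condition);
  });
}

const rules = [
  { phone: { $type: "string" } },
  { email: { $regex: /@test\.com$/ } },
  { status: { $in: ["ok", "incomplete"] } },
];

// With $and every rule must hold, with $or a single one is enough.
console.log(matches({ status: "ok" }, { $and: rules }));     // false
console.log(matches({ status: "ok" }, { $or: rules }));      // true
console.log(matches({ status: "foobar" }, { $or: rules }));  // false
```

This mirrors what the demo shows: after the collMod to $or, a document with only a valid status is accepted, while an unknown status is still rejected unless validation is bypassed.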

Slide 20

Slide 20 text

Nested rules
> db.runCommand({
    collMod: "contacts",
    validator: { $or: [
      { $and: [
        { phone: { $type: "string" } },
        { email: { $regex: /@test\.com$/ } },
        { status: "ok" }
      ] },
      { status: "incomplete" }
    ] }
  })
> db.contacts.insert({ name: "philipp", status: "incomplete" })

Slide 21

Slide 21 text

Versioning
version is user-defined, type 2 is string
> db.runCommand({
    collMod: "contacts",
    validator: { $or: [
      { version: { "$exists": false } },
      { version: 1, $and: [ { name: { "$exists": true } } ] },
      { version: 2, $and: [ { name: { "$exists": true, "$type": 2 } } ] }
    ] }
  })
> db.contacts.insert({ test: 1 })
> db.contacts.insert({ name: 1, version: 1 })
> db.contacts.insert({ name: "philipp", version: 2 })

Slide 22

Slide 22 text

Flags:
validationLevel: strict | moderate | off
With moderate, inserts are validated, but updates to existing documents that do not already pass validation are not checked
validationAction: error | warn
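How the two flags combine can be written down as a small decision function. This is a hypothetical model for illustration only: the function name `checkWrite` and its parameters are assumptions of this sketch, not part of any MongoDB API.

```javascript
// Hypothetical model of validationLevel and validationAction.
// op is "insert" or "update"; oldDocValid only matters for updates.
function checkWrite({ level, action, op, newDocValid, oldDocValid }) {
  // Step 1: does the validator apply to this write at all?
  const applies =
    level === "strict" ||
    (level === "moderate" && (op === "insert" || oldDocValid === true));

  // Step 2: if it does not apply, or the new document passes, accept.
  if (!applies || newDocValid) return "accepted";

  // Step 3: the validator applied and failed; the action decides the outcome.
  return action === "warn" ? "accepted with warning" : "rejected";
}

// An update to a document that was already invalid slips through in moderate mode.
console.log(checkWrite({
  level: "moderate", action: "error", op: "update",
  newDocValid: false, oldDocValid: false,
})); // accepted

// The same update is rejected in strict mode.
console.log(checkWrite({
  level: "strict", action: "error", op: "update",
  newDocValid: false, oldDocValid: false,
})); // rejected
```

The moderate level is mostly useful when a validator is added to a collection that already contains documents which would fail it.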

Slide 23

Slide 23 text

Limitations
Only on insert and update
No help why a validation failed
Not allowed in validators: $geoNear, $near, $nearSphere, $text, $where

Slide 24

Slide 24 text

Lookup aka "JOIN"

Slide 25

Slide 25 text

We don't need JOINs, we have a rich document structure

Slide 26

Slide 26 text

Ok, we'll have $ref in all drivers

Slide 27

Slide 27 text

Ok, we'll have an aggregation framework

Slide 28

Slide 28 text

Ok, we'll have $lookup in the commercial version

Slide 29

Slide 29 text

Ok, $lookup everywhere Left Outer Equi-Join

Slide 30

Slide 30 text

Some data
> db.purchases.insert({ "buyer" : "bill", "item" : "macbook" })
> db.purchases.insert({ "buyer" : "fred", "item" : "macbook" })
> db.purchases.insert({ "buyer" : "john", "item" : "macbook" })
> db.purchases.insert({ "buyer" : "john", "item" : "thinkpad" })

Slide 31

Slide 31 text

Aggregate
> db.purchases.aggregate([
    { $group: { _id: "$item", total: { $sum: 1 } } }
  ])

Slide 32

Slide 32 text

More data
> db.products.insert({ name: "MacBook Pro", price: 2000, code: "macbook" })
> db.products.insert({ name: "Thinkpad", price: 1800, code: "thinkpad" })

Slide 33

Slide 33 text

Lookup
> db.purchases.aggregate([
    { $group: { _id: "$item", total: { $sum: 1 } } },
    { $lookup: { from: "products", localField: "_id", foreignField: "code", as: "item_details" } }
  ])

Slide 34

Slide 34 text

Lookup
> db.products.aggregate([
    { $lookup: { from: "purchases", localField: "code", foreignField: "item", as: "buyers" } }
  ])

Slide 35

Slide 35 text

Flattened lookup
> db.products.aggregate([
    { $lookup: { from: "purchases", localField: "code", foreignField: "item", as: "buyers" } },
    { $unwind: "$buyers" }
  ])

Slide 36

Slide 36 text

Flattened and cleaned lookup
> db.products.aggregate([
    { $lookup: { from: "purchases", localField: "code", foreignField: "item", as: "buyers" } },
    { $unwind: "$buyers" },
    { $project: { "_id": 0, "name": 1, "code": 1, "price": 1, "buyer": "$buyers.buyer" } }
  ])
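What the three stages compute can be reproduced on plain arrays. The sketch below reuses the sample data from the slides; the helper functions `lookup` and `unwind` are hypothetical re-implementations for illustration, not how the server executes the pipeline.

```javascript
const purchases = [
  { buyer: "bill", item: "macbook" },
  { buyer: "fred", item: "macbook" },
  { buyer: "john", item: "macbook" },
  { buyer: "john", item: "thinkpad" },
];
const products = [
  { name: "MacBook Pro", price: 2000, code: "macbook" },
  { name: "Thinkpad", price: 1800, code: "thinkpad" },
];

// Left outer equi-join: every local document is kept and receives an array
// of the foreign documents whose field equals the local field.
function lookup(local, foreign, localField, foreignField, as) {
  return local.map((doc) => ({
    ...doc,
    [as]: foreign.filter((f) => f[foreignField] === doc[localField]),
  }));
}

// One output document per array element; an empty array yields no output.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((element) => ({ ...doc, [field]: element }))
  );
}

const joined = lookup(products, purchases, "code", "item", "buyers");
const flattened = unwind(joined, "buyers");

// Equivalent of the $project stage: keep a few fields, lift buyers.buyer.
const cleaned = flattened.map((d) => ({
  name: d.name,
  code: d.code,
  price: d.price,
  buyer: d.buyers.buyer,
}));

console.log(cleaned);
```

One consequence worth noting: a product without any purchase survives the lookup with an empty `buyers` array, but disappears after a plain unwind, so the flattened result no longer behaves like a left outer join.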

Slide 37

Slide 37 text

More aggregation functions
$sample
> db.mycoll.aggregate({ $sample: { size: 100 } })
<5% data: Random cursor or index selection
>=5% data: Scan entire collection, sort randomly in memory, fetch documents from the top
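The second strategy (scan everything, sort randomly, take the top) is easy to picture in a few lines. This is only a sketch of that idea on an in-memory array; the function name `sample` and the injectable `rng` parameter are assumptions made for this example.

```javascript
// Sketch of the "scan, sort randomly, take the top" strategy only.
// rng is injectable so the behaviour can be made deterministic in tests.
function sample(docs, size, rng = Math.random) {
  return docs
    .map((doc) => ({ doc, key: rng() }))   // attach a random sort key
    .sort((a, b) => a.key - b.key)         // sort by that key
    .slice(0, size)                        // fetch documents from the top
    .map((entry) => entry.doc);
}

const collection = Array.from({ length: 20 }, (_, i) => ({ _id: i }));
console.log(sample(collection, 5));
```

Sorting the whole input costs memory and time proportional to the collection size, which is presumably why a cheaper random-cursor path exists for small samples.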

Slide 38

Slide 38 text

More aggregation functions $stdDevPop, $stdDevSamp, $sqrt, $abs, $log, $log10, $ln, $pow, $exp, $trunc, $ceil, $floor,...
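The two standard deviation operators differ only in the divisor: the population variant divides by n, the sample variant by n - 1. A minimal sketch of that arithmetic, using a hypothetical helper `stdDev` and made-up numbers:

```javascript
// Population standard deviation divides by n, sample standard deviation by n - 1.
function stdDev(values, isSample) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const squaredDiffs = values.reduce((sum, v) => sum + (v - mean) ** 2, 0);
  return Math.sqrt(squaredDiffs / (isSample ? n - 1 : n));
}

const measurements = [2, 4, 4, 4, 5, 5, 7, 9]; // mean 5, squared differences sum to 32

console.log(stdDev(measurements, false)); // population: sqrt(32 / 8) = 2
console.log(stdDev(measurements, true));  // sample: sqrt(32 / 7), roughly 2.14
```

Use the population form when the documents are the complete data set, and the sample form when they are a subset used to estimate the whole.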

Slide 39

Slide 39 text

Business Intelligence

Slide 40

Slide 40 text

MongoDB Connector for BI

Slide 41

Slide 41 text

Had MongoDB really done the impossible? Had they developed a connector which satisfies all the requirements of NoSQL analytics,...

Slide 42

Slide 42 text

but exposes relational semantics on flat, uniform data, so legacy BI software can handle it? — https://www.linkedin.com/pulse/mongodb-32-now-powered-postgresql-john-de-goes

Slide 43

Slide 43 text

MongoDB had gone from nothing to magic in just a few months [...] — https://www.linkedin.com/pulse/mongodb-32-now-powered-postgresql-john-de-goes

Slide 44

Slide 44 text

It uses a foreign data wrapper with PostgreSQL to provide a relational SQL view into your MongoDB data. — https://docs.mongodb.org/manual/products/bi-connector/

Slide 45

Slide 45 text

"Shit. This is bad news for MongoDB. Really bad." — https://www.linkedin.com/pulse/mongodb-32- now-powered-postgresql-john-de-goes

Slide 46

Slide 46 text

Storage Engines

Slide 47

Slide 47 text

MongoDB is webscale

Slide 48

Slide 48 text

Pluggable storage engine in 3.0 MMAPv1 and WiredTiger

Slide 49

Slide 49 text

Structure B-Tree LSM-Tree Fractal-Tree

Slide 50

Slide 50 text

MMAPv1 Default before 3.2 OS provides memory mapped files B-Tree

Slide 51

Slide 51 text

MMAPv1 Collection-level locking Documents a single block

Slide 52

Slide 52 text

WiredTiger Default in 3.2 Currently B-Tree Maybe LSM-Tree in the future

Slide 53

Slide 53 text

WiredTiger Compression Document-level locking Index prefix compression

Slide 54

Slide 54 text

WiredTiger Experience
Much faster inserts
Watch out for storage.wiredTiger.engineConfig.cacheSizeGB
Default: max(1GB, 60% of RAM - 1GB)
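The default from the slide is simple to evaluate for a given machine. A small worked example, assuming the formula exactly as quoted on the slide for 3.2 (the helper name `defaultCacheSizeGB` is made up):

```javascript
// Default WiredTiger cache size as quoted on the slide for MongoDB 3.2:
// the larger of 1 GB and (60% of RAM minus 1 GB).
function defaultCacheSizeGB(ramGB) {
  return Math.max(1, 0.6 * ramGB - 1);
}

for (const ramGB of [2, 4, 16, 64]) {
  console.log(`${ramGB} GB RAM -> ${defaultCacheSizeGB(ramGB).toFixed(1)} GB cache`);
}
```

On a 16 GB host this comes out at about 8.6 GB, which matters when several mongod processes or other services share the same machine.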

Slide 55

Slide 55 text

Enterprise Feature Encrypted Storage Engine

Slide 56

Slide 56 text

Enterprise Feature In-Memory Storage Engine Currently in Beta

Slide 57

Slide 57 text

Percona RocksDB and PerconaFT

Slide 58

Slide 58 text

Config Servers as Replica Sets

Slide 59

Slide 59 text

Before 3.2
Exactly 3 independent config servers
Read and write with 3 available nodes
Read-only with fewer

Slide 60

Slide 60 text

With 3.2 Can be a replica set With up to 50 members

Slide 61

Slide 61 text

readConcern

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

@aphyr Call Me Maybe

Slide 64

Slide 64 text

writeConcern { w: <value>, j: <boolean>, wtimeout: <number> }

Slide 65

Slide 65 text

https://aphyr.com/posts/322-jepsen-mongodb-stale-reads

Slide 66

Slide 66 text

Dirty read
Local state on minority primary modified before confirming with majority of nodes

Slide 67

Slide 67 text

Stale read Stale data on minority primary

Slide 68

Slide 68 text

readConcern { level: <"majority"|"local"> }
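The difference between the two levels can be pictured with a toy model of a replica set. Everything here is a conceptual assumption for illustration: the functions `majorityCommitPoint` and `visibleUpTo` and the operation numbers are invented, and the real server tracks this through its replication protocol.

```javascript
// lastApplied: the highest operation number each member has applied (made-up values).
// The majority commit point is the newest operation that a majority of members have.
function majorityCommitPoint(lastApplied) {
  const newestFirst = [...lastApplied].sort((a, b) => b - a);
  const majority = Math.floor(newestFirst.length / 2) + 1;
  return newestFirst[majority - 1];
}

// "local" returns whatever the queried node has, "majority" only what is
// known to be on a majority of members.
function visibleUpTo(level, nodeApplied, allApplied) {
  return level === "majority"
    ? Math.min(nodeApplied, majorityCommitPoint(allApplied))
    : nodeApplied;
}

// A primary has applied operation 5, both secondaries are still at 3.
const members = [5, 3, 3];
console.log(visibleUpTo("local", 5, members));    // 5, may still be rolled back
console.log(visibleUpTo("majority", 5, members)); // 3, acknowledged by a majority
```

In this model a majority read avoids returning data that could later be rolled back; it does not by itself guarantee that the answer is the newest data in the system.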

Slide 69

Slide 69 text

Partial Indexes

Slide 70

Slide 70 text

> db.users.insert({ username: "philipp", active: 1 })
> db.users.createIndex(
    { username: 1 },
    { partialFilterExpression: { active: { $eq: 1 } } }
  )
> db.users.find({ username: "philipp", active: { $eq: 1} })

Slide 71

Slide 71 text

> db.users.find({ username: "philipp", active: { $eq: 1} }).explain()
{ ...
  "winningPlan": {
    "stage": "FETCH",
    "filter": { "active": { "$eq": 1 } },
    "inputStage": {
      "stage": "IXSCAN",
      "keyPattern": { "username": 1 },
      ...
> db.users.find({ username: "philipp" }).explain()
{ ...
  "winningPlan": {
    "stage": "COLLSCAN",
    "filter": { "username": { "$eq": "philipp" } },
    ...
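Why the second query falls back to COLLSCAN can be sketched as a containment check: the index only holds documents matching the filter, so it is safe only when the query is guaranteed to ask for such documents. The check below is a deliberately simplified assumption that handles equality only (`canUsePartialIndex` and `equalityValue` are hypothetical names); the real query planner is more general.

```javascript
// Extract the value a condition pins a field to, for plain values and { $eq: v }.
function equalityValue(condition) {
  if (condition !== null && typeof condition === "object") {
    return "$eq" in condition ? condition.$eq : undefined;
  }
  return condition;
}

// Simplified rule: the partial index is usable only if the query pins every
// field of the partialFilterExpression to the same value.
function canUsePartialIndex(query, partialFilterExpression) {
  return Object.entries(partialFilterExpression).every(([field, cond]) => {
    const wanted = equalityValue(cond);
    return (
      wanted !== undefined &&
      field in query &&
      equalityValue(query[field]) === wanted
    );
  });
}

const filter = { active: { $eq: 1 } };

console.log(canUsePartialIndex({ username: "philipp", active: { $eq: 1 } }, filter)); // true
console.log(canUsePartialIndex({ username: "philipp" }, filter));                     // false
```

The second query might need inactive users as well, and those are not in the index, so using it could silently drop results.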

Slide 72

Slide 72 text

Conclusion

Slide 73

Slide 73 text

Some fancy additions Lots of internal cleanup

Slide 74

Slide 74 text

Personal Highlight Backup role can't read system.profile https://jira.mongodb.org/browse/SERVER-21724 in 3.2.1 / 3.0.9

Slide 75

Slide 75 text

PS: Robomongo Indiegogo and 3.0+ support

Slide 76

Slide 76 text

Announcements Datomic Feb 11 N1QL Mar 08

Slide 77

Slide 77 text

Thank you! Questions? @xeraa