Think you know MongoDB?

Talk about MongoDB at the Munich PHP User Group.

alcaeus

June 20, 2023

Transcript

  1. MongoDB Atlas: Free clusters
    • M0 shared clusters are free
    • Allows quick testing without stability guarantees
    • Limitations apply
    • Don’t run production on it!
    • Really, don’t!
  2. MongoDB Atlas: Dedicated Clusters
    • Support production usage
    • More storage and automatic scaling
    • Automatic cluster tier scaling
    • Supports backups and point-in-time restore
    • Includes Enterprise features
  3. MongoDB Atlas: Serverless tier
    • Minimal configuration
    • Production-ready for variable workloads
    • Great for startups
  4. On-premise deployments: You’re in charge
    • Deploy MongoDB on your own hardware
    • Decide which cluster configuration to run
    • Manage upgrades and backups yourself
    • Community Edition is free
    • Enterprise Edition for advanced features
  5. The Database: Document model
    {
      "name": "alcaeus",
      "email": [
        { "type": "work", "address": "<redacted>" },
        { "type": "private", "address": "<redacted>" }
      ],
      "phone": [
        { "type": "mobile", "number": "<redacted>" },
        { "type": "home", "number": "<redacted>" }
      ]
    }
  6. Drivers: Your connection to MongoDB
    • Connect your application to a MongoDB deployment
    • Monitor deployment and route commands
    • Provide a rich database API
  7. Drivers: Putting the “No” in NoSQL
    • Driver API is common across programming languages
    • Client object to interact with a deployment
    • Database object to interact with a database
    • Collection object to interact with a collection
    • Driver specification is public
  8. PHP Driver: Collection API
    interface Collection
    {
        public function insertMany(array $documents, array $options = []);
        public function insertOne($document, array $options = []);
        public function find($filter = [], array $options = []);
        public function findOne($filter = [], array $options = []);
        public function updateMany($filter, $update, array $options = []);
        public function updateOne($filter, $update, array $options = []);
        public function deleteMany($filter, array $options = []);
        public function deleteOne($filter, array $options = []);
    }
  9. PHP Driver: Collection API (2)
    interface Collection
    {
        public function aggregate(array $pipeline, array $options = []);
        public function findOneAndUpdate($filter, $update, array $options = []);
        public function findOneAndDelete($filter, array $options = []);
        public function watch(array $pipeline = [], array $options = []);
    }
  10. Queries: Atomic array operations
    $collection->updateOne(
        ['_id' => 1],
        [
            '$currentDate' => ['updated_at' => true],
            '$pull' => ['vegetables' => 'tomato'],
            '$push' => ['fruits' => 'tomato'],
        ]
    );
  11. Queries: Atomic update errors
    $collection->updateOne(
        ['_id' => 1],
        [
            '$pull' => ['fruits' => 'strawberry'],
            '$push' => ['fruits' => 'tomato'],
        ]
    );
    // Exception: Updating the path 'fruits' would create a conflict at 'fruits'
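
The conflict on slide 11 arises because two operators in one update document target the same path. As a rough illustration, the sketch below flags such paths before an update is sent. `findConflictingPaths` is a hypothetical helper, not part of any MongoDB driver, and it only compares identical paths; it makes no attempt to reproduce the server's handling of overlapping paths such as `a` and `a.b`.

```javascript
// Hypothetical helper (not a driver API): lists field paths that are targeted
// by more than one update operator within a single update document.
function findConflictingPaths(update) {
  const firstOperatorByPath = new Map(); // path -> first operator that used it
  const conflicts = [];
  for (const [operator, fields] of Object.entries(update)) {
    for (const path of Object.keys(fields)) {
      if (firstOperatorByPath.has(path)) {
        conflicts.push({ path, operators: [firstOperatorByPath.get(path), operator] });
      } else {
        firstOperatorByPath.set(path, operator);
      }
    }
  }
  return conflicts;
}

// The update from slide 11: $pull and $push both target 'fruits'.
const conflictingUpdate = findConflictingPaths({
  $pull: { fruits: 'strawberry' },
  $push: { fruits: 'tomato' },
});

// The update from slide 10: every operator targets a different path.
const independentUpdate = findConflictingPaths({
  $currentDate: { updated_at: true },
  $pull: { vegetables: 'tomato' },
  $push: { fruits: 'tomato' },
});

console.log(conflictingUpdate, independentUpdate);
```

When a conflict is reported, the options are the ones the deck goes on to show: split the change into separate updates, optionally wrapped in a transaction.
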
  12. Queries: Separate updates for multiple fields
    $collection->updateOne(
        ['_id' => 1],
        ['$pull' => ['fruits' => 'strawberry']]
    );
    $collection->updateOne(
        ['_id' => 1],
        ['$push' => ['fruits' => 'tomato']]
    );
  13. Queries: Use transactions
    use MongoDB\Driver\Session;
    use function MongoDB\with_transaction;

    $session = $client->startSession();

    // Both updates commit together or not at all.
    with_transaction($session, function (Session $session) use ($collection) {
        $collection->updateOne(
            ['_id' => 1],
            ['$pull' => ['fruits' => 'strawberry']],
            ['session' => $session]
        );
        $collection->updateOne(
            ['_id' => 1],
            ['$push' => ['fruits' => 'tomato']],
            ['session' => $session]
        );
    });
  14. Document Model: Topic data
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
      "title": "What do you think about MongoDB?",
      "author_id": { "$oid": "648ffe160a8daeb5e50506a2" },
      "num_posts": 1,
      "created_at": { "$date": { "$numberLong": "1687158294045" } },
      "posts": [
        {
          "_id": { "$oid": "648ffe160a8daeb5e50506a4" },
          "author_id": { "$oid": "648ffe160a8daeb5e50506a2" },
          "body": "I love it, what about you?",
          "created_at": { "$date": { "$numberLong": "1687158294045" } }
        }
      ]
    }
  15. Document Model: Identifiers
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
      "title": "What do you think about MongoDB?",
      "author_id": { "$oid": "648ffe160a8daeb5e50506a2" },
      "num_posts": 1,
      "created_at": { "$date": { "$numberLong": "1687158294045" } },
      "posts": [
        {
          "_id": { "$oid": "648ffe160a8daeb5e50506a4" },
          "author_id": { "$oid": "648ffe160a8daeb5e50506a2" },
          "body": "I love it, what about you?",
          "created_at": { "$date": { "$numberLong": "1687158294045" } }
        }
      ]
    }
  16. Document Model: ObjectIds contain timestamps
    (highlighted on the slide: the leading 648ffe16 of the topic’s ObjectId)
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
      "title": "What do you think about MongoDB?",
      "author_id": { "$oid": "648ffe160a8daeb5e50506a2" },
      "num_posts": 1,
      "created_at": { "$date": { "$numberLong": "1687158294045" } },
      "posts": [
        {
          "_id": { "$oid": "648ffe160a8daeb5e50506a4" },
          "author_id": { "$oid": "648ffe160a8daeb5e50506a2" },
          "body": "I love it, what about you?",
          "created_at": { "$date": { "$numberLong": "1687158294045" } }
        }
      ]
    }
  17. Document Model: ObjectIds contain timestamps
    (highlighted on the slide: 648ffe16 in the ObjectId and 1687158294 in created_at)
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
      "title": "What do you think about MongoDB?",
      "author_id": { "$oid": "648ffe160a8daeb5e50506a2" },
      "num_posts": 1,
      "created_at": { "$date": { "$numberLong": "1687158294045" } },
      "posts": [
        {
          "_id": { "$oid": "648ffe160a8daeb5e50506a4" },
          "author_id": { "$oid": "648ffe160a8daeb5e50506a2" },
          "body": "I love it, what about you?",
          "created_at": { "$date": { "$numberLong": "1687158294045" } }
        }
      ]
    }
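
The two highlighted values on slides 16 and 17 are the same number: the first four bytes of an ObjectId hold the creation time in seconds since the Unix epoch. A minimal sketch of that decoding, using plain string handling rather than any driver class (`objectIdTimestampSeconds` is a name made up for this illustration):

```javascript
// Illustrative helper: reads the creation time (in seconds since the Unix
// epoch) from the first 8 hex characters of a 24-character ObjectId string.
function objectIdTimestampSeconds(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new TypeError('expected a 24-character hexadecimal ObjectId string');
  }
  return parseInt(hex.slice(0, 8), 16);
}

// The topic _id from the slides.
const topicSeconds = objectIdTimestampSeconds('648ffe160a8daeb5e50506a3');
const topicCreatedAt = new Date(topicSeconds * 1000);

console.log(topicSeconds);                 // 1687158294
console.log(topicCreatedAt.toISOString());
```

The embedded timestamp has second precision only, so the millisecond part of the original `created_at` value (the trailing 045) is not recoverable from the ObjectId.
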
  18. Document Model: No more created_at
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
      "title": "What do you think about MongoDB?",
      "author_id": { "$oid": "648ffe160a8daeb5e50506a2" },
      "num_posts": 1,
      "posts": [
        {
          "_id": { "$oid": "648ffe160a8daeb5e50506a4" },
          "author_id": { "$oid": "648ffe160a8daeb5e50506a2" },
          "body": "I love it, what about you?"
        }
      ]
    }
  19. Document Model: Embed data
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
      "title": "What do you think about MongoDB?",
      "author": {
        "_id": { "$oid": "648ffe160a8daeb5e50506a2" },
        "name": "alcaeus",
        "image": "..."
      },
      "num_posts": 1,
      "posts": [
        {
          "_id": { "$oid": "648ffe160a8daeb5e50506a4" },
          "author": { ... },
          "body": "I love it, what about you?"
        }
      ]
    }
  20. Document Model: Don’t be afraid to duplicate data
    $collection->updateMany(
        ['author._id' => $author['_id']],
        ['$set' => ['author.name' => $author['name']]]
    );
    $collection->updateMany(
        [],
        ['$set' => ['posts.$[post].author.name' => $author['name']]],
        ['arrayFilters' => [['post.author._id' => $author['_id']]]]
    );
  21. Document Model: Atomic updates
    $collection->updateOne(
        ['_id' => $topic['_id']],
        [
            '$push' => [
                'posts' => [
                    '_id' => new MongoDB\BSON\ObjectId(),
                    'author' => $author,
                    'body' => $body,
                ],
            ],
            // Field name matches "num_posts" in the topic documents.
            '$inc' => ['num_posts' => 1],
        ]
    );
  22. Document Model: It gets BIG
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
      "title": "What do you think about MongoDB?",
      "author": {
        "_id": { "$oid": "648ffe160a8daeb5e50506a2" },
        "name": "alcaeus",
        "image": "..."
      },
      "num_posts": 14290,
      "posts": [ ... ]
    }
  23. Document Limits: Just because you can…
    • Documents have a maximum size of 16 MB
    • Documents support 100 levels of nesting
    • Embedding same data multiple times is an anti-pattern
    • Ever-growing arrays are problematic
  24. Document Model: Don’t embed everything
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
      "title": "What do you think about MongoDB?",
      "post": {
        "author": {
          "_id": { "$oid": "648ffe160a8daeb5e50506a2" },
          "name": "alcaeus",
          "image": "..."
        },
        "body": "I love it, what about you?"
      },
      "num_replies": 14289
    }
  25. Document Model: Embed relevant data
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
      "title": "What do you think about MongoDB?",
      "post": { ... },
      "num_replies": 14289,
      "last_reply": {
        "_id": { "$oid": "648ffe160a8daeb5e50506ce" },
        "author": {
          "_id": { "$oid": "648ffe160a8daeb5e50506a5" },
          "name": "youdontknowme",
          "image": "..."
        },
        "body": "I have a lot to learn about it!"
      }
    }
  26. Document Model: The other side: posts
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506ce" },
      "author": {
        "_id": { "$oid": "648ffe160a8daeb5e50506a5" },
        "name": "youdontknowme",
        "image": "..."
      },
      "body": "I still have a lot to learn about it!",
      "topic": {
        "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
        "title": "What do you think about MongoDB?"
      }
    }
  27. Flexible Schema: Let’s add polls
    {
      "_id": { "$oid": "648ffe160a8daeb5e50506a3" },
      "title": "What do you think about MongoDB?",
      "poll": {
        "question": "Have you used MongoDB?",
        "options": [
          { "title": "Yes" },
          { "title": "No" }
        ]
      },
      "post": { ... },
      "num_replies": 14289,
      "last_reply": { ... }
    }
  28. Schema Updates: Add tables and columns
    CREATE TABLE polls (...);
    ALTER TABLE topics ADD COLUMN poll_id INTEGER DEFAULT NULL;
  29. Schema Validation: Supports JSON schema
    {
      "required": ["_id", "author", "title", "post", "num_replies"],
      "properties": {
        "_id": { "bsonType": "objectId" },
        "author": {
          "bsonType": "object",
          "required": ["_id", "name"],
          "properties": {
            "_id": { "bsonType": "objectId" },
            "name": { "bsonType": "string" },
            "image": { "bsonType": "string" }
          }
        },
        "title": { "bsonType": "string" },
        ...
      }
    }
  30. Schema Validation: Supports JSON schema
    db.createCollection(
      "topics",
      {
        validator: { $jsonSchema: mySchema },
        validationLevel: "moderate",
      }
    );
  31. Aggregation Pipeline: Raw GPS data
    {
      "document": {
        "_id": { "$oid": "6490311fd97a12c964c25030" },
        "time": { "$numberDouble": "1685866107.024" },
        "latitude": { "$numberDouble": "44.8859" },
        "longitude": { "$numberDouble": "13.873914" },
        "altitude": { "$numberDouble": "35.092" },
        "speed": { "$numberDouble": "0.663" }
      }
    }
  32. Aggregation Pipeline: Modify data
    [
      { '$addFields': {
        'time': { '$toDate': { '$multiply': [ '$time', 1000 ] } },
        'position': {
          'type': 'Point',
          'coordinates': [ '$longitude', '$latitude' ]
        }
      } }
    ]
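
To make the effect of the `$addFields` stage on slide 32 concrete, here is a rough client-side equivalent applied to one raw document. This is only an illustration of the resulting shape; `toTelemetry` is a made-up name, and the real work happens inside the server's aggregation pipeline. GeoJSON spells the type `Point` and lists longitude before latitude.

```javascript
// Illustrative only: mirrors the $addFields stage for a single plain object.
function toTelemetry(raw) {
  return {
    ...raw,
    // Seconds since the epoch (with fractions) -> Date, as $toDate does after $multiply
    time: new Date(raw.time * 1000),
    // GeoJSON point: coordinates are [longitude, latitude]
    position: { type: 'Point', coordinates: [raw.longitude, raw.latitude] },
  };
}

// Values taken from the raw GPS document on slide 31.
const telemetrySample = toTelemetry({
  time: 1685866107.024,
  latitude: 44.8859,
  longitude: 13.873914,
  altitude: 35.092,
  speed: 0.663,
});

console.log(telemetrySample);
```
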
  33. Aggregation Pipeline: Complex operations
    [
      { '$setWindowFields': {
        'sortBy': { 'time': 1 },
        'output': {
          'previousPosition': {
            '$shift': {
              'by': -1,
              'default': null,
              'output': { 'time': '$time', 'latitude': '$latitude', 'longitude': '$longitude' }
            }
          }
        }
      }},
      { '$addFields': {
        'bearing': { '$radiansToDegrees': { '$atan2': [
          { '$multiply': [
            { '$cos': '$latitude' },
            { '$sin': { '$subtract': [ '$longitude', '$previousPosition.longitude' ] } }
          ]},
          { '$subtract': [
            { '$multiply': [
              { '$cos': '$previousPosition.latitude' },
              { '$sin': '$latitude' }
            ]},
            { '$multiply': [
              { '$sin': '$previousPosition.latitude' },
              { '$cos': '$latitude' },
              { '$cos': { '$subtract': [ '$longitude', '$previousPosition.longitude' ] } }
            ]}
          ]}
        ] } }
      }}
    ]
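
The `bearing` expression on slide 33 follows the usual initial-bearing formula between the previous and the current position. The sketch below restates it in plain JavaScript so the structure is easier to follow. One assumption is labelled explicitly: the stored latitude and longitude look like degrees, while trigonometric functions (in JavaScript and in the aggregation operators `$sin` and `$cos`) work on radians, so this version converts first. The pipeline as transcribed passes the fields through unconverted. `initialBearingDegrees` is a name chosen for this illustration.

```javascript
// Illustrative sketch: initial bearing from `previous` to `current`, both given
// as { latitude, longitude } in degrees (an assumption about the stored data).
// The result is in degrees in the range (-180, 180], as returned by atan2.
function initialBearingDegrees(previous, current) {
  const toRadians = (degrees) => (degrees * Math.PI) / 180;
  const lat1 = toRadians(previous.latitude);
  const lat2 = toRadians(current.latitude);
  const deltaLon = toRadians(current.longitude - previous.longitude);

  // Same two terms as the $atan2 arguments in the pipeline
  const y = Math.cos(lat2) * Math.sin(deltaLon);
  const x =
    Math.cos(lat1) * Math.sin(lat2) -
    Math.sin(lat1) * Math.cos(lat2) * Math.cos(deltaLon);

  return (Math.atan2(y, x) * 180) / Math.PI;
}

// Moving along the equator towards the east, then along a meridian to the north.
const bearingEast = initialBearingDegrees(
  { latitude: 0, longitude: 0 },
  { latitude: 0, longitude: 1 }
);
const bearingNorth = initialBearingDegrees(
  { latitude: 0, longitude: 0 },
  { latitude: 1, longitude: 0 }
);

console.log(bearingEast, bearingNorth);
```
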
  34. Aggregation Pipeline: Use for views
    db.createView(
      "telemetry", // View name
      "gpsdata",   // Source collection
      [
        { '$addFields': {
          'time': { '$toDate': { '$multiply': [ '$time', 1000 ] } },
          'position': {
            'type': 'Point',
            'coordinates': [ '$longitude', '$latitude' ]
          }
        } },
        // ... more stages
      ]
    );
  35. Aggregation Pipeline: On-demand materialised views
    { '$merge': {
      'into': 'telemetry',
      'on': '_id',
      'whenMatched': 'replace',
      'whenNotMatched': 'insert'
    } }
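
The `$merge` options on slide 35 amount to an upsert keyed on `_id`: a pipeline result replaces the existing document with the same `_id`, and is inserted when none exists. The in-memory sketch below only illustrates those two options, with a `Map` standing in for the target collection. `mergeInto` is a hypothetical name and this is not how the server implements the stage.

```javascript
// Illustrative sketch of whenMatched: 'replace' / whenNotMatched: 'insert',
// keyed on _id. `target` is a Map standing in for the target collection.
function mergeInto(target, pipelineResults) {
  for (const doc of pipelineResults) {
    // Map#set covers both cases: it overwrites the whole existing entry
    // (replace) or adds a new one (insert).
    target.set(String(doc._id), { ...doc });
  }
  return target;
}

const materialisedView = new Map([
  ['a', { _id: 'a', speed: 0.5, stale: true }],
  ['c', { _id: 'c', speed: 2.0 }],
]);

mergeInto(materialisedView, [
  { _id: 'a', speed: 0.663 }, // matched: replaces the whole document
  { _id: 'b', speed: 1.2 },   // not matched: inserted
]);

console.log([...materialisedView.values()]);
```

Documents already in the target that the pipeline does not output, such as `c` here, are left untouched, which is why re-running the pipeline refreshes the view rather than rebuilding it.
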
  36. MongoDB Atlas: Data Federation
    • Query and move data from various sources
    • Atlas Clusters
    • Atlas Data Lake
    • AWS S3 buckets
    • HTTPS endpoints
  37. Data API: Request
    $ curl --location --request POST 'https://.../v1/action/findOne' \
        --header 'Content-Type: application/json' \
        --header 'Access-Control-Request-Headers: *' \
        --header 'api-key: 0fUMoHadYlL2QiI5vxQ7HyWIxNZ6jwHdFAyrb4zUB2ZZlGozFdXX5aiLNDYQu3K1' \
        --header 'Accept: application/ejson' \
        --data-raw '{ "collection":"gpsdata", "database":"karting", "dataSource":"Cluster95038" }'
  38. Data API: Response
    {
      "document": {
        "_id": { "$oid": "6490311fd97a12c964c25030" },
        "time": { "$numberDouble": "1685866107.024" },
        "latitude": { "$numberDouble": "44.8859" },
        "longitude": { "$numberDouble": "13.873914" },
        "altitude": { "$numberDouble": "35.092" },
        "speed": { "$numberDouble": "0.663" }
      }
    }
  39. MongoDB Atlas: Atlas Search
    • Full-text search embedded in MongoDB Atlas
    • Built on Apache Lucene
    • Search data directly without duplicating it
  40. MongoDB Atlas: Other features
    • Atlas Data Lake
    • Triggers
    • Encryption at Rest
    • Self-managed X.509 Authentication
    • App Services
    • In-Use Encryption
  41. Don’t Trust Anyone: Encrypt your data
    • MongoDB supports Client-Side Field Level Encryption
    • You have the key, we don’t
    • Data is encrypted in the driver
    • Schema to define encrypted fields and keys
    • Equality queries on deterministically encrypted fields only
  42. Don’t Trust Anyone: Queryable Encryption
    • Searchable encryption scheme
    • In public preview, stay tuned
    • Equality queries on randomised encrypted data
    • More query types coming soon
  43. What’s Next: MongoDB 7.0
    • Will be announced Thursday
    • Atlas Search Index management
    • Delete Time Series data
    • And more…
  44. What’s Next: PHP Driver
    • Aggregation Pipeline builder
    • Laravel Integration
    • Native BSON classes
    • Lazy BSON deserialisation for more performance
  45. Things I didn’t mention: There isn’t enough time for everything
    • Mirrored and hedged reads
    • Time Series collections
    • Clustered collections
    • Wildcard Indexes
    • Cloud Manager
    • Kubernetes Operator
    • Compass
  46. Join me at WSC!
    • Learn schema design patterns
    • JSONb, composite types in PostgreSQL
    • Migrate existing schema to MongoDB
    • Leverage aggregation pipeline for complex workloads
    Want to learn more?