Zero to Sixty with MongoDB

Presented July 15, 2017 at OpenWest: https://joind.in/talk/94ea4

Presented October 8, 2016 at Bulgaria PHP: https://joind.in/talk/dde24

Presented May 9, 2015 at OpenWest: http://joind.in/talk/view/14038

Presented May 20, 2014 at php[tek]: https://joind.in/talk/view/10667

Presented November 4, 2013 at Seattle PHP Meetup: http://www.meetup.com/php-49/events/142710822/

Presented September 16, 2013 at Web & PHP Conference: https://joind.in/8871

Presented July 13, 2013 at NYC Camp: http://2013.nyccamp.org/session/getting-acquainted-mongodb

Presented March 2, 2013 at Midwest PHP: https://joind.in/8198

Presented November 3, 2012 at True North PHP: https://joind.in/talk/view/7416

Presented August 30, 2012 as a Zend.com webinar.

Reveal.js presentation published at: http://jmikola.github.com/slides/mongodb_getting_acquainted/

Jeremy Mikola

July 15, 2017

Transcript

  1. ZERO TO SIXTY WITH MONGODB AND PHP
    Jeremy Mikola
    jmikola

  2. STARTING OFF
    What sets MongoDB apart?
    What are documents?
    How do we get up and running?
    What's in a driver?
    How do we read/write documents?
    What else can we do?

  3. TERMINOLOGY
    {
      "mongodb" : "relational db",
      "database" : "database",
      "collection" : "table",
      "document" : "row",
      "index" : "index",
      "_id" : "primary key",
      "sharding" : {
        "shard" : "partition",
        "shard key" : "partition key"
      }
    }

  4. RDBMS: THE GOOD PARTS
    Tried and true
    SQL is a rich query language
    ACID compliance
    Transactions

  5. RDBMS: THE BAD PARTS
    Modeling complex or polymorphic data
    Schema migrations
    Administration
    Scalability trade-off

  6. MONGODB: THE GOOD PARTS
    Document model
    Flexible schemas
    Scalability and performance
    Features (aggregation, geo, GridFS)

  7. MONGODB: THE BAD PARTS
    Limited atomicity
    Consistency trade-off
    Query language limitations

  8. DATABASE LANDSCAPE
    [Chart: Memcached, key/value stores, and RDBMS positioned by scalability and performance versus depth of functionality]

  9. WHY MONGODB?
    "MongoDB has the best features of key/value stores, document databases and relational databases in one." (John Nunemaker)

  10. RELATIONAL MODELING
    Articles
      One author
      Many comments
      Many tags
    Authors
      Many articles
      Many comments
    Tags
      Many articles
    Comments
      One article
      One author

  11. RELATIONAL MODELING
    articles
      id | author_id | title             | body
      1  | 2         | Praesent ante dui | Lorem ipsum…
    articles_to_tags
      id | article_id | tag_id
      36 | 1          | 7
      37 | 1          | 8
    authors
      id | name | email
      2  | Bob  | [email protected]
      3  | John | [email protected]
    comments
      id | article_id | author_id | body
      4  | 1          | 3         | Morbi libero erat…
      5  | 1          | 2         | Dapibus quis…
      6  | 1          | 3         | Fusce fermentum…
    tags
      id | name
      7  | luctus
      8  | rhoncus

  12. THINGS MAY GET OUT OF HAND

  13. DOCUMENT MODELING
    Articles
      One author
      Many comments
        One author
      Many tags

  14. DOCUMENT MODELING
    {
      _id: 1,
      title: "Praesent ante dui",
      body: "Lorem ipsum…",
      author: { name: "Bob", email: "[email protected]" },
      comments: [
        {
          body: "Morbi libero erat…",
          author: { name: "John", email: "[email protected]" }
        },
        {
          body: "Dapibus quis…",
          author: { name: "Tom", email: "[email protected]" }
        },
      ],
      tags: [ "luctus", "rhoncus" ]
    }

  15. DOCUMENTS ARE BSON
    Zero or more key/value pairs
    Values are scalars, arrays, and objects
    Special types (e.g. binary strings, dates)
    Binary JSON

  16. DOCUMENTS IN PHP
    // Arrays are most common
    $a = ['hello' => 'world'];
    $b = ['things' => ['foo', 5.05, 2012]];
    // Objects work, too!
    $a = new stdClass;
    $a->hello = 'world';
    $b = new stdClass;
    $b->things = ['foo', 5.05, 2012];

  17. GETTING UP AND RUNNING WITH MONGO IN 60 SECONDS
    * EXCLUDING DOWNLOAD TIME :)

  18. MONGODB.COM/DOWNLOAD-CENTER
    Compiled binaries
      OS X, Linux, Windows
    Packages
      MacPorts, Homebrew, Debian, CentOS
    Drivers for over a dozen languages

  19. INSTALLING
    $ tar xvzf mongodb-linux-x86_64-ubuntu1604-3.4.6.tgz
    mongodb-linux-x86_64-ubuntu1604-3.4.6/README
    mongodb-linux-x86_64-ubuntu1604-3.4.6/THIRD-PARTY-NOTICES
    mongodb-linux-x86_64-ubuntu1604-3.4.6/MPL-2
    mongodb-linux-x86_64-ubuntu1604-3.4.6/GNU-AGPL-3.0
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongodump
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongorestore
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongoexport
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongoimport
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongostat
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongotop
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/bsondump
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongofiles
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongooplog
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongoreplay
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongoperf
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongod
    mongodb-linux-x86_64-ubuntu1604-3.4.6/bin/mongos

  20. STARTING
    $ mkdir -p /data/db
    $ ./mongod
    [initandlisten] MongoDB starting : pid=20357 port=27017 dbpath=/data/db 64-bit
    [initandlisten] db version v3.4.6
    [initandlisten] git version: c55eb86ef46ee7aede3b1e2a5d184a7df4bfb5b5
    [initandlisten] OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
    [initandlisten] allocator: tcmalloc
    [initandlisten] modules: none
    [initandlisten] build environment:
    [initandlisten] distmod: ubuntu1604
    [initandlisten] distarch: x86_64
    [initandlisten] target_arch: x86_64
    [initandlisten] options: { storage: { dbPath: "/data/db" } }
    [initandlisten] wiredtiger_open config: create,cache_size=7453M,session_max=20
    [initandlisten] Initializing full-time diagnostic data capture with directory
    [thread1] waiting for connections on port 27017

  21. CONNECTING
    $ ./mongo
    MongoDB shell version: 3.4.6
    connecting to: test
    > show dbs
    local (empty)
    > db.foo.insert({x: 1})
    > db.foo.find()
    { "_id" : ObjectId("57f7aefab586b9ceed2df190"), "x" : 1 }

  22. MONGODB ATLAS
    Hosted MongoDB as a Service
    Single-click deployments, upgrades, etc.
    Configurable scalability and redundancy
    Integration with MongoDB's backup service
    Secure (authentication, encryption, IP whitelists)
    Free tier available
    More info at mongodb.com/cloud

  23. PHP AND HHVM DRIVERS
    PHP extension (ext-mongodb)
      Client, server, and cursor classes
      BSON types and encode/decode functions
    PHP library (mongodb/mongodb)
      Client, database, and collection classes
      Higher-level abstractions (e.g. GridFS)
    Installing the extension and library

  24. LET'S GET COOKING

  25. CORE CLASSES
    MongoDB\Client
    MongoDB\Database
    MongoDB\Collection
    MongoDB\Driver\Cursor

  26. CONNECTING
    // 127.0.0.1:27017
    $client = new MongoDB\Client;
    // example.com:27018
    $client = new MongoDB\Client('mongodb://example.com:27018');
    // authentication
    $client = new MongoDB\Client('mongodb://user:[email protected]');
    // replica set
    $client = new MongoDB\Client(
    'mongodb://rs1.example.com,rs2.example.com',
    ['replicaSet' => 'myReplSet']
    );

  27. THE MONGODB\CLIENT CLASS
    $client = new MongoDB\Client;
    // Select the "test" database
    $db = $client->test;
    $db = $client->selectDatabase('test');
    // Select the "test.users" collection
    $collection = $client->test->users;
    $collection = $client->selectCollection('test', 'users');
    // List databases on the server
    foreach ($client->listDatabases() as $databaseInfo) {
    echo $databaseInfo->getName(), "\n";
    }

  28. THE MONGODB\DATABASE CLASS
    $db = $client->test;
    // Select the "test.users" collection
    $collection = $db->users;
    $collection = $db->selectCollection('users');
    // Execute a command on the "test" database
    $cursor = $db->command(['serverStatus' => 1]);
    $serverStatus = $cursor->toArray()[0];
    // List collections in the "test" database
    foreach ($db->listCollections() as $collectionInfo) {
    echo $collectionInfo->getName(), "\n";
    }

  29. THE MONGODB\COLLECTION CLASS
    $collection = $db->users;
    // Insert a user document for Bob
    $collection->insertOne(['username' => 'bob', 'roles' => ['admin']]);
    // Retrieve Bob's user document
    $user = $collection->findOne(['username' => 'bob']);
    // Find all admins (returns an iterable MongoDB\Driver\Cursor)
    $cursor = $collection->find(['roles' => 'admin']);
    $admins = $cursor->toArray();

  30. THE MONGODB\COLLECTION CLASS
    $collection->count(); // Count all users
    $collection->count(['roles' => 'admin']); // Count only admins
    $collection->deleteOne(['username' => 'john']); // Delete a user
    $collection->deleteMany([]); // Delete all users

  31. IDENTIFIERS AND TYPES

  32. IDENTIFIERS
    // Create a user document for Tom
    $result = $collection->insertOne(['username' => 'tom']);
    var_dump($result->getInsertedCount());
    var_dump($result->getInsertedId());
    int(1)
    object(MongoDB\BSON\ObjectID)#11 (1) {
    ["oid"]=>
    string(24) "57f7d220da14d803b94fba92"
    }

  33. THE _id FIELD
    Its value is immutable
    Unique to the collection
    ObjectID, scalar or object
    Indexed by default

  34. WHAT'S IN AN OBJECTID?
    57f7d220   da14d8    03b9  4fba92
    Timestamp  Hostname  PID   Sequence
    12-byte, binary string
    Safely generated in distributed environments
    Timestamp prefix useful for sorting
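
As a rough illustration of the layout above, the following sketch splits the slide's example ObjectID into its four fields. It is written in plain JavaScript (the language of the mongo shell) so it runs without a server. `parseObjectId` is a hypothetical helper, not part of any driver, and it assumes the 4/3/2/3-byte layout shown on the slide.

```javascript
// Hypothetical helper: split a 24-character hex ObjectID into the fields
// named on the slide. Assumed layout: 4-byte timestamp, 3-byte host
// identifier, 2-byte process ID, 3-byte sequence counter.
function parseObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error('expected 24 hex characters');
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    timestamp: new Date(seconds * 1000), // seconds since the Unix epoch
    hostname: hex.slice(8, 14),          // 3 bytes identifying the host
    pid: parseInt(hex.slice(14, 18), 16),
    sequence: parseInt(hex.slice(18, 24), 16),
  };
}

const parsed = parseObjectId('57f7d220da14d803b94fba92');
console.log(parsed.timestamp.toISOString()); // 2016-10-07T16:49:36.000Z
console.log(parsed.hostname, parsed.pid, parsed.sequence);
```

Because the timestamp occupies the leading bytes, comparing two ObjectIDs as hex strings orders them roughly by creation time, which is what makes the prefix useful for sorting.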

  35. BSON TYPE CLASSES
    MongoDB\BSON\Binary
    MongoDB\BSON\Javascript
    MongoDB\BSON\ObjectID
    MongoDB\BSON\Regex
    MongoDB\BSON\Timestamp
    MongoDB\BSON\UTCDateTime
    MongoDB\Model\BSONArray
    MongoDB\Model\BSONDocument

  36. WRITING

  37. WRITING TO MONGODB
    Inserts, updates, and upserts
    Consistency/performance options
    Avoid shooting yourself in the foot

  38. WRITE CONCERN
    {w: 0}
      Fire and forget
      Trade consistency for performance
    {w: 1}
      Ensure primary acknowledges write
    {w: #} or {w: "majority"}
      Wait for multiple acknowledgements
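
To make the three levels above concrete, here is a small plain-JavaScript sketch of how many acknowledgements each setting waits for. `requiredAcks` is a hypothetical helper, not a driver function, and it makes a simplifying assumption: every replica set member counts toward the majority.

```javascript
// Hypothetical helper: number of replica set members that must acknowledge
// a write for a given write concern `w`. Simplified: all `memberCount`
// members are assumed to be voting, data-bearing members.
function requiredAcks(w, memberCount) {
  if (w === 'majority') {
    return Math.floor(memberCount / 2) + 1;
  }
  if (Number.isInteger(w) && w >= 0 && w <= memberCount) {
    return w;
  }
  throw new Error('unsupported write concern: ' + String(w));
}

console.log(requiredAcks(0, 3));          // 0 (fire and forget)
console.log(requiredAcks(1, 3));          // 1 (primary only)
console.log(requiredAcks('majority', 3)); // 2
console.log(requiredAcks('majority', 5)); // 3
```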

  39. ACKNOWLEDGED WRITES
    $collection = (new MongoDB\Client)->test->users;
    $user = $collection->findOne(['username' => 'tom']);
    // MongoDB\Client's default behavior
    $result = $collection->insertOne(
        $user,
        ['writeConcern' => new MongoDB\Driver\WriteConcern(1)]
    );
    var_dump($result->getInsertedCount());
    Uncaught MongoDB\Driver\Exception\BulkWriteException:
      E11000 duplicate key error
      collection: test.users
      index: _id_
      dup key: { : ObjectId('57f7d220da14d803b94fba92') }

  40. ACKNOWLEDGED WRITES
    // Connect to a replica set URI
    $collection = (new MongoDB\Client('...'))->test->users;
    $result = $collection->insertOne(
    ['username' => 'tom'],
    ['writeConcern' => new MongoDB\Driver\WriteConcern(2)]
    );
    var_dump($result->getInsertedCount()); // int(1)

  41. UNACKNOWLEDGED WRITES
    $collection = (new MongoDB\Client)->test->users;
    $user = $collection->findOne(['username' => 'tom']);
    $result = $collection->insertOne(
    $user,
    ['writeConcern' => new MongoDB\Driver\WriteConcern(0)]
    );
    var_dump($result->isAcknowledged());
    bool(false)

  42. UPDATING A DOCUMENT
    // Change Tom's email address
    $result = $collection->updateOne(
        ['username' => 'tom'],
        ['$set' => ['email' => '[email protected]']]
    );
    var_dump($result->getMatchedCount()); // int(1)
    var_dump($result->getModifiedCount()); // int(1)
    Criteria
    New object
      Overwrite entire document
      Apply atomic operations

  43. UPDATING MULTIPLE DOCUMENTS
    // Make everyone an admin (probably a bad idea :)
    $result = $collection->updateMany(
    [],
    ['$addToSet' => ['roles' => 'admin']]
    );
    var_dump($result->getMatchedCount()); // int(9)
    var_dump($result->getModifiedCount()); // int(9)

  44. UPSERTING
    // If sam doesn't exist, insert: { username: "sam", role: "staff" }
    $result = $collection->updateOne(
        ['username' => 'sam'],
        ['$set' => ['role' => 'staff']],
        ['upsert' => true]
    );
    var_dump($result->getMatchedCount()); // int(0)
    var_dump($result->getModifiedCount()); // int(0)
    var_dump($result->getUpsertedCount()); // int(1)
    var_dump($result->getUpsertedId()); // object(MongoDB\BSON\ObjectID)
    No multi-document operation
    Do the criteria match a document?
      Yes, update the document
      No, apply modifiers to criteria and insert
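
The decision described above can be modelled in a few lines. This is only an in-memory sketch in plain JavaScript, not the server's implementation: `upsertOne` is a hypothetical function, it supports only exact-match criteria and `$set`, and it omits the `_id` that a real upsert would generate.

```javascript
// In-memory model of updateOne(criteria, {$set: fields}, {upsert: true}).
// Assumptions: criteria are exact field matches, the only modifier is $set,
// and no _id is generated for inserted documents.
function upsertOne(docs, criteria, fields) {
  const matches = (doc) =>
    Object.keys(criteria).every((key) => doc[key] === criteria[key]);
  const existing = docs.find(matches);
  if (existing) {
    // Yes: update the matched document in place.
    const before = JSON.stringify(existing);
    Object.assign(existing, fields);
    const modifiedCount = JSON.stringify(existing) !== before ? 1 : 0;
    return { matchedCount: 1, modifiedCount, upsertedCount: 0 };
  }
  // No: start from the criteria, apply the modifiers, and insert.
  docs.push({ ...criteria, ...fields });
  return { matchedCount: 0, modifiedCount: 0, upsertedCount: 1 };
}

const users = [];
const firstResult = upsertOne(users, { username: 'sam' }, { role: 'staff' });
const secondResult = upsertOne(users, { username: 'sam' }, { role: 'staff' });
console.log(firstResult, secondResult, users);
```

Running the same upsert twice inserts on the first call and matches without modifying on the second, so the collection still holds a single document for sam.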

  45. UPDATE OPERATORS
    Mixed: $currentDate, $rename, $set, $unset
    Arrays: $addToSet, $pop, $pull, $pullAll, $push, $pushAll
    Numbers: $bit, $inc, $mul

  46. POSITIONAL UPDATES
    $collection = (new MongoDB\Client)->test->articles;
    // An article with votable comments
    $collection->insertOne([
        '_id' => 1,
        'title' => 'Praesent ante dui',
        'comments' => [
            ['id' => 1, 'votes' => 2, 'text' => 'Dapibus quis…'],
            ['id' => 2, 'votes' => 0, 'text' => 'Fusce fermentum…'],
        ],
    ]);
    // Upvote the second comment
    $collection->updateOne(
        ['_id' => 1, 'comments.id' => 2],
        ['$inc' => ['comments.$.votes' => 1]]
    );
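
What the positional `$` does in the update above can be approximated in plain JavaScript: find the first array element that satisfied the query, then apply the modifier to that element only. `incPositional` is a hypothetical stand-in for illustration, not a driver or server API.

```javascript
// Hypothetical model of {'$inc': {'comments.$.votes': 1}}: the "$" stands for
// the index of the first array element matched by the query criteria.
function incPositional(doc, arrayField, match, field, amount) {
  const index = doc[arrayField].findIndex((element) =>
    Object.keys(match).every((key) => element[key] === match[key])
  );
  if (index === -1) {
    return false; // the query would not have matched this document
  }
  doc[arrayField][index][field] += amount;
  return true;
}

const article = {
  _id: 1,
  title: 'Praesent ante dui',
  comments: [
    { id: 1, votes: 2 },
    { id: 2, votes: 0 },
  ],
};

// Upvote the comment whose id is 2; the other comment is left untouched.
incPositional(article, 'comments', { id: 2 }, 'votes', 1);
console.log(article.comments);
```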

  47. QUERIES

  48. BASIC QUERYING
    $collection->findOne(); // Retrieve one document as an array
    $collection->find(); // Find all documents via MongoDB\Driver\Cursor
    // Query on field values
    $collection->findOne(['lastName' => 'Smith']);
    $collection->find(['roles' => 'admin']);
    Query criteria is BSON
    No grammar to parse

  49. QUERIES RETURN CURSORS
    Cursors navigate a query result
    Can be iterated forward and only once
    Convert to an array for random access
    Commands return cursors, too
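
A JavaScript generator gives a loose analogy for the forward-only, single-pass behaviour listed above. `fakeCursor` is a hypothetical stand-in, not a driver class, and the analogy is imperfect: how a real driver reacts to a second iteration attempt is driver-specific.

```javascript
// Stand-in for a cursor: a generator yields each document once, in order.
function* fakeCursor(documents) {
  for (const document of documents) {
    yield document;
  }
}

const cursor = fakeCursor([{ username: 'bob' }, { username: 'tom' }]);

const firstPass = [];
for (const document of cursor) {
  firstPass.push(document.username);
}

// The generator is exhausted, so a second pass sees nothing.
const secondPass = [...cursor];

// Copying the results into an array up front allows random access.
const all = [...fakeCursor([{ username: 'bob' }, { username: 'tom' }])];
console.log(firstPass, secondPass.length, all[1].username);
```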

  50. THE MONGODB\DRIVER\CURSOR CLASS
    $cursor = $collection->find(
        ['roles' => 'admin'], // Find all admins
        [
            'sort' => ['username' => -1], // Desc sort by username
            'limit' => 10, // Limit to 10 results
            'skip' => 5, // Skip the first 5
        ]
    );
    // Iterate through results
    foreach ($cursor as $document) { }
    // Get them all at once (equivalent to iterator_to_array())
    $results = $cursor->toArray();
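
The options above are applied as sort first, then skip, then limit, regardless of the order in which they are written. The plain-JavaScript sketch below models that order on an in-memory array; `applyOptions` is a hypothetical helper and the usernames are made-up fixtures.

```javascript
// Hypothetical model of the find() options: sort, then skip, then limit.
// A limit of 0 means "no limit" in this sketch.
function applyOptions(docs, { sort = {}, skip = 0, limit = 0 } = {}) {
  const sortKeys = Object.entries(sort); // e.g. [['username', -1]]
  const sorted = [...docs].sort((a, b) => {
    for (const [field, direction] of sortKeys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0;
  });
  const end = limit > 0 ? skip + limit : undefined;
  return sorted.slice(skip, end);
}

const admins = ['amy', 'bob', 'cat', 'dan'].map((username) => ({ username }));
const page = applyOptions(admins, { sort: { username: -1 }, skip: 1, limit: 2 });
console.log(page.map((doc) => doc.username)); // [ 'cat', 'bob' ]
```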

  51. AD-HOC QUERYING
    ARBITRARY BUSINESS REQUIREMENT #42789!
    We need the usernames for all admins that use Gmail and whose accounts were created within the last year.

  52. DATA TO QUERY
    $collection = (new MongoDB\Client)->test->users;
    $collection->insertOne([
        'username' => 'bob',
        'email' => '[email protected]',
        'profile' => [
            'bio' => 'I am a data fixture.',
            'createdAt' => new MongoDB\BSON\UTCDateTime,
        ],
        'roles' => ['moderator', 'admin'],
    ]);
    // Among others…

  53. COMPLEX CRITERIA
    Regular expressions
    Matching values in embedded objects
    Comparison operators
    Matching a value in an array

  54. COMPLEX CRITERIA
    $lastYear = new DateTime('last year');
    // Arbitrary Business Requirement #42789!
    $cursor = $collection->find(
        [
            'email' => new MongoDB\BSON\Regex('gmail\.com$', 'i'),
            'profile.createdAt' => [
                '$gt' => new MongoDB\BSON\UTCDateTime($lastYear),
            ],
            'roles' => 'admin',
        ]
    );
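
Read as a predicate, the criteria above combine three conditions with an implicit AND. The plain-JavaScript sketch below spells that out against a few in-memory documents; `matchesRequirement`, the cutoff date, and all fixture users are made up for illustration.

```javascript
// Hypothetical predicate equivalent to the three criteria in the query:
// a case-insensitive regex on email, a $gt comparison on an embedded field,
// and a match against any value in the roles array.
function matchesRequirement(user, since) {
  return (
    /gmail\.com$/i.test(user.email) &&
    user.profile.createdAt > since &&
    user.roles.includes('admin')
  );
}

const since = new Date('2016-07-15T00:00:00Z'); // fixed cutoff for the example
const fixtures = [
  { username: 'bob', email: 'bob@gmail.com',
    profile: { createdAt: new Date('2017-01-10T00:00:00Z') }, roles: ['moderator', 'admin'] },
  { username: 'amy', email: 'amy@example.com',
    profile: { createdAt: new Date('2017-02-01T00:00:00Z') }, roles: ['admin'] },
  { username: 'joe', email: 'joe@GMAIL.com',
    profile: { createdAt: new Date('2015-03-01T00:00:00Z') }, roles: ['admin'] },
  { username: 'sue', email: 'sue@gmail.com',
    profile: { createdAt: new Date('2017-03-01T00:00:00Z') }, roles: ['moderator'] },
];

const usernames = fixtures
  .filter((user) => matchesRequirement(user, since))
  .map((user) => user.username);
console.log(usernames); // [ 'bob' ]
```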

  55. FIELD PROJECTION
    // Limit returned fields to the username
    $cursor = $collection->find(
        [
            'email' => new MongoDB\BSON\Regex('gmail\.com$', 'i'),
            'profile.createdAt' => [
                '$gt' => new MongoDB\BSON\UTCDateTime($lastYear),
            ],
            'roles' => 'admin',
        ],
        [
            'projection' => ['username' => 1],
        ]
    );
    Retrieving a subset of fields
    Retrieving a slice of an array

  56. QUERY OPERATORS
    Comparison: $gt, $gte, $lt, $lte, $ne
    Logical: $and, $or, $nor, $not
    Arrays: $all, $in, $nin, $size, $elemMatch
    Misc: $exists, $mod, $text, $type, $where
    Geo: $geoWithin, $geoIntersects, $near, $nearSphere

  57. AN ATOMIC PICKLE
    We can query for things
    We can update documents atomically
    Can we atomically query and update?

  58. DATA TO PROCESS
    $collection = (new MongoDB\Client)->test->jobs;
    // Store inprogress as a boolean so a query for false can match it
    $collection->insertOne([
        'task' => '…',
        'inprogress' => false,
        'priority' => 1
    ]);
    $collection->insertOne([
        'task' => '…',
        'inprogress' => false,
        'priority' => 4
    ]);

  59. FIND AND MODIFY A DOCUMENT
    use MongoDB\Operation\FindOneAndUpdate;
    $collection->findOneAndUpdate(
        ['inprogress' => false],
        ['$set' => [
            'inprogress' => true,
            'startedAt' => new MongoDB\BSON\UTCDateTime,
        ]],
        [
            'returnDocument' => FindOneAndUpdate::RETURN_DOCUMENT_AFTER,
            'sort' => ['priority' => -1],
        ]
    );
    Atomically select and modify one document
    Update, upsert, replace, or delete the document
    Return document in pre- or post-modified state
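
The select-and-modify step above can be pictured with a small in-memory model in plain JavaScript. `claimNextJob` is a hypothetical function. It mirrors the filter, the descending priority sort, and the "return after" option, but the atomicity itself comes from the server and is not demonstrated by single-threaded code like this.

```javascript
// Hypothetical in-memory model of the findOneAndUpdate() call:
// pick the highest-priority job that is not in progress, mark it as
// started, and return the document in its post-modified state.
function claimNextJob(jobs, now) {
  const candidates = jobs
    .filter((job) => job.inprogress === false)
    .sort((a, b) => b.priority - a.priority); // sort: { priority: -1 }
  if (candidates.length === 0) {
    return null; // nothing left to claim
  }
  const job = candidates[0];
  job.inprogress = true;
  job.startedAt = now;
  return job;
}

const jobs = [
  { task: 'low', inprogress: false, priority: 1 },
  { task: 'high', inprogress: false, priority: 4 },
];
const startedAt = new Date('2017-07-15T00:00:00Z');
const firstJob = claimNextJob(jobs, startedAt);
const secondJob = claimNextJob(jobs, startedAt);
const thirdJob = claimNextJob(jobs, startedAt);
console.log(firstJob.task, secondJob.task, thirdJob); // high low null
```

Note that the filter uses a strict comparison with the boolean `false`, so a job stored with the string `'false'` would never be claimed.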

  60. INDEXING

  61. INDEXES IN MONGODB
    B-trees
    Multiple indexes per collection
    Any field(s), top-level or embedded
    Sparse, unique, geospatial, text, and TTL
    Multi-key indexes for array values

  62. MANAGING INDEXES
    $collection = (new MongoDB\Client)->test->things;
    // Ensure unique values for x (just like the _id index)
    $collection->createIndex(['x' => 1], ['unique' => true]);
    // Delegate index creation as a background task
    $collection->createIndex(['x' => 1], ['background' => true]);
    // List all indexes in the "test.things" collection
    foreach ($collection->listIndexes() as $indexInfo) {
    echo $indexInfo->getName(), "\n";
    }
    $collection->dropIndex('x_1'); // Delete an index by name
    $collection->dropIndexes(); // Delete all indexes for the collection

  63. COMPOUND INDEXES
    $collection->createIndex(['x' => 1, 'y' => 1, 'z' => -1]);
    // These queries will use the index
    $collection->find(['x' => 'foo']);
    $collection->find(['x' => 'foo', 'y' => ['$gt' => 5]]);
    $collection->find(
    ['x' => 'foo', 'y' => 4],
    ['sort' => ['z' => -1]]
    );
    $collection->find(['x' => "foo", 'y' => 8, 'z' => 'baz']);
    // This query will not by default
    $collection->find(['y' => 6]);
    Index multiple fields
    Direction per field (range queries, sorting)
    Usable for constituent field queries
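
One way to read the examples above is as a prefix rule: a query can lean on a compound index for as many leading index fields as it constrains. The sketch below encodes that rule of thumb in plain JavaScript. `usablePrefixLength` is a hypothetical helper and a deliberate simplification that ignores sort direction and the query planner's other choices.

```javascript
// Hypothetical rule-of-thumb check: count how many leading fields of a
// compound index are constrained by the query. Zero means the query
// cannot use the index prefix at all.
function usablePrefixLength(indexFields, queryFields) {
  let length = 0;
  for (const field of indexFields) {
    if (!queryFields.includes(field)) {
      break;
    }
    length += 1;
  }
  return length;
}

const indexFields = ['x', 'y', 'z'];
console.log(usablePrefixLength(indexFields, ['x']));           // 1
console.log(usablePrefixLength(indexFields, ['x', 'y']));      // 2
console.log(usablePrefixLength(indexFields, ['x', 'y', 'z'])); // 3
console.log(usablePrefixLength(indexFields, ['y']));           // 0
```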

  64. PERFORMANCE TIPS
    Avoid non-indexed queries and table scans
    Keep your indexes and working set in RAM
    Mind your read/write ratio
    Support multiple queries per index
    Create indexes that ensure selectivity
    Indexing advice and FAQ

  65. DATABASE COMMANDS

  66. DATABASE COMMANDS
    Sharding Replication Aggregation
    Collections Indexing Geospatial
    Diagnostic Administration
    Shell and driver helpers for some
    MongoDB\Database::command() for the rest
    Reference list

  67. AGGREGATION

  68. AGGREGATION
    Aggregation framework
    Single-purpose aggregation commands
    MapReduce
    Hadoop adapter

  69. REPORTING
    ARBITRARY BUSINESS REQUIREMENT #27533!
    We need a report listing all authors that have written an article for each tag category.

  70. DATA TO AGGREGATE
    $collection = (new MongoDB\Client)->test->articles;
    $collection->insertOne(['author' => 'jen', 'tags' => ['politics', 'tech']]);
    $collection->insertOne(['author' => 'sue', 'tags' => ['business']]);
    $collection->insertOne(['author' => 'tom', 'tags' => ['sports']]);
    $collection->insertOne([
    'author' => 'bob',
    'tags' => ['business', 'sports', 'tech']
    ]);

  71. MAPREDUCE (MAP)
    $map = '
    function() {
        for (var i = 0; i < this.tags.length; i++) {
            emit(this.tags[i], { authors: [this.author] });
        }
    }';

  72. MAPREDUCE (REDUCE)
    $reduce = '
    function(key, values) {
        var result = { authors: [] };
        values.forEach(function(value) {
            value.authors.forEach(function(author) {
                if (-1 == result.authors.indexOf(author)) {
                    result.authors.push(author);
                }
            });
        });
        return result;
    }';

  73. MAPREDUCE
    $db = (new MongoDB\Client)->test;
    $cursor = $db->command([
        'mapreduce' => 'articles',
        'map' => new MongoDB\BSON\Javascript($map),
        'reduce' => new MongoDB\BSON\Javascript($reduce),
        'out' => ['inline' => true],
    ]);
    foreach ($cursor->toArray()[0]['results'] as $r) {
        $authors = implode(', ', $r['value']['authors']);
        printf("%s: %s\n", $r['_id'], $authors);
    }
    business: sue, bob
    politics: jen
    sports: tom, bob
    tech: jen, bob
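
Since the map and reduce functions are ordinary JavaScript, the same logic can be exercised without a server. The harness below is a minimal sketch under stated assumptions: `emit`, `emitted`, and the driving loops are hypothetical stand-ins for what the server does, the four articles are the deck's sample data copied into an array, and reduce is called once per key.

```javascript
const articles = [
  { author: 'jen', tags: ['politics', 'tech'] },
  { author: 'sue', tags: ['business'] },
  { author: 'tom', tags: ['sports'] },
  { author: 'bob', tags: ['business', 'sports', 'tech'] },
];

// Stand-in for the server-provided emit(): group emitted values by key.
const emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) {
    emitted.set(key, []);
  }
  emitted.get(key).push(value);
}

// The map function from the deck: "this" is the current document.
function map() {
  for (var i = 0; i < this.tags.length; i++) {
    emit(this.tags[i], { authors: [this.author] });
  }
}

// The reduce function from the deck: merge author lists without duplicates.
function reduce(key, values) {
  var result = { authors: [] };
  values.forEach(function (value) {
    value.authors.forEach(function (author) {
      if (-1 == result.authors.indexOf(author)) {
        result.authors.push(author);
      }
    });
  });
  return result;
}

for (const doc of articles) {
  map.call(doc);
}
const results = {};
for (const [key, values] of emitted) {
  results[key] = reduce(key, values).authors;
}
console.log(results);
```

The reduce output has the same shape as each emitted value, which matters because a map/reduce engine may feed reduce results back into reduce.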

  74. AGGREGATION FRAMEWORK
    $collection = (new MongoDB\Client)->test->articles;
    $cursor = $collection->aggregate([
        ['$unwind' => '$tags'],
        ['$group' => [
            '_id' => '$tags',
            'authors' => ['$addToSet' => '$author'],
        ]],
    ]);
    foreach ($cursor as $r) {
        $authors = implode(', ', $r['authors']);
        printf("%s: %s\n", $r['_id'], $authors);
    }
    business: bob, sue
    tech: bob, jen
    sports: bob, tom
    politics: jen
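
The two pipeline stages above can be mimicked on an in-memory array to show what each one contributes. This plain-JavaScript sketch is an approximation, not the server's implementation: `unwound` and `groups` are illustrative names, and the articles are the deck's sample data.

```javascript
const articles = [
  { author: 'jen', tags: ['politics', 'tech'] },
  { author: 'sue', tags: ['business'] },
  { author: 'tom', tags: ['sports'] },
  { author: 'bob', tags: ['business', 'sports', 'tech'] },
];

// $unwind: one output document per array element, with "tags" replaced
// by the single element.
const unwound = articles.flatMap((doc) =>
  doc.tags.map((tag) => ({ author: doc.author, tags: tag }))
);

// $group with $addToSet: collect the distinct authors for each tag.
const groups = new Map();
for (const doc of unwound) {
  if (!groups.has(doc.tags)) {
    groups.set(doc.tags, new Set());
  }
  groups.get(doc.tags).add(doc.author);
}

for (const [tag, authors] of groups) {
  console.log(tag + ': ' + [...authors].join(', '));
}
```

As with `$addToSet` on the server, the order of authors within a group should not be relied on.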

  75. GRIDFS

  76. GRIDFS
    File storage within MongoDB
    BSON collections
    fs.files: metadata document
    fs.chunks: binary data (1+ per file)
    MongoDB\GridFS\Bucket and PHP Streams

  77. THE MONGODB\GRIDFS\BUCKET CLASS
    $db = (new MongoDB\Client)->test;
    // Default prefix (fs.files and fs.chunks)
    $bucket = $db->selectGridFSBucket();
    // Custom prefix (images.files and images.chunks)
    $bucket = $db->selectGridFSBucket(['bucketName' => 'images']);

  78. STORING FILES
    // Upload a file and return its _id
    $source = fopen('/path/to/file', 'rb');
    $id = $bucket->uploadFromStream('filename', $source);
    // Upload bytes to a file and return its _id
    $stream = $bucket->openUploadStream('filename');
    $id = $bucket->getFileIdForStream($stream);
    fwrite($stream, 'foobar');
    fclose($stream);

  79. RETRIEVING FILES
    // Download a file by its _id
    $id = new MongoDB\BSON\ObjectID('57f81b38da14d802e97b1b72');
    $destination = fopen('/path/to/file', 'w+b');
    $bucket->downloadToStream($id, $destination);
    // Open a readable stream for a file by its _id
    $stream = $bucket->openDownloadStream($id);
    $contents = stream_get_contents($stream);

  80. RETRIEVING FILES BY FILENAME
    // Download a file by its filename
    $destination = fopen('/path/to/file', 'w+b');
    $bucket->downloadToStreamByName('filename', $destination);
    // Open a readable stream for a file by its filename
    $stream = $bucket->openDownloadStreamByName('filename');
    $contents = stream_get_contents($stream);
    filename is not necessarily unique
    Same filename and different uploadDate → revision
    Filename queries select the most recent revision by default

  81. RETRIEVING FILES BY REVISION
    // Open the most recent revision (default behavior)
    $bucket->openDownloadStreamByName('filename', ['revision' => -1]);
    // Open the second most recent revision
    $bucket->openDownloadStreamByName('filename', ['revision' => -2]);
    // Open the original revision (i.e. 0th)
    $bucket->openDownloadStreamByName('filename', ['revision' => 0]);
    // Open the first revision
    $bucket->openDownloadStreamByName('filename', ['revision' => 1]);
    Same options for downloadToStreamByName()
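
The revision numbering above (0 for the original, 1 for the next, -1 for the most recent, -2 for the one before it) can be sketched as an index into the files sharing a filename, ordered by upload date. `selectRevision` and the file documents below are hypothetical and only model the numbering, not the bucket API.

```javascript
// Hypothetical model of the "revision" option: order files with the same
// filename from oldest to newest, then index from the start for
// non-negative revisions and from the end for negative ones.
function selectRevision(files, filename, revision = -1) {
  const revisions = files
    .filter((file) => file.filename === filename)
    .sort((a, b) => a.uploadDate - b.uploadDate); // oldest first
  const index = revision >= 0 ? revision : revisions.length + revision;
  return revisions[index] || null;
}

const files = [
  { _id: 'b', filename: 'filename', uploadDate: new Date('2016-10-02T00:00:00Z') },
  { _id: 'a', filename: 'filename', uploadDate: new Date('2016-10-01T00:00:00Z') },
  { _id: 'c', filename: 'filename', uploadDate: new Date('2016-10-03T00:00:00Z') },
  { _id: 'x', filename: 'other', uploadDate: new Date('2016-10-04T00:00:00Z') },
];

console.log(selectRevision(files, 'filename')._id);     // c (most recent)
console.log(selectRevision(files, 'filename', -2)._id); // b
console.log(selectRevision(files, 'filename', 0)._id);  // a (original)
console.log(selectRevision(files, 'filename', 1)._id);  // b
```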

  82. SELECTING A FILE DOCUMENT
    $id = new MongoDB\BSON\ObjectID('57f81b38da14d802e97b1b72');
    $stream = $bucket->openDownloadStream($id);
    var_dump($bucket->getFileDocumentForStream($stream));
    object(stdClass)#16 (6) {
    ["_id"]=>
    object(MongoDB\BSON\ObjectID)#18 (1) {
    ["oid"]=>
    string(24) "57f8228eda14d81806017242"
    }
    ["chunkSize"]=>
    int(261120)
    ["filename"]=>
    string(8) "filename"
    ["uploadDate"]=>
    object(MongoDB\BSON\UTCDateTime)#21 (1) {
    ["milliseconds"]=>
    string(13) "1475879566662"
    }
    ["length"]=>
    int(9)
    ["md5"]=>

  83. DELETING FILES
    // Delete a file (and related chunks) by its _id
    $id = new MongoDB\BSON\ObjectID('57f81b38da14d802e97b1b72');
    $bucket->delete($id);
    // Drop the files and chunks collections
    $bucket->drop();
    No API for modifying files in-place
    Delete and re-upload, or create a new revision

  84. THANKS!
    Server downloads and documentation
    PHP and HHVM extension documentation
    PHP library documentation
    MongoDB University (free online education)
    MongoDB Atlas (DBaaS)
    QUESTIONS?

  85. PHOTO CREDITS
    http://www.flickr.com/photos/paranoidfs/7312644738/
    http://www.flickr.com/photos/pagedooley/7330663394/
    http://www.slxdeveloper.com/page.aspx?action=viewarticle&articleid=92
    http://www.sxc.hu/photo/487816
    http://www.flickr.com/photos/sharynmorrow/2224102244/
    http://www.flickr.com/photos/guangye/3234191654/
    http://www.flickr.com/photos/spacebarpark/6712082295/
    http://www.flickr.com/photos/dierken/7998425310/
    http://www.flickr.com/photos/[email protected]/7415476606/
    http://www.sxc.hu/photo/443042
    http://www.flickr.com/photos/mikebaird/3898801499/
    http://history.nasa.gov/ap08fj/photos/a/as08-16-2593hr.jpg
