Bulletproof MongoDB

Presented December 6, 2018 at SymfonyCon Lisbon.

Jeremy Mikola

December 06, 2018

  1. Some Topics to Cover

    • Deployment models
    • Driver internals
    • Driver configuration
    • Application concepts
    • Retrying operations
    • Transactions
  2. Starting with a connection string

        mongodb://user:[email protected]:27017/?replicaSet=rs0

    • Connection String spec instructs how to parse this to yield: host identifiers, authentication credentials, connection options
    • A mongodb+srv:// scheme indicates Initial DNS Seedlist Discovery, which may yield additional host identifiers
    • Atlas uses this to provide shorter, more resilient connection strings
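The three pieces that parsing yields can be illustrated with a small, simplified sketch. This is not the driver's parser and does not implement the full Connection String spec: it skips option validation and performs no DNS lookups for mongodb+srv://. The function name `parseMongoUri`, the returned array shape, and the example hosts are assumptions made for illustration.

```php
<?php
// Simplified sketch only: splits a MongoDB connection string into hosts,
// credentials, and options. The real spec covers many more edge cases.
function parseMongoUri(string $uri): array
{
    $pattern = '#^mongodb(?<srv>\+srv)?://(?:(?<userinfo>[^@/]*)@)?'
        . '(?<hosts>[^/?]+)(?:/(?<db>[^?]*))?(?:\?(?<query>.*))?$#';

    if (!preg_match($pattern, $uri, $m)) {
        throw new InvalidArgumentException('Not a MongoDB connection string');
    }

    // Credentials are optional; the password is optional within them.
    $username = null;
    $password = null;
    $userinfo = $m['userinfo'] ?? '';
    if ($userinfo !== '') {
        [$username, $password] = array_pad(explode(':', $userinfo, 2), 2, null);
        $username = rawurldecode($username);
        $password = $password === null ? null : rawurldecode($password);
    }

    // Options arrive as key=value pairs separated by "&".
    $options = [];
    foreach (array_filter(explode('&', $m['query'] ?? '')) as $pair) {
        [$key, $value] = array_pad(explode('=', $pair, 2), 2, '');
        $options[rawurldecode($key)] = rawurldecode($value);
    }

    $database = $m['db'] ?? '';

    return [
        'srv'      => ($m['srv'] ?? '') !== '',
        'hosts'    => explode(',', $m['hosts']),
        'username' => $username,
        'password' => $password,
        'database' => $database === '' ? null : $database,
        'options'  => $options,
    ];
}

// Hypothetical hosts and credentials, for illustration only.
$parsed = parseMongoUri(
    'mongodb://user:secret@db1.example.com:27017,db2.example.com:27017/?replicaSet=rs0'
);
```

With a mongodb+srv:// URI the `srv` flag would be true, and a driver would then resolve DNS records to obtain the actual seed list; that step is outside this sketch.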
  3. The first handshake

    • Drivers issue an isMaster command on all newly established connections
    • This uses OP_QUERY instead of OP_MSG for backwards compatibility
    • Drivers can also provide client metadata
    • The isMaster response reports the server’s min and max wire versions
    • Used for protocol negotiation, feature discovery, detecting imposters
    • No authentication or compression at this step
  4. Authentication and compression

    • After the handshake, drivers know what auth and compression protocols (if any) are supported by the server
    • Drivers also advertise what compression they support in client metadata
    • Auth spec defines command conversations for various auth mechanisms
    • Compression spec defines OP_COMPRESSED as an envelope for other opcodes
    • Compression is never used for certain commands (e.g. isMaster, auth)
  5. Server discovery and monitoring

    • SDAM defines structures for topology and server descriptions, a strategy for periodic monitoring, and a state machine for updating descriptions
    • Drivers can infer initial topology type and servers from the connection string
    • Unknown types address ambiguity (e.g. seed list without replicaSet option)
    • isMaster response affirms a server’s type and may also update the topology
  6. Different application deployments

    • Single-threaded applications have many app servers, each with a pool of workers, each responsible for serving one request at a time
    • Multi-threaded and async applications have a limited number of app servers responsible for serving incoming requests concurrently
    • [Diagram: App Server and Cluster nodes for each deployment model]
  7. Different approaches to monitoring

    • Multi-threaded and asynchronous drivers monitor the topology in a background “thread” and maintain a separate connection pool for application usage
    • Monitoring thread does not share sockets with the connection pool (rationale)
    • Single-threaded drivers share sockets for monitoring and application usage and perform monitoring during server selection (i.e. procuring a socket)
    • Separate sockets would be redundant and/or costly
    • Forgo connection pools for persistent sockets
  8. Robustness improvements to monitoring

    • Use connectTimeoutMS in lieu of socketTimeoutMS
    • Retry isMaster once to quickly recover dropped sockets (rationale)
    • Drivers internally invoke monitoring as needed (e.g. after “not master” error)
    • Optimizations for single-threaded drivers:
        • Ignore inaccessible servers for cooldownMS (five seconds)
        • Monitoring can be parallelized with async IO
  9. Server selection

    • Relies on SDAM for an up-to-date view of the topology and its servers
    • Server Selection uses a loop to filter the topology to a server description
    • Straightforward algorithm for multi-threaded and async drivers, but single-threaded drivers must invoke SDAM during the loop
    • Random selection within a latency window if multiple servers are eligible
    • A server description can be exchanged for a socket
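The "random selection within a latency window" step can be sketched in plain PHP. This is an illustration, not driver code: the array shape for a server description, the function names `latencyWindow` and `selectServer`, and the example hosts and round-trip times are all assumptions. The 15 ms default mirrors the localThresholdMS default mentioned later in the deck. A real driver would also re-check the topology and loop until a timeout instead of returning null.

```php
<?php
// Hypothetical server description shape (not the driver's internal structure):
// ['host' => string, 'type' => string, 'rttMs' => float]

// Keep only servers whose round-trip time is within the threshold of the fastest one.
function latencyWindow(array $eligible, float $localThresholdMs = 15.0): array
{
    if ($eligible === []) {
        return [];
    }

    $fastest = min(array_column($eligible, 'rttMs'));

    return array_values(array_filter(
        $eligible,
        function (array $server) use ($fastest, $localThresholdMs): bool {
            return $server['rttMs'] <= $fastest + $localThresholdMs;
        }
    ));
}

// Filter by eligibility (e.g. a read preference), then pick randomly inside the window.
function selectServer(array $servers, callable $isEligible, float $localThresholdMs = 15.0): ?array
{
    $eligible = array_values(array_filter($servers, $isEligible));
    $candidates = latencyWindow($eligible, $localThresholdMs);

    if ($candidates === []) {
        return null; // Simplification: a driver would keep looping until its timeout.
    }

    return $candidates[array_rand($candidates)];
}

$servers = [
    ['host' => 'a.example.com:27017', 'type' => 'RSPrimary',   'rttMs' => 5.0],
    ['host' => 'b.example.com:27017', 'type' => 'RSSecondary', 'rttMs' => 12.0],
    ['host' => 'c.example.com:27017', 'type' => 'RSSecondary', 'rttMs' => 40.0],
];

$anyMember = function (array $server): bool {
    return true;
};

// Either a or b is chosen; c falls outside the 15 ms window.
$chosen = selectServer($servers, $anyMember);
```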
  10. How this fits in with PHP

        <?php

        require_once 'vendor/autoload.php';

        $client = new MongoDB\Client;
        $collection = $client->test->foo;
        $collection->drop();
        $collection->insertOne(['hello' => 'world']);
        $cursor = $collection->find();

        foreach ($cursor as $document) {
            var_dump($document);
        }

    Output:

        object(MongoDB\Model\BSONDocument)#4 (1) {
          ["storage":"ArrayObject":private]=>
          array(2) {
            ["_id"]=>
            object(MongoDB\BSON\ObjectId)#18 (1) {
              ["oid"]=>
              string(24) "5c07e4822dbd7b79db17f192"
            }
            ["hello"]=>
            string(5) "world"
          }
        }
  11. URI is parsed during Client construction

        class Client
        {
            public function __construct($uri = 'mongodb://', array $uriOptions = [], array $driverOptions = [])
            {
                $this->manager = new Manager($uri, $uriOptions, $driverOptions);
            }
        }
  12. Server selection initializes SDAM

        class Collection
        {
            public function drop(array $options = [])
            {
                $server = $this->manager->selectServer(new ReadPreference('primary'));
                $operation = new DropCollection($this->databaseName, $this->collectionName, $options);

                return $operation->execute($server);
            }
        }
  13. Configuring socket timeouts

    • connectTimeoutMS is the timeout for initial socket connections and internal monitoring activity (defaults to 10 seconds)
    • Consider tuning closer to the expected max latency of the database servers
    • socketTimeoutMS pertains to application operations (defaults to 300 seconds)
    • Comparable to PHP’s own default_socket_timeout. Be mindful of PHP’s max_execution_time.
  14. Configuring monitoring

    • heartbeatFrequencyMS is the monitoring interval (defaults to 60 seconds for single-threaded drivers; minimum is 500ms)
    • socketCheckIntervalMS determines if a socket is considered inactive and must be re-checked before use (defaults to 5 seconds)
    • Specifically for single-threaded drivers. Like retrying isMaster, this helps insulate applications from network errors.
  15. Configuring server selection

    • localThresholdMS defines the latency window for selecting an eligible server (defaults to 15ms)
    • serverSelectionTimeoutMS is the maximum amount of time to spend in the server selection loop (defaults to 30 seconds)
    • serverSelectionTryOnce allows the application to “fail fast”
    • Specifically for single-threaded drivers, where this defaults to true
    • Disabling try-once behavior can improve resiliency at the expense of time
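The timeout and server selection options above are plain connection string options, so one way to see them together is to assemble a URI from an array. The helper `buildMongoUri` is hypothetical, and the hosts and values are illustrative assumptions, not recommendations from the deck. The same options can presumably be passed as the `$uriOptions` array shown in the Client constructor earlier, instead of being encoded in the string.

```php
<?php
// Hypothetical helper: encodes an options array into a mongodb:// connection string.
function buildMongoUri(array $hosts, array $options = []): string
{
    $pairs = [];

    foreach ($options as $name => $value) {
        // Booleans must be spelled out; casting false to string would yield "".
        if (is_bool($value)) {
            $value = $value ? 'true' : 'false';
        }

        $pairs[] = rawurlencode((string) $name) . '=' . rawurlencode((string) $value);
    }

    return 'mongodb://' . implode(',', $hosts) . '/'
        . ($pairs ? '?' . implode('&', $pairs) : '');
}

// Illustrative values only; tune them to your own latency and time budgets.
$uri = buildMongoUri(
    ['db1.example.com:27017', 'db2.example.com:27017'],
    [
        'replicaSet'               => 'rs0',
        'connectTimeoutMS'         => 2000,   // closer to expected max server latency
        'socketTimeoutMS'          => 25000,  // kept below PHP's max_execution_time
        'serverSelectionTimeoutMS' => 15000,  // upper bound on the selection loop
        'serverSelectionTryOnce'   => false,  // keep looping instead of failing fast
    ]
);
```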
  16. Write Concern

    • w controls acknowledgement behavior
    • majority scales with the size of the replica set
    • majority and journaling collectively guarantee durability and avoid data loss due to rollbacks
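To make "scales with the size of the replica set" concrete, here is a minimal arithmetic sketch. It assumes every member is a voting, data-bearing member; arbiters and non-voting members change the real calculation. The function name `majorityCount` is hypothetical.

```php
<?php
// Simplified: a majority is more than half of the voting members.
// Assumes all members vote and bear data (no arbiters).
function majorityCount(int $votingMembers): int
{
    if ($votingMembers < 1) {
        throw new InvalidArgumentException('A replica set needs at least one member');
    }

    return intdiv($votingMembers, 2) + 1;
}

// A three-member set needs two acknowledgements; a five-member set needs three.
$forThree = majorityCount(3);
$forFive = majorityCount(5);
```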
  17. Read Concern

    • local and available are most permissive
    • majority guarantees that the data has been acknowledged by a majority
    • linearizable provides additional guarantees over majority to avoid returning stale data. Introduced in MongoDB 3.4 to satisfy the Jepsen test framework. Peter Bailis provides an accessible definition of linearizability
    • snapshot may be used with majority-committed transactions to guarantee that reads within that transaction use a snapshot of majority-committed data
  18. We haven’t got all day

    • Operations can limit their execution time with maxTimeMS
    • Server will track processing time and abort at the next interrupt point
    • Socket timeouts can be expensive for both the client and server
    • Write concerns can also use wtimeout to limit waiting time for replication
    • Distinguish write concern errors from write errors
  19. Causal Consistency

    • A causal relationship exists when an operation logically depends on a preceding operation
    • Causal consistency comes with several guarantees
    • Read your own writes, monotonic reads/writes, and writes follow reads
    • Satisfied by majority read and write concerns (when durability required)
    • Applications can obtain causal consistency by using explicit sessions (examples)
  20. Logical Sessions

    • Sessions maintain cluster-wide state about the user and their operations
    • In earlier versions of MongoDB, state was tied to connection objects
    • Sessions live throughout a cluster and are not tied to connection objects
    • Sessions can be created and used as an explicit option for database operations
    • Group operations by passing the same session (e.g. causal consistency)
    • By default, drivers will use an implicit session for single operations
  21. Don’t do this

        function retry(Closure $retry, $numRetries = 1)
        {
            if ($numRetries < 1) {
                return $retry();
            }

            for ($i = 0; $i <= $numRetries; $i++) {
                try {
                    return $retry();
                } catch (MongoDB\Driver\Exception\Exception $e) {
                    // Blindly retries on any driver exception, whatever its cause
                    if ($i === $numRetries) {
                        throw $e;
                    }
                }
            }
        }
  22. What’s the problem with retrying?

    • Operations can change the state of the system
    • Reads or writes may continue to run on the server after the client moves on.
    • Write operations may not be idempotent and safe to execute multiple times.
    • At best, retrying may waste time or consume resources
    • At worst, retrying may inadvertently alter the data itself
  23. Know your errors

    • Any retry strategy should consider the kind of failure
    • Transient network error, persistent outage, command error
    • A retry attempt may be necessary to differentiate transience from persistence
    • If a command response reports failure, retrying probably isn’t going to help
  24. Retryable errors

    • Any network error (e.g. socket timeout, dropped connection)
    • Server response clearly indicating a transient error (e.g. “not master”)
    • Most commonly caused by a replica set failover or step down
  25. Retrying reads

    • Queries that return a single document are always safe to retry
    • Short-running queries that return a single batch of documents (i.e. will not leave behind a cursor) may be safe to retry
    • Drivers will aim to retry most read commands in MongoDB 4.2
    • Requires server functionality to detect dropped sockets and abort operations
    • getMore cannot be retried, since cursor iteration is forward only
  26. Retrying writes

    Given that:
    • Sessions are cluster-wide and exist beyond the scope of a connection
    • Each write can be uniquely identified by a session and statement ID
    • Drivers can rely on SDAM and server selection to re-select the primary

    Drivers can safely retry single-document writes (or bulks thereof) by resending the original command to the primary and trusting the server to Do the Right Thing™
    • If the write already executed, return the result we missed
    • If the write never executed, do it now and return its result
  27. Retrying wants a server selection loop

    • Drivers invoke server selection for each retry attempt
    • PHP’s default try-once behavior is unlikely to find a new primary after a failover, since replica set elections can take a few seconds (electionTimeoutMillis)
    • Reducing election times for planned maintenance (SERVER-35624)
    • Combining retryWrites=true with serverSelectionTryOnce=false can fully insulate an application’s writes from replica set failovers (https://git.io/fNbW0)
  28. Taking advantage of retryable writes

    • Add retryWrites=true to your connection string, disable try-once behavior (serverSelectionTryOnce=false), and tune serverSelectionTimeoutMS closer to expected election time (e.g. 15 seconds)
    • Atlas already advises this, which helps with its automated maintenance
    • Use the driver as you would normally
    • Multi-document writes (e.g. updateMany) may still fail; you’re no worse off
    • Single-document writes may still fail after one retry attempt
  29. Getting to this point

    • MongoDB 3.0 introduced the WiredTiger storage engine
    • MongoDB 3.2 made WiredTiger the default, introduced read concerns, and made significant improvements to the replication protocol
    • MongoDB 3.6 introduced logical sessions, which were the underlying framework for causal consistency and retryable writes
    • MongoDB 4.0 introduced multi-document transactions for replica sets by leveraging the logical session API and WiredTiger storage engine
    • MongoDB 4.2 will add transaction support for sharded clusters
  30. Transactions at a glance

    • All operations within a transaction must route to the same member (i.e. primary)
    • Read and write concerns are specified once, when starting a transaction
    • While many operations are supported, there are some restrictions (e.g. DDL)
    • Databases and collections must exist prior to starting the transaction
    • Cursors created outside a transaction cannot be used within, and vice versa
  31. Transactions in PHP

        <?php

        require_once 'vendor/autoload.php';

        $client = new MongoDB\Client;

        $session = $client->startSession();
        $session->startTransaction();

        $client->test->foo->insertOne(['x' => 1], ['session' => $session]);
        $client->test->bar->insertOne(['y' => 2], ['session' => $session]);

        $session->commitTransaction();
  32. Retrying transactions

    • Drivers automatically retry commit and abort commands once
    • Applications can retry commits additional times if desired
    • Other read and write operations are not retried. Transactions and retryable writes (i.e. retryWrites=true) are mutually exclusive.
    • Entire transactions may be retried if an operation fails with a transient error
    • Use a majority write concern when retrying transactions for durability
  33. Knowing when to retry transactions

    • Any RuntimeException thrown by the driver or library may be associated with one or more error labels, which can be checked using the hasErrorLabel() method
    • TransientTransactionError implies the entire transaction can be retried
    • UnknownTransactionCommitResult implies a commit can be retried
    • Applications can, and should, handle both cases (example)
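Handling both labels could look roughly like the sketch below. It is written against callables so that it runs without a database; the helper names (`hasLabel`, `commitWithRetry`, `runTransactionWithRetry`) and the attempt limit of three are assumptions, not part of the driver or library. The only driver behavior relied upon is the one stated above: exceptions expose a hasErrorLabel() method.

```php
<?php
// True when the exception exposes hasErrorLabel() and carries the given label.
function hasLabel(Throwable $e, string $label): bool
{
    return method_exists($e, 'hasErrorLabel') && $e->hasErrorLabel($label);
}

// Retries only the commit while its outcome is unknown.
function commitWithRetry(callable $commit, int $maxAttempts = 3): void
{
    for ($attempt = 1; ; $attempt++) {
        try {
            $commit();

            return;
        } catch (RuntimeException $e) {
            if ($attempt < $maxAttempts && hasLabel($e, 'UnknownTransactionCommitResult')) {
                continue;
            }

            throw $e;
        }
    }
}

// Re-runs the whole transaction callable after a transient error.
function runTransactionWithRetry(callable $transaction, int $maxAttempts = 3)
{
    for ($attempt = 1; ; $attempt++) {
        try {
            return $transaction();
        } catch (RuntimeException $e) {
            if ($attempt < $maxAttempts && hasLabel($e, 'TransientTransactionError')) {
                continue;
            }

            throw $e;
        }
    }
}

// Intended usage with the library (commented out so this file runs standalone):
//
// runTransactionWithRetry(function () use ($client) {
//     $session = $client->startSession();
//     $session->startTransaction();
//     $client->test->foo->insertOne(['x' => 1], ['session' => $session]);
//     commitWithRetry(function () use ($session) {
//         $session->commitTransaction();
//     });
// });
```

Bounding the attempts is a design choice in this sketch: an unbounded loop would hide a persistent outage behind endless retries.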
  34. Resources and Further Reading

    • MongoDB PHP driver documentation and specifications
        https://php.net/mongodb
        https://docs.mongodb.com/php-library/
        https://github.com/mongodb/specifications
    • MongoDB Manual (CRUD concepts, retryable writes, transactions)
        https://docs.mongodb.com/manual/core/crud/
        https://docs.mongodb.com/manual/core/retryable-writes/
        https://docs.mongodb.com/manual/core/transactions/
    • How to Write Resilient MongoDB Applications, by A. Jesse Jiryu Davis
        https://emptysqua.re/blog/how-to-write-resilient-mongodb-applications/
    • It’s 10pm: Do You Know Where Your Writes Are?, by Jeremy Mikola
        https://speakerdeck.com/jmikola/its-10pm-do-you-know-where-your-writes-are
  35. Image Credits

    • https://docs.mongodb.com/
    • https://twitter.com/dcousineau/status/613127314545737728
    • https://imgur.com/gallery/B58uJxA
    • http://www.kollected.com/Mars-Rover-Curiosity
    • https://scryfall.com/card/ugl/2/the-cheese-stands-alone
    • https://www.reddit.com/r/thinkpad/comments/8lzftd/exploded_x220_wallpaper_with_a_slightly_different/
    • https://skitterphoto.com/photos/232/mixer-knobs
    • https://www.gsb.stanford.edu/insights/end-traffic-jams-it-might-not-be-dream
    • https://pixabay.com/en/blueprint-ruler-architecture-964630/
    • https://obrazki.elektroda.pl/5336891800_1520705708.jpg
    • https://www.youtube.com/watch?v=IKiSPUc2Jck
    • http://markinternational.info/coding-hd-wallpaper/222275975.html