Bulletproof MongoDB

Presented December 6, 2018 at SymfonyCon Lisbon.

Jeremy Mikola

December 06, 2018

Transcript

  1. Bulletproof MongoDB
    Jeremy Mikola
    @jmikola

  2. A Little About Myself

  3. Shots Fired

  4. Some Topics to Cover
    ● Deployment models
    ● Driver internals
    ● Driver configuration
    ● Application concepts
    ● Retrying operations
    ● Transactions

  5. Deployment Models
    High availability and/or horizontal scaling

  6. Standalone
    mongod

  7. Replication

  8. Replication: Heartbeats

  9. Replication: Failover

  10. Sharding

  11. Sharding

  12. (no text)

  13. Sharding: Routing

  14. Driver Internals
    From connection strings to monitoring

  15. Starting with a connection string
    mongodb://user:[email protected]:27017/?replicaSet=rs0
    Connection String spec instructs how to parse this to yield:
    host identifiers, authentication credentials, connection options
    A mongodb+srv:// scheme indicates Initial DNS Seedlist Discovery, which may
    yield additional host identifiers
    Atlas uses this to provide shorter, more resilient connection strings
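
To make the "host identifiers, authentication credentials, connection options" split concrete, here is a minimal sketch in plain PHP. It is a simplification and not the Connection String spec's actual algorithm: the function name `parseMongoUri()`, the example hosts, and the credentials are all made up for illustration, and it ignores IPv6 literals, UNIX domain sockets, and option validation.

```php
<?php
// Simplified sketch only: NOT the Connection String spec's full algorithm.
// parseMongoUri() is a hypothetical helper, not part of any driver.
function parseMongoUri(string $uri): array
{
    $pattern = '#^(mongodb(?:\+srv)?)://(?:([^@/]*)@)?([^/?]+)(?:/([^?]*))?(?:\?(.*))?$#';
    if (!preg_match($pattern, $uri, $m)) {
        throw new InvalidArgumentException('Not a MongoDB connection string');
    }
    // preg_match() omits unmatched trailing groups, so pad them with ''.
    $m = array_pad($m, 6, '');

    $credentials = null;
    if ($m[2] !== '') {
        [$user, $password] = array_pad(explode(':', $m[2], 2), 2, '');
        $credentials = [
            'username' => rawurldecode($user),
            'password' => rawurldecode($password),
        ];
    }

    // "replicaSet=rs0" becomes ['replicaSet' => 'rs0'].
    parse_str($m[5], $options);

    return [
        'srv' => $m[1] === 'mongodb+srv', // would trigger DNS seedlist discovery
        'hosts' => explode(',', $m[3]),
        'credentials' => $credentials,
        'database' => $m[4],
        'options' => $options,
    ];
}

// Made-up hosts and credentials, for illustration only.
$parsed = parseMongoUri(
    'mongodb://app:s3cret@a.example.com:27017,b.example.com:27018/?replicaSet=rs0'
);
print_r($parsed['hosts']);
```

A real driver would go on to resolve SRV and TXT records for the mongodb+srv:// scheme; that step is left out here.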

  16. The first handshake
    Drivers issue an isMaster command on all newly established connections
    This uses OP_QUERY instead of OP_MSG for backwards compatibility
    Drivers can also provide client metadata
    The isMaster response reports the server’s min and max wire versions
    Used for protocol negotiation, feature discovery, detecting imposters
    No authentication or compression at this step
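
The protocol negotiation step can be pictured as a range-overlap check between the wire versions the driver speaks and those the server reports. The sketch below is an illustration, not driver code: the constants and the `isWireCompatible()` name are hypothetical, and treating a missing field as 0 is an assumption about how absent values are handled.

```php
<?php
// Hypothetical range of wire versions that an imaginary driver supports.
const DRIVER_MIN_WIRE_VERSION = 2;
const DRIVER_MAX_WIRE_VERSION = 7;

// $isMasterResponse stands in for a decoded isMaster reply.
function isWireCompatible(array $isMasterResponse): bool
{
    // Assumption: a missing field is treated as 0.
    $serverMin = $isMasterResponse['minWireVersion'] ?? 0;
    $serverMax = $isMasterResponse['maxWireVersion'] ?? 0;

    // Compatible when the two [min, max] ranges overlap.
    return $serverMin <= DRIVER_MAX_WIRE_VERSION
        && $serverMax >= DRIVER_MIN_WIRE_VERSION;
}

var_dump(isWireCompatible(['minWireVersion' => 0, 'maxWireVersion' => 7]));
var_dump(isWireCompatible(['minWireVersion' => 8, 'maxWireVersion' => 9]));
```

The first call reports a compatible server and the second reports a server that is too new for this imaginary driver.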

  17. Authentication and compression
    After the handshake, drivers know what auth and compression protocols (if any)
    are supported by the server
    Drivers also advertise what compression they support in client metadata
    Auth spec defines command conversations for various auth mechanisms
    Compression spec defines OP_COMPRESSED as an envelope for other opcodes
    Compression is never used for certain commands (e.g. isMaster, auth)

  18. Server discovery and monitoring
    SDAM defines structures for topology and server descriptions, a strategy for
    periodic monitoring, and a state machine for updating descriptions
    Drivers can infer initial topology type and servers from the connection string
    Unknown types address ambiguity (e.g. seed list without replicaSet option)
    isMaster response affirms a server’s type and may also update the topology

  19. Different application deployments
    Single-threaded applications have many app servers, each with a pool of
    workers, each responsible for serving one request at a time
    Multi-threaded and async applications have a limited number of app servers
    responsible for serving incoming requests concurrently
    [Diagram: app servers connected to clusters]

  20. Different approaches to monitoring
    Multi-threaded and asynchronous drivers monitor the topology in a background
    “thread” and maintain a separate connection pool for application usage
    Monitoring thread does not share sockets with the connection pool (rationale)
    Single-threaded drivers share sockets for monitoring and application usage and
    perform monitoring during server selection (i.e. procuring a socket)
    Separate sockets would be redundant and/or costly
    Forgo connection pools for persistent sockets

  21. Robustness improvements to monitoring
    Use connectTimeoutMS in lieu of socketTimeoutMS
    Retry isMaster once to quickly recover dropped sockets (rationale)
    Drivers internally invoke monitoring as needed (e.g. after “not master” error)
    Optimizations for single-threaded drivers
    Ignore inaccessible servers for cooldownMS (five seconds)
    Monitoring can be parallelized with async IO

  22. Server selection
    Relies on SDAM for an up-to-date view of the topology and its servers
    Server Selection uses a loop to filter the topology to a server description
    Straightforward algorithm for multi-threaded and async drivers, but
    single-threaded drivers must invoke SDAM during the loop
    Random selection within a latency window if multiple servers are eligible
    A server description can be exchanged for a socket
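
The "random selection within a latency window" step can be sketched in plain PHP. This covers only the final filtering step, assuming the earlier filters (topology type, read preference) have already produced a map of candidate servers to round-trip times. The function name and the sample addresses are hypothetical.

```php
<?php
// Sketch of the latency-window step only; selectWithinLatencyWindow() is a
// hypothetical helper. $rttByServer maps "host:port" to round-trip time in ms.
function selectWithinLatencyWindow(array $rttByServer, float $localThresholdMS = 15.0): string
{
    if ($rttByServer === []) {
        throw new RuntimeException('No suitable servers');
    }

    $fastest = min($rttByServer);

    // Keep servers whose RTT is within localThresholdMS of the fastest one.
    $eligible = array_keys(array_filter(
        $rttByServer,
        function ($rtt) use ($fastest, $localThresholdMS) {
            return $rtt <= $fastest + $localThresholdMS;
        }
    ));

    // Pick one of the eligible servers at random.
    return $eligible[array_rand($eligible)];
}

$rtts = ['a.example.com:27017' => 5.0, 'b.example.com:27017' => 12.0, 'c.example.com:27017' => 40.0];
echo selectWithinLatencyWindow($rtts), PHP_EOL; // either a or b, never c
```

With the default 15 ms window, the server at 40 ms is never chosen because the fastest candidate is at 5 ms.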

  23. How this fits in with PHP
    require_once 'vendor/autoload.php';
    $client = new MongoDB\Client;
    $collection = $client->test->foo;
    $collection->drop();
    $collection->insertOne(['hello' => 'world']);
    $cursor = $collection->find();
    foreach ($cursor as $document) {
        var_dump($document);
    }

    Output:
    object(MongoDB\Model\BSONDocument)#4 (1) {
      ["storage":"ArrayObject":private]=>
      array(2) {
        ["_id"]=>
        object(MongoDB\BSON\ObjectId)#18 (1) {
          ["oid"]=>
          string(24) "5c07e4822dbd7b79db17f192"
        }
        ["hello"]=>
        string(5) "world"
      }
    }

  24. URI is parsed during Client construction
    class Client
    {
        public function __construct($uri = 'mongodb://127.0.0.1/',
                                    array $uriOptions = [],
                                    array $driverOptions = [])
        {
            $this->manager = new Manager($uri, $uriOptions, $driverOptions);
        }
    }

  25. Server selection initializes SDAM
    class Collection
    {
        public function drop(array $options = [])
        {
            $server = $this->manager->selectServer(new ReadPreference('primary'));
            $operation = new DropCollection($this->databaseName, $this->collectionName, $options);
            return $operation->execute($server);
        }
    }

  26. Driver Configuration
    You’ve got a few options

  27. Configuring socket timeouts
    connectTimeoutMS is the timeout for initial socket connections and internal
    monitoring activity (defaults to 10 seconds)
    Consider tuning closer to the expected max latency of the database servers
    socketTimeoutMS pertains to application operations (defaults to 300 seconds)
    Comparable to PHP’s own default_socket_timeout.
    Be mindful of PHP’s max_execution_time.

  28. Configuring monitoring
    heartbeatFrequencyMS is the monitoring interval (defaults to 60 seconds for
    single-threaded drivers; minimum is 500ms)
    socketCheckIntervalMS determines if a socket is considered inactive and must
    be re-checked before use (defaults to 5 seconds)
    Specifically for single-threaded drivers. Like retrying isMaster, this helps
    insulate applications from network errors.

  29. Configuring server selection
    localThresholdMS defines the latency window for selecting an eligible server
    (defaults to 15ms)
    serverSelectionTimeoutMS is the maximum amount of time to spend in the server
    selection loop (defaults to 30 seconds)
    serverSelectionTryOnce allows the application to “fail fast”
    Specifically for single-threaded drivers, where this defaults to true
    Disabling try-once behavior can improve resiliency at the expense of time
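
The timeout, monitoring, and server selection options can all be set as connection string parameters. The sketch below only assembles the string, so it runs without the driver installed. The host and every value are illustrative assumptions rather than recommendations.

```php
<?php
// Illustrative values only; tune them for your own deployment.
$options = [
    'connectTimeoutMS' => 2000,          // initial connections and monitoring
    'socketTimeoutMS' => 30000,          // application operations
    'heartbeatFrequencyMS' => 10000,     // monitoring interval
    'socketCheckIntervalMS' => 5000,     // re-check idle sockets before use
    'localThresholdMS' => 15,            // latency window
    'serverSelectionTimeoutMS' => 15000, // upper bound on the selection loop
    'serverSelectionTryOnce' => 'false', // keep looping instead of failing fast
];

// 127.0.0.1:27017 is a placeholder host.
$uri = 'mongodb://127.0.0.1:27017/?' . http_build_query($options);
echo $uri, PHP_EOL;

// With the mongodb/mongodb library installed, the string would be used as:
// $client = new MongoDB\Client($uri);
```

The same keys can instead be passed in the $uriOptions array of the Client constructor, which avoids string building.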

  30. The argument for “fail fast” behavior

  31. Application Concepts

  32. Write Concern
    w controls acknowledgement behavior
    majority scales with the size of the replica set
    majority and journaling collectively guarantee
    durability and avoid data loss due to rollbacks
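
As a rough sketch of what this looks like from PHP, the helper below requests majority acknowledgement with journaling for a single insert. The `insertDurably()` and `majoritySize()` names and the 5000 ms wtimeout are assumptions for illustration. `majoritySize()` uses the usual "more than half" arithmetic, which is a simplification of how the server counts voting members.

```php
<?php
use MongoDB\Collection;
use MongoDB\Driver\WriteConcern;

// Simplified arithmetic: a majority is more than half of the voting members.
function majoritySize(int $votingMembers): int
{
    return intdiv($votingMembers, 2) + 1;
}

// Hypothetical helper: insert one document with w=majority and j=true.
function insertDurably(Collection $collection, array $document)
{
    // Arguments are w, wtimeout in milliseconds, and journal.
    $writeConcern = new WriteConcern(WriteConcern::MAJORITY, 5000, true);

    return $collection->insertOne($document, ['writeConcern' => $writeConcern]);
}

echo majoritySize(3), ' of 3, ', majoritySize(5), ' of 5', PHP_EOL; // 2 of 3, 3 of 5
```

Calling `insertDurably($client->test->foo, ['hello' => 'world'])` requires the mongodb extension, the mongodb/mongodb library, and a replica set.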

  33. Read Concern
    local and available are most permissive
    majority guarantees that the data has been acknowledged by a majority
    linearizable provides additional guarantees over majority to avoid returning
    stale data. Introduced in MongoDB 3.4 to satisfy the Jepsen test framework.
    Peter Bailis provides an accessible definition of linearizability
    snapshot may be used with majority-committed transactions to guarantee that
    reads within that transaction use a snapshot of majority-committed data

  34. We haven’t got all day
    Operations can limit their execution time with maxTimeMS
    Server will track processing time and abort at the next interrupt point
    Socket timeouts can be expensive for both the client and server
    Write concerns can also use wtimeout to limit waiting time for replication
    Distinguish write concern errors from write errors
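
A sketch of both ideas using the PHP library: a query bounded by maxTimeMS, and a helper that tells a write error apart from a write concern error. The function names and the 500 ms limit are made up for illustration. The functions are only defined here, since calling them needs a running server.

```php
<?php
use MongoDB\Collection;
use MongoDB\Driver\Exception\BulkWriteException;
use MongoDB\Driver\Exception\ExecutionTimeoutException;

// Hypothetical helper: returns null if the server aborted the query
// because it exceeded the (illustrative) 500 ms limit.
function findWithDeadline(Collection $collection, array $filter): ?array
{
    try {
        return $collection->find($filter, ['maxTimeMS' => 500])->toArray();
    } catch (ExecutionTimeoutException $e) {
        return null;
    }
}

// Hypothetical helper: a write error means the write itself was rejected
// (e.g. duplicate key); a write concern error means the write was applied
// but could not be acknowledged as requested (e.g. wtimeout elapsed).
function describeWriteFailure(BulkWriteException $e): string
{
    $result = $e->getWriteResult();

    if ($result->getWriteErrors() !== []) {
        return 'write error: ' . $result->getWriteErrors()[0]->getMessage();
    }

    if ($result->getWriteConcernError() !== null) {
        return 'write concern error: ' . $result->getWriteConcernError()->getMessage();
    }

    return 'unknown failure: ' . $e->getMessage();
}
```

The distinction matters when deciding what to do next: after a write concern error the data may already be on the primary, so blindly re-running the write is not necessarily safe.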

  35. Causal Consistency
    Causal relationship when an operation logically depends on a preceding operation
    Causal consistency comes with several guarantees
    Read your own writes, monotonic reads/writes, and writes follow reads
    Satisfied by majority read and write concerns (when durability required)
    Applications can obtain causal consistency by using explicit sessions (examples)
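
A sketch of a causally consistent "read your own write" with an explicit session. The `readYourOwnWrite()` name, the test.foo namespace, and the secondaryPreferred read preference are assumptions chosen to show that the read may go to a different member than the write. The function is only defined here, since it needs a replica set to run.

```php
<?php
use MongoDB\Client;
use MongoDB\Driver\ReadConcern;
use MongoDB\Driver\ReadPreference;
use MongoDB\Driver\WriteConcern;

// Hypothetical helper: write a document, then read it back within the same
// causally consistent session, possibly from a secondary.
function readYourOwnWrite(Client $client)
{
    $collection = $client->selectCollection('test', 'foo', [
        'readConcern' => new ReadConcern(ReadConcern::MAJORITY),
        'writeConcern' => new WriteConcern(WriteConcern::MAJORITY),
        'readPreference' => new ReadPreference('secondaryPreferred'),
    ]);

    $session = $client->startSession(['causalConsistency' => true]);

    // Passing the same session to both operations is what links them.
    $result = $collection->insertOne(['status' => 'new'], ['session' => $session]);

    return $collection->findOne(
        ['_id' => $result->getInsertedId()],
        ['session' => $session]
    );
}
```

Without the shared session, a read routed to a lagging secondary could legitimately miss the document that was just inserted.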

  36. Logical Sessions
    Sessions maintain cluster-wide state about the user and their operations
    In earlier versions of MongoDB, state was tied to connection objects
    Sessions live throughout a cluster and are not tied to connection objects
    Sessions can be created and used as an explicit option for database operations
    Group operations by passing the same session (e.g. causal consistency)
    By default, drivers will use an implicit session for single operations

  37. Abort, Retry, Fail?

  38. Don’t do this
    function retry(Closure $retry, $numRetries = 1)
    {
        if ($numRetries < 1) {
            return $retry();
        }
        for ($i = 0; $i <= $numRetries; $i++) {
            try {
                return $retry();
            } catch (MongoDB\Driver\Exception\Exception $e) {
                // Blindly retries on every driver exception, whatever the cause
                if ($i === $numRetries) {
                    throw $e;
                }
            }
        }
    }

  39. What’s the problem with retrying?
    Operations can change the state of the system
    Reads or writes may continue to run on the server after the client moves on.
    Write operations may not be idempotent and safe to execute multiple times.
    At best, retrying may waste time or consume resources
    At worst, retrying may inadvertently alter the data itself

  40. Know your errors
    Any retry strategy should consider the kind of failure
    Transient network error, persistent outage, command error
    A retry attempt may be necessary to differentiate transience from persistence
    If a command response reports failure, retrying probably isn’t going to help

  41. Retryable errors
    Any network error (e.g. socket timeout, dropped connection)
    Server response clearly indicating a transient error (e.g. “not master”)
    Most commonly caused by a replica set failover or step down
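
One way to encode this classification in application code is sketched below. The helper names are hypothetical, and the list of server error codes is an assumption based on the codes commonly associated with failovers and shutdowns. It is not exhaustive, so check it against the retryable writes specification before relying on it.

```php
<?php
use MongoDB\Driver\Exception\ConnectionException;
use MongoDB\Driver\Exception\ServerException;

// Assumed, non-exhaustive list of "the primary changed or is going away" codes.
const TRANSIENT_SERVER_ERROR_CODES = [
    10107, // NotMaster
    13435, // NotMasterNoSlaveOk
    13436, // NotMasterOrSecondary
    11600, // InterruptedAtShutdown
    11602, // InterruptedDueToReplStateChange
    189,   // PrimarySteppedDown
    91,    // ShutdownInProgress
];

function isTransientServerErrorCode(int $code): bool
{
    return in_array($code, TRANSIENT_SERVER_ERROR_CODES, true);
}

// Hypothetical helper: true for network errors and for server responses
// that clearly indicate a transient condition.
function isRetryableError(Throwable $e): bool
{
    // Dropped connections and socket timeouts surface as connection exceptions.
    if ($e instanceof ConnectionException) {
        return true;
    }

    return $e instanceof ServerException
        && isTransientServerErrorCode((int) $e->getCode());
}

var_dump(isTransientServerErrorCode(10107)); // bool(true)
var_dump(isTransientServerErrorCode(11000)); // bool(false), duplicate key
```

A duplicate key error (11000) is a command failure, which is the kind of error where retrying is unlikely to help.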

  42. Retrying reads
    Queries that return a single document are always safe to retry
    Short-running queries that return a single batch of documents (i.e. will not leave
    behind a cursor) may be safe to retry
    Drivers will aim to retry most read commands in MongoDB 4.2
    Requires server functionality to detect dropped sockets and abort operations
    getMore cannot be retried, since cursor iteration is forward only

  43. Retrying writes
    Given that:
    ● Sessions are cluster-wide and exist beyond the scope of a connection
    ● Each write can be uniquely identified by a session and statement ID
    ● Drivers can rely on SDAM and server selection to re-select the primary
    Drivers can safely retry single-document writes (or bulks thereof) by resending the
    original command to the primary and trusting the server to Do the Right Thing™
    If the write already executed, return the result we missed
    If the write never executed, do it now and return its result
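
The "Do the Right Thing" behavior can be pictured with a toy in-memory model. This is not how mongod is implemented. The `ToyPrimary` class and the way it keys results by session and statement ID are a deliberate simplification of the bullets above.

```php
<?php
// Toy in-memory model of a primary; not how mongod is implemented.
final class ToyPrimary
{
    private $documents = [];
    private $completed = []; // results keyed by "sessionId:statementId"

    public function insert(string $sessionId, int $statementId, array $document): array
    {
        $key = $sessionId . ':' . $statementId;

        // The write already executed: return the result the client missed.
        if (isset($this->completed[$key])) {
            return $this->completed[$key];
        }

        // The write never executed: do it now and remember its result.
        $this->documents[] = $document;

        return $this->completed[$key] = ['n' => 1];
    }

    public function count(): int
    {
        return count($this->documents);
    }
}

$primary = new ToyPrimary();
$first = $primary->insert('session-A', 1, ['x' => 1]);
// Pretend the response was lost, so the driver resends the same command.
$retry = $primary->insert('session-A', 1, ['x' => 1]);

var_dump($first === $retry);  // bool(true)
var_dump($primary->count());  // int(1)
```

The resent command returns the same result and the document is stored only once, which is why the retry is safe.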

  44. Retrying wants a server selection loop
    Drivers invoke server selection for each retry attempt
    PHP’s default try-once behavior is unlikely to find a new primary after a failover,
    since replica set elections can take a few seconds (electionTimeoutMillis)
    Reducing election times for planned maintenance (SERVER-35624)
    Combining retryWrites=true with serverSelectionTryOnce=false can fully
    insulate an application’s writes from replica set failovers (https://git.io/fNbW0)

  45. Taking advantage of retryable writes
    Add retryWrites=true to your connection string, disable try-once behavior
    (serverSelectionTryOnce=false), and tune serverSelectionTimeoutMS
    closer to expected election time (e.g. 15 seconds)
    Atlas already advises this, which helps with its automated maintenance
    Use the driver as you would normally
    Multi-document writes (e.g. updateMany) may still fail; you’re no worse off
    Single-document writes may still fail after one retry attempt
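
A sketch of that configuration using the library's $uriOptions argument instead of the connection string. The `createResilientClient()` name and the example URI are made up, and 15000 ms simply mirrors the "e.g. 15 seconds" above.

```php
<?php
use MongoDB\Client;

// Hypothetical factory: a client configured to ride out replica set failovers.
function createResilientClient(string $uri): Client
{
    return new Client($uri, [
        'retryWrites' => true,               // retry single-document writes once
        'serverSelectionTryOnce' => false,   // keep looping until a primary appears
        'serverSelectionTimeoutMS' => 15000, // give up after 15 seconds
    ]);
}

// Usage, assuming a replica set named rs0 on localhost:
// $client = createResilientClient('mongodb://127.0.0.1:27017/?replicaSet=rs0');
// $client->test->foo->insertOne(['hello' => 'world']);
```

Application code stays the same. The retry happens inside the driver, and an exception still surfaces if the single retry attempt also fails.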

  46. 60% of the time…
    It works every time.

  47. Transactions
    ACID compliance only took ten years…

  48. Getting to this point
    MongoDB 3.0 introduced the WiredTiger storage engine
    MongoDB 3.2 made WiredTiger the default, introduced read concerns, and made
    significant improvements to the replication protocol
    MongoDB 3.6 introduced logical sessions, the underlying framework for
    causal consistency and retryable writes
    MongoDB 4.0 introduced multi-document transactions for replica sets by
    leveraging the logical session API and WiredTiger storage engine
    MongoDB 4.2 will add transaction support for sharded clusters

  49. Transactions at a glance
    All operations within a transaction must route to the same member (i.e. primary)
    Read and write concerns are specified once, when starting a transaction
    While many operations are supported, there are some restrictions (e.g. DDL)
    Databases and collections must exist prior to starting the transaction
    Cursors created outside a transaction cannot be used within, and vice versa

  50. Transactions in PHP
    require_once 'vendor/autoload.php';
    $client = new MongoDB\Client;
    $session = $client->startSession();
    $session->startTransaction();
    $client->test->foo->insertOne(['x' => 1], ['session' => $session]);
    $client->test->bar->insertOne(['y' => 2], ['session' => $session]);
    $session->commitTransaction();

  51. Retrying transactions
    Drivers automatically retry commit and abort commands once
    Applications can retry commits additional times if desired
    Other read and write operations are not retried. Transactions and retryable
    writes (i.e. retryWrites=true) are mutually exclusive.
    Entire transactions may be retried if an operation fails with a transient error
    Use a majority write concern when retrying transactions for durability

  52. Knowing when to retry transactions
    Any RuntimeException thrown by the driver or library may be associated with one
    or more error labels, which can be checked using the hasErrorLabel() method
    TransientTransactionError implies the entire transaction can be retried
    UnknownTransactionCommitResult implies a commit can be retried
    Applications can, and should, handle both cases (example)
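
Handling both labels could look roughly like the sketch below, which follows the general shape of the retry pattern in the MongoDB Manual. The function names, the test.foo and test.bar namespaces, and the cap of three attempts are assumptions. The functions are only defined here, since running them needs a MongoDB 4.0+ replica set.

```php
<?php
use MongoDB\Client;
use MongoDB\Driver\Exception\RuntimeException as DriverRuntimeException;
use MongoDB\Driver\Session;
use MongoDB\Driver\WriteConcern;

const MAX_ATTEMPTS = 3; // arbitrary cap so the loops cannot spin forever

// Retry only the commit while its outcome is unknown.
function commitWithRetry(Session $session): void
{
    for ($attempt = 1; ; $attempt++) {
        try {
            $session->commitTransaction();
            return;
        } catch (DriverRuntimeException $e) {
            if ($attempt < MAX_ATTEMPTS && $e->hasErrorLabel('UnknownTransactionCommitResult')) {
                continue;
            }
            throw $e;
        }
    }
}

// Retry the entire transaction while the failure is labeled transient.
function runTransactionWithRetry(callable $transaction, Client $client, Session $session): void
{
    for ($attempt = 1; ; $attempt++) {
        try {
            $transaction($client, $session);
            return;
        } catch (DriverRuntimeException $e) {
            if ($attempt < MAX_ATTEMPTS && $e->hasErrorLabel('TransientTransactionError')) {
                continue;
            }
            throw $e;
        }
    }
}

// Example transaction body: two inserts committed with a majority write concern.
function insertTwoDocuments(Client $client, Session $session): void
{
    $session->startTransaction([
        'writeConcern' => new WriteConcern(WriteConcern::MAJORITY),
    ]);

    try {
        $client->test->foo->insertOne(['x' => 1], ['session' => $session]);
        $client->test->bar->insertOne(['y' => 2], ['session' => $session]);
    } catch (DriverRuntimeException $e) {
        $session->abortTransaction();
        throw $e;
    }

    commitWithRetry($session);
}

// Usage:
// $client = new MongoDB\Client('mongodb://127.0.0.1:27017/?replicaSet=rs0');
// runTransactionWithRetry('insertTwoDocuments', $client, $client->startSession());
```

Because the whole callable may run more than once, anything it does outside the transaction (sending email, writing files) should be safe to repeat.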

  53. (take a breath)

  54. Resources and Further Reading
    MongoDB PHP driver documentation and specifications
    https://php.net/mongodb
    https://docs.mongodb.com/php-library/
    https://github.com/mongodb/specifications
    MongoDB Manual (CRUD concepts, retryable writes, transactions)
    https://docs.mongodb.com/manual/core/crud/
    https://docs.mongodb.com/manual/core/retryable-writes/
    https://docs.mongodb.com/manual/core/transactions/
    How to Write Resilient MongoDB Applications — A. Jesse Jiryu Davis
    https://emptysqua.re/blog/how-to-write-resilient-mongodb-applications/
    It’s 10pm: Do You Know Where Your Writes Are? — Jeremy Mikola
    https://speakerdeck.com/jmikola/its-10pm-do-you-know-where-your-writes-are

  55. Thanks!
    Jeremy Mikola
    @jmikola

  56. Image Credits
    ● https://docs.mongodb.com/
    ● https://twitter.com/dcousineau/status/613127314545737728
    ● https://imgur.com/gallery/B58uJxA
    ● http://www.kollected.com/Mars-Rover-Curiosity
    ● https://scryfall.com/card/ugl/2/the-cheese-stands-alone
    ● https://www.reddit.com/r/thinkpad/comments/8lzftd/exploded_x220_wallpaper_with_a_slightly_different/
    ● https://skitterphoto.com/photos/232/mixer-knobs
    ● https://www.gsb.stanford.edu/insights/end-traffic-jams-it-might-not-be-dream
    ● https://pixabay.com/en/blueprint-ruler-architecture-964630/
    ● https://obrazki.elektroda.pl/5336891800_1520705708.jpg
    ● https://www.youtube.com/watch?v=IKiSPUc2Jck
    ● http://markinternational.info/coding-hd-wallpaper/222275975.html
