Data Safety in MongoDB

Mathias Stearn

May 22, 2012

Transcript

  1. Data Safety in MongoDB: Only You Can Protect Your Data!

    Sections: Replication, Journaling, Write Concerns
    Mathias Stearn, @mathias_mongo, #MongoNYC
    Find this presentation at speakerdeck.com/u/mathias
  2. MongoDB gives you the tools; it is up to you to use them

    Don’t assume defaults are best
    There is no one size fits all
    What conditions do you want to protect against?
    Performance vs. Safety vs. Cost tradeoffs
  3. Outline: 1 Replication, 2 Journaling, 3 Write Concerns
  4. Outline: 1 Replication, 2 Journaling, 3 Write Concerns
  5. Demo Time

    3x mongod --replSet SomeName
    rs.initiate()
    rs.add("host:port")
    rs.add("host:port")
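As an alternative to calling rs.add() once per member, the shell's rs.initiate() also accepts a full configuration document. The sketch below only builds such a document in Python; the helper name and the host names are hypothetical and chosen for illustration, and only the `_id`/`members`/`host` fields of the replica set configuration format are assumed.

```python
def build_replset_config(set_name, hosts):
    """Build a replica set configuration document (a plain dict).

    set_name must match the value passed to mongod --replSet.
    hosts is a list of "host:port" strings (hypothetical values below).
    """
    if not hosts:
        raise ValueError("a replica set needs at least one member")
    return {
        "_id": set_name,
        # Each member gets a unique integer _id and its host:port address.
        "members": [{"_id": i, "host": h} for i, h in enumerate(hosts)],
    }


# Hypothetical hosts; substitute the addresses of the three mongod processes.
config = build_replset_config(
    "SomeName", ["host1:27017", "host2:27017", "host3:27017"]
)
print(config)
```

The resulting document has the shape expected by rs.initiate(config) in the mongo shell.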
  6. Bad Idea

    Single Server: Replica 1, Replica 2, Replica 3
  7. Still a Bad Idea

    Single Server:
      Virtual Machine 1: Replica 1
      Virtual Machine 2: Replica 2
      Virtual Machine 3: Replica 3
  8. Ok Idea

    Single Rack: Server 1, Server 2, Server 3
  9. Ideal Deployment

    Data Center 1: Server 1
    Data Center 2: Server 2
    Data Center 3: Server 3
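The progression from "Bad Idea" to "Ideal Deployment" can be read as one question: if any single failure domain (server, rack, data center) is lost, does a majority of the set remain? The sketch below checks that under simplifying assumptions: every member has one vote, there are no arbiters, and the domain names are hypothetical labels. The function name is illustrative, not part of any MongoDB tool.

```python
from collections import Counter


def survives_any_single_failure(member_domains):
    """Return True if losing any one failure domain still leaves a majority.

    member_domains holds one failure-domain label per replica set member,
    e.g. ["dc1", "dc2", "dc3"]. Assumes one vote per member, no arbiters.
    """
    total = len(member_domains)
    majority = total // 2 + 1
    per_domain = Counter(member_domains)
    # Losing a domain takes down every member placed in it.
    return all(total - lost >= majority for lost in per_domain.values())


single_server = ["server-a", "server-a", "server-a"]  # three replicas, one box
three_data_centers = ["dc1", "dc2", "dc3"]             # one replica per DC
two_data_centers = ["dc1", "dc1", "dc2"]               # two replicas share a DC

print(survives_any_single_failure(single_server))
print(survives_any_single_failure(three_data_centers))
print(survives_any_single_failure(two_data_centers))
```

The same check applies at whatever granularity matters for the deployment: passing rack names instead of data center names tests the single-rack layout.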
  10. Common EC2 Deployment

    Regions: US East, US West
    Availability Zone 1: Replica 1
    Availability Zone 2: Replica 2
    Availability Zone 3: Replica 3
  11. Sharding: Bad Idea

    Data Center 1: Shard 1, Shard 1, Shard 1
    Data Center 2: Shard 2, Shard 2, Shard 2
    Data Center 3: Shard 3, Shard 3, Shard 3
  12. Sharding: Good Idea

    Data Center 1: Shard 1, Shard 2, Shard 3
    Data Center 2: Shard 1, Shard 2, Shard 3
    Data Center 3: Shard 1, Shard 2, Shard 3
  13. Things to Know About

    SlaveDelay: allows a slave to be a rolling backup
    Rollbacks: writes not written to a majority of nodes may be rolled back
    Multi-document operations: each document is independently replicated
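The rollback rule above comes down to simple arithmetic on the number of members that hold a write. The sketch below is a deliberately simplified model: it assumes every member votes and ignores arbiters and delayed members, and both function names are hypothetical.

```python
def majority(n_members):
    """Smallest number of members that forms a majority of the set."""
    return n_members // 2 + 1


def may_be_rolled_back(acknowledged_by, n_members):
    """True if a write held by only `acknowledged_by` members is at risk.

    Follows the rule on the slide: writes not yet on a majority of
    nodes may be rolled back after a failover.
    """
    return acknowledged_by < majority(n_members)


print(majority(3))               # members needed in a 3-member set
print(may_be_rolled_back(1, 3))  # write exists on the primary only
print(may_be_rolled_back(2, 3))  # write reached a majority
```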
  14. Outline: 1 Replication, 2 Journaling, 3 Write Concerns
  15. If you use --nojournal, assume your data files are garbage after unclean shutdown
  16. If you use --nojournal, assume your data files are garbage after unclean shutdown

    But I read a blog that said I’m safe if I . . .
  17. If you use --nojournal, assume your data files are garbage after unclean shutdown

    But I read a blog that said I’m safe if I . . .
    No! See item 1!
  18. Repair

    It is a best-effort attempt (like fsck)
    Doesn’t make any guarantees
    Don’t rely on it
  19. Times It Is OK To Turn Journaling Off

    You can recreate the data
      Initial import
      Mongo as a cache
    Replicating across many data centers
      If 3 DCs in different continents go down simultaneously, you probably have bigger problems
      When a host has an unclean shutdown you must delete the data files and restore or resync
    Otherwise, just leave it on.
  20. Outline: 1 Replication, 2 Journaling, 3 Write Concerns
  21. By default, drivers don’t wait for replies to writes

    Diagram (Client to Server): Insert, Insert, Insert, Insert, Insert, Insert
  22. But you can make them wait by calling “GetLastError()”

    Diagram (Client, Server): Insert, GLE, ACK; Insert, GLE, ACK
  23. You can even wait for replication: “w=2” or “w=majority”

    Diagram (Client, Master, Slave): Insert, GLE, ACK; Insert, GLE, ACK
  24. Or wait for a journal commit: “j=true”

    Diagram (Client, Master, /dev/sda): Insert, GLE; Insert, GLE
    Still, beware of rollbacks
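The three options above (plain GetLastError, "w", and "j") are all fields of the same getLastError command document that follows a write on the same connection. The sketch below only assembles that document as a Python dict so the combinations are easy to compare; the helper name and the `wtimeout_ms` parameter name are hypothetical, while `getlasterror`, `w`, `j`, and `wtimeout` are the command's field names. It does not talk to a server, and newer servers and drivers attach the write concern to the write command itself instead.

```python
def get_last_error_command(w=None, j=False, wtimeout_ms=None):
    """Assemble a getLastError command document for the given write concern.

    w:           number of members (or "majority") that must have the write
    j:           wait for the write to reach the on-disk journal
    wtimeout_ms: give up waiting for replication after this many milliseconds
    """
    command = {"getlasterror": 1}  # the command name must be the first key
    if w is not None:
        command["w"] = w
    if j:
        command["j"] = True
    if wtimeout_ms is not None:
        command["wtimeout"] = wtimeout_ms
    return command


print(get_last_error_command())                           # plain acknowledgement
print(get_last_error_command(w=2))                        # wait for one secondary
print(get_last_error_command(w="majority", wtimeout_ms=5000))
print(get_last_error_command(j=True))                     # wait for journal commit
```

Setting a timeout alongside "w" is a common precaution: without one, a write waiting for replication can block indefinitely when too few members are reachable.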
  25. Python Examples

    import pymongo

    connection = pymongo.Connection()
    db = connection['test']

    # No GetLastError
    db.stuff.insert({'hello': 'world'})

    # Sends GetLastError (safe mode)
    db.stuff.insert({'hello': 'world'}, safe=True)

    # Waits for replication
    db.stuff.insert({'hello': 'world'}, w='majority')
  26. Python Examples

    # Can set automatic safety at many levels
    connection = pymongo.Connection(safe=True)
    db = connection['test']
    db.set_lasterror_options(w=2)
    db.cache.safe = False
    db.on_disk.set_lasterror_options(j=True)

    # No GetLastError
    db.cache.insert({'hello': 'world'})

    # Waits for replication
    db.replicated.insert({'hello': 'world'})

    # Waits for journal but not replication
    db.on_disk.insert({'hello': 'world'})
  27. Questions?

    Links:
      http://speakerdeck.com/u/mathias
      http://www.mongodb.org
      #mongodb on irc.freenode.net
      mongodb-user on google groups
    Contact: [email protected], @mathias_mongo