Replicating MongoDB
Philipp Krenn
November 15, 2014
...what could go wrong?
Presentation for the Javantura conference in Zagreb.
Transcript
MongoDB Replication
  Philipp Krenn, @xeraa
  ecosio & ViennaDB

Motivation
  Availability and data safety
  Read scalability
  Helping backups
  Data migration
  Delayed members
  Oplog tailing (Meteor.js): https://meteorhacks.com/mongodb-oplog-and-meteor.html
Basics
Terminology
  Primary + Secondaries
  Master + Slaves: problematic, renamed
Arbiter
  http://docs.mongodb.org
  > rs.addArb("arbiter.example.com:3000")

Limits
  50 replica set members (12 before 2.7.8)
  7 voting members
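
The arbiter and the cap on voting members both come down to majorities. The following is a minimal sketch, assuming the simplified rule that a primary needs a strict majority of the voting members; the function names are made up for illustration and are not part of any MongoDB API.

```javascript
// Simplified model: a primary can only be elected while a strict
// majority of the voting members is reachable.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of voting members that may fail while a majority remains.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [1, 2, 3, 4, 5, 7]) {
  console.log(
    n + " voting members: majority " + majority(n) +
    ", tolerates " + faultTolerance(n) + " failure(s)"
  );
}
```

In this model an even member count tolerates no more failures than the next lower odd count, which is the usual reason for adding an arbiter to a two-node setup.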
Example

Single instance
  $ mkdir 1
  $ mongod --dbpath 1 --port 27001 --logpath log1
  $ mongo --port 27001
  > db.test.insert({ name: "Philipp", city: "Wien" })
  > db.test.find()
  Stop instance

Add replication
  $ mkdir 2
  $ mkdir 3
  $ mongod --replSet javantura --dbpath 1 --port 27001 --logpath log1 --oplogSize 20
  $ mongod --replSet javantura --dbpath 2 --port 27002 --logpath log2 --oplogSize 20
  $ mongod --replSet javantura --dbpath 3 --port 27003 --logpath log3 --oplogSize 20

Connect
  $ hostname
  $ mongo --port 27001
  > db.test.find()

Configure replication
  Start on the old instance; otherwise its data is lost
  > rs.initiate()
  > rs.status()
  > rs.add("PK-MBP:27002")
  > rs.add("PK-MBP:27003")
  > rs.status()
  > db.isMaster()
  > db.test.find()
  > db.test.insert({ name: "Peter", city: "Steyr" })
  > db.test.find()

Read from secondaries
  $ mongo --port 27002
  > db.test.find()
  > rs.slaveOk()
  > db.test.find()
  > db.test.insert({ name: "Dieter", city: "Graz" })
  The first find() and the insert() fail: reads need slaveOk, and a secondary does not accept writes
  slaveOk is only valid for the current connection

Failover
  Kill primary with [Ctrl]+[C]
  Write to new primary
  > rs.status()
  > db.test.insert({ name: "Dieter", city: "Graz" })
  > db.test.find()

Restart old primary
  $ mongod --replSet javantura --dbpath 1 --port 27001 --logpath log1 --oplogSize 20
  $ mongo --port 27001
  > rs.status()
  > rs.slaveOk()
  > db.test.find()
Election

Heartbeat
  2s interval
  10s until election

Election rules
  1. Priority
  2. Optime
  3. Connections

Priority
  > cfg = rs.conf()
  > cfg.members[0].priority = 0
  > cfg.members[1].priority = 1
  > cfg.members[2].priority = 2
  > rs.reconfig(cfg)

Optime

Connections

Election
  Candidate node asks for a vote
  Others can veto

Election
  Each node gives only one yes vote within 30s
  A majority of yes votes elects a new primary
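
The three election rules can be sketched as a small selection function. This is a deliberately simplified model that follows the rule order on the slide; it is not the server's actual algorithm (no vetoes, no vote bookkeeping), and the member shape with `host`, `priority`, `optime` and `reachable` is an assumption made for illustration. All members are assumed to be voting members.

```javascript
// Simplified election model; NOT the real server algorithm.
// Hypothetical member shape: { host, priority, optime, reachable }.
function electPrimary(members) {
  const needed = Math.floor(members.length / 2) + 1;
  const reachable = members.filter((m) => m.reachable);

  // Connections: without a reachable majority nobody can be elected.
  if (reachable.length < needed) return null;

  // Priority first, then the most recent optime; priority 0 never wins.
  const candidates = reachable
    .filter((m) => m.priority > 0)
    .sort((a, b) => b.priority - a.priority || b.optime - a.optime);

  return candidates.length > 0 ? candidates[0].host : null;
}

// Priorities as configured in the example above; optimes are made up.
const members = [
  { host: "PK-MBP:27001", priority: 0, optime: 105, reachable: true },
  { host: "PK-MBP:27002", priority: 1, optime: 105, reachable: true },
  { host: "PK-MBP:27003", priority: 2, optime: 105, reachable: true },
];

// Same set with the highest-priority member unreachable.
const without27003 = members.map((m) =>
  m.host === "PK-MBP:27003" ? { ...m, reachable: false } : m
);

console.log(electPrimary(members));
console.log(electPrimary(without27003));
```

With all three members up, the member with priority 2 wins; once it is unreachable, the member with priority 1 takes over, and with only one member left no primary is elected.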
Issues

CAP
  Select Availability or Consistency
  Partition tolerance is a prerequisite for distributed systems
  "The network is reliable": http://aphyr.com/posts/288-the-network-is-reliable

Rollback
  Old primary rolls back unreplicated changes once it rejoins the replica set

Rollback file
  rollback/ in data folder
  File name: <database>.<collection>.<timestamp>.bson
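
A rollback can be pictured as comparing two oplogs. The sketch below assumes each entry is identified by its `ts` and `h` fields (both appear in real oplog entries) and uses plain numbers and strings for them; the function name and the sample data are hypothetical, and the real server logic is more involved.

```javascript
// Identify an entry by timestamp plus hash (simplified representation).
function entryKey(entry) {
  return entry.ts + ":" + entry.h;
}

// Find the last entry both oplogs share; everything the old primary
// wrote after that point was never replicated and gets rolled back.
function findRollback(oldPrimaryOplog, newPrimaryOplog) {
  const known = new Set(newPrimaryOplog.map(entryKey));
  let i = oldPrimaryOplog.length - 1;
  while (i >= 0 && !known.has(entryKey(oldPrimaryOplog[i]))) {
    i--;
  }
  return {
    commonPoint: i >= 0 ? oldPrimaryOplog[i].ts : null,
    rolledBack: oldPrimaryOplog.slice(i + 1),
  };
}

// Made-up histories: "Dieter" reached the old primary only.
const oldPrimary = [
  { ts: 1, h: "a", o: { name: "Philipp" } },
  { ts: 2, h: "b", o: { name: "Peter" } },
  { ts: 3, h: "c", o: { name: "Dieter" } },
];
const newPrimary = [
  { ts: 1, h: "a", o: { name: "Philipp" } },
  { ts: 2, h: "b", o: { name: "Peter" } },
  { ts: 4, h: "d", o: { name: "Hans" } },
];

const result = findRollback(oldPrimary, newPrimary);
console.log("common point:", result.commonPoint);
console.log("rolled back:", result.rolledBack.map((e) => e.o.name));
```

In this example the common point is entry 2, and the single unreplicated insert is what would end up in the rollback file.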
Election time
  At times 5 to 7 minutes
  http://www.tokutek.com/2014/07/explaining-ark-part-2-how-elections-and-failover-currently-work/

Missing synchronization during election
  Old primary sends last changes to a single node
  If that node does not become the new primary: rollback

Remember
  Replication is asynchronous

Multiple primaries
  Unlikely but possible
  Bugs: https://jira.mongodb.org/browse/SERVER-9765
  Test script with no replies: https://groups.google.com/forum/#!topic/mongodb-dev/-mH6BOYyzeI

Kyle Kingsbury @aphyr: Call Me Maybe
  http://aphyr.com/tags/jepsen
  PostgreSQL, Redis, MongoDB, Riak, Zookeeper, RabbitMQ, etcd + Consul, ElasticSearch

http://aphyr.com/posts/284-call-me-maybe-mongodb
  05/2013, version 2.4
  Up to 42% data lost

Data written to old primary: rollback
WriteConcern
  Configure durability vs. performance
  https://github.com/mongodb/mongo-java-driver/blob/master/src/main/com/mongodb/WriteConcern.java

WriteConcern.UNACKNOWLEDGED
  w=0, j=0
  Fire and forget
  Default until 11/2012

WriteConcern.ACKNOWLEDGED
  w=1, j=0
  Current default
  Operation successful in memory

WriteConcern.JOURNALED
  w=1, j=1
  Operation written to the journal file
  Since 1.8, single server durability

WriteConcern.FSYNCED
  w=1, fsync=true
  Operation written to disk

WriteConcern.REPLICA_ACKNOWLEDGED
  w=2, j=0
  Acknowledged by primary and at least one secondary
  w is the number of servers

WriteConcern.MAJORITY
  w=majority, j=0
  Acknowledgement by the majority of nodes
  wtimeout recommended
  Nearly no data lost, but high overhead
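
The write concern levels above can be summarized as plain data. This is a sketch, not the driver's implementation: the constant names and the w/j/fsync values are taken from the slides, while `requiredAcks` is a hypothetical helper and "majority" is modeled as a strict majority of the voting members.

```javascript
// Values as listed on the slides (Java driver constants at the time).
const WRITE_CONCERNS = {
  UNACKNOWLEDGED: { w: 0, j: false },
  ACKNOWLEDGED: { w: 1, j: false },
  JOURNALED: { w: 1, j: true },
  FSYNCED: { w: 1, fsync: true },
  REPLICA_ACKNOWLEDGED: { w: 2, j: false },
  MAJORITY: { w: "majority", j: false },
};

// How many members must confirm a write before the client gets a reply.
function requiredAcks(concern, votingMembers) {
  if (concern.w === "majority") {
    return Math.floor(votingMembers / 2) + 1;
  }
  return concern.w;
}

for (const name of Object.keys(WRITE_CONCERNS)) {
  console.log(name + ": " + requiredAcks(WRITE_CONCERNS[name], 5) + " of 5 members");
}
```

In the mongo shell of that era (2.6 and later), the same setting can be passed per operation, for example `db.test.insert(doc, { writeConcern: { w: "majority", wtimeout: 5000 } })`; the 5000 ms timeout is only an illustrative value.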
Write concern performance
  https://blog.serverdensity.com/mongodb-on-google-compute-engine-tips-and-benchmarks/
  3 x 1,000 inserts on GCE
  Local 10GB system disk
  Dedicated 200GB disk
  Dedicated 200GB for data and journal

n1-standard-2

n1-highmem-8

Thanks!
  Questions? Now, later today, or @xeraa
Backup Slides
Oplog
Replication via logs
  MongoDB: Operations log (Oplog)
  MySQL: Binary log (Binlog)

Naive approach: transmit the original query
  Statement-Based Replication (SBR)
  DELETE FROM test.table WHERE quantity > 20 LIMIT 1
  db.collection.remove({ quantity: { $gt: 20 }}, true) // justOne: true

Unambiguous representation
  Row-Based Replication (RBR): Oplog
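
Why replaying the original statement is ambiguous can be shown with two copies of the same data. The sketch below uses plain arrays as stand-ins for collections and assumes, for illustration, that the two nodes return documents in a different order; the function names are hypothetical.

```javascript
// Statement-style replay: "remove one document with quantity > 20".
// Which document goes depends on the order the node finds them in.
function removeOneMatching(docs) {
  const index = docs.findIndex((d) => d.quantity > 20);
  return index < 0 ? docs.slice() : docs.filter((_, i) => i !== index);
}

// Row-style replay: "remove the document with this _id".
function removeById(docs, id) {
  return docs.filter((d) => d._id !== id);
}

// Same documents, different storage order on the two nodes.
const primaryDocs = [{ _id: 1, quantity: 30 }, { _id: 2, quantity: 40 }];
const secondaryDocs = [{ _id: 2, quantity: 40 }, { _id: 1, quantity: 30 }];

// The primary executes the statement and notes which _id it removed.
const removedId = primaryDocs.find((d) => d.quantity > 20)._id;
const primaryAfter = removeOneMatching(primaryDocs);

const statementReplay = removeOneMatching(secondaryDocs); // diverges
const rowReplay = removeById(secondaryDocs, removedId); // matches primary

console.log("primary keeps:", primaryAfter.map((d) => d._id));
console.log("statement replay keeps:", statementReplay.map((d) => d._id));
console.log("row replay keeps:", rowReplay.map((d) => d._id));
```

Replaying the statement leaves the secondary with a different document than the primary, while replaying the removal by `_id` keeps both nodes identical.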
MongoDB
  Asynchronous replication
  Secondaries can get the Oplog from:
    their primary
    a secondary with more recent data

Oplog size
  32bit: 48MB
  64bit OS X: 183MB
  64bit *nix, Windows: 1GB to 50GB (5% free disk)
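
The default sizes can be written down as a small function. This is a sketch of the rule as stated on the slide; the platform labels `"32bit"`, `"osx64"` and `"other64"` are made up for this example, and an explicit `--oplogSize` (20 MB in the walkthrough) overrides the default.

```javascript
const MB = 1024 * 1024;
const GB = 1024 * MB;

// Default oplog size per the slide: fixed on 32bit and OS X,
// otherwise 5% of free disk space clamped to the range 1GB..50GB.
function defaultOplogSizeBytes(platform, freeDiskBytes) {
  if (platform === "32bit") return 48 * MB;
  if (platform === "osx64") return 183 * MB;
  const fivePercent = Math.floor(freeDiskBytes / 20);
  return Math.min(50 * GB, Math.max(1 * GB, fivePercent));
}

for (const freeGb of [10, 100, 2000]) {
  const size = defaultOplogSizeBytes("other64", freeGb * GB);
  console.log(freeGb + "GB free: " + size / GB + "GB oplog");
}
```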
Inner details
Capped collection oplog.rs in the local database
  > use local
  > show collections
  me                0.000MB / 0.008MB
  oplog.rs          0.000MB / 20.000MB
  replset.minvalid  0.000MB / 0.008MB
  slaves            0.000MB / 0.008MB
  startup_log       0.003MB / 10.000MB
  system.indexes    0.001MB / 0.008MB
  system.replset    0.000MB / 0.008MB

  > db.oplog.rs.find()
  {
    "h": NumberLong("-265486071808715859"),
    "ns": "test.test",
    "o": {
      "_id": ObjectId("541a8ed285ea5f8ae059d530"),
      "name": "Dieter",
      "city": "Graz"
    },
    "op": "i",
    "ts": Timestamp(1411026642, 1),
    "v": 2
  }
  ...
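
The fields of such an entry can be read with a few lines of code. The sketch below represents the entry as a plain object, with `ts` as `{ t, i }` and `_id` as a hex string (both are simplifications of the shell's Timestamp and ObjectId types); the helper names are hypothetical. It relies on the fact that the first four bytes of an ObjectId hold its creation time in seconds.

```javascript
// Meaning of the single-letter "op" codes in the oplog.
const OP_NAMES = { i: "insert", u: "update", d: "delete", c: "command", n: "no-op" };

// The first 8 hex characters of an ObjectId are seconds since the epoch.
function objectIdSeconds(hex) {
  return parseInt(hex.slice(0, 8), 16);
}

function describe(entry) {
  const when = new Date(entry.ts.t * 1000).toISOString();
  return OP_NAMES[entry.op] + " on " + entry.ns + " at " + when;
}

// The entry shown above, as a plain object.
const entry = {
  ns: "test.test",
  op: "i",
  ts: { t: 1411026642, i: 1 },
  o: { _id: "541a8ed285ea5f8ae059d530", name: "Dieter", city: "Graz" },
};

console.log(describe(entry));
console.log("_id created in the same second:", objectIdSeconds(entry.o._id) === entry.ts.t);
```

For this entry the timestamp embedded in the `_id` and the seconds part of `ts` are identical, since the document was created and logged within the same second.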