Automatic Failover in RethinkDB

Jorge Silva

July 27, 2015

Transcript

  1. Automatic Failover
    Using Raspberry Pis to
    understand automatic
    failover
    RethinkDB Meetup
    San Francisco, California
    July 27, 2015

  2. Jorge Silva
    @thejsj
    Developer Evangelist @ RethinkDB

  3. Distributed Systems
    What makes RethinkDB
    distributed?

  4. What is RethinkDB?
    • Open-source database for building realtime web applications
    • NoSQL database that stores schemaless JSON documents
    • Distributed database that is easy to scale

  5. What makes it distributed?
    • Allows simple sharding and replication of tables
    • Allows you to easily connect nodes to a cluster using `--join`
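For example, joining a second node to a running cluster might look like this (a minimal command sketch; the hostname is a placeholder, and 29015 is RethinkDB's default intracluster port):

```shell
# Start the first node, listening on all interfaces
rethinkdb --bind all

# On a second machine, join the cluster through any existing node
rethinkdb --bind all --join first-node.example.com:29015
```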

  6. The problem
    • When one of your nodes goes down, you need to manually decide what to do

  7. Automatic Failover
    RethinkDB 2.1

  8. What's new in 2.1
    • RethinkDB 2.1 introduces automatic failover
    • It uses Raft as the consensus algorithm

  9. Replicas
    • Primary replicas serve as the authoritative copy of the data
    • Secondary replicas serve as mirrors of the primary replica

  10. Automatic Failover
    • In RethinkDB, automatic failover promotes a secondary replica to primary when the primary replica becomes unavailable
    • The cluster elects new primaries by voting; a new primary needs a majority of votes
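The majority rule can be sketched with plain arithmetic (illustrative JavaScript, not RethinkDB's actual implementation):

```javascript
// Smallest number of voting replicas that forms a majority
function majority(votingReplicas) {
  return Math.floor(votingReplicas / 2) + 1;
}

// How many replicas can fail while a new primary can still be elected
function faultTolerance(votingReplicas) {
  return votingReplicas - majority(votingReplicas);
}

console.log(majority(3));       // 2
console.log(faultTolerance(3)); // 1: three replicas survive one failure
console.log(faultTolerance(1)); // 0: a single replica has no failover
```

This is why odd replica counts are preferred: going from 3 to 4 replicas raises the quorum from 2 to 3 but still tolerates only 1 failure.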

  11. Automatic Failover
    Cluster with Raspberry Pis

  12. Step #1: Start RethinkDB
    // Check that RethinkDB is running
    $ ssh [email protected]

  13. Step #2: SSH into the Raspberry Pis
    // Check for devices on the network
    $ nmap -sn *.*.*.0/24
    // SSH into the Raspberry Pis
    $ ssh pi@redisgeek
    $ ssh pi@pishark

  14. Step #3: Start RethinkDB
    // Start RethinkDB on both Raspberry Pis
    pi@mrpi1 ~ $ rethinkdb \
      -n redisgeek \
      -t pi -t redisgeek \
      --bind all \
      --join 104.236.171.225

  15. Step #4: Check servers
    r.db('rethinkdb').table('server_config')

  16. Step #5: Insert test data
    // Insert data from Reddit into the table
    r.table('data')
      .insert(
        r.http('reddit.com/r/rethinkdb.json')
          ('data')('children').map(r.row('data'))
      )
    // Query the data
    r.table('data')

  17. Step #6: Check replica
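One way to check replica state is RethinkDB's `table_status` system table (a ReQL fragment for the data explorer; it needs a live cluster to run):

```javascript
// Show shard and replica state for every table
r.db('rethinkdb').table('table_status')
```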

  18. Automatic Failover
    Demo #1

  19. Step #1: Move data
    // Move all data to `redisgeek`
    r.table('data')
      .reconfigure({
        shards: 1,
        replicas: { 'redisgeek': 1 },
        primaryReplicaTag: 'redisgeek'
      })

  20. Step #2: Disconnect primary

  21. Step #3: Query data
    // Query table
    r.table('data') // Returns Error

  22. What happened?
    • We moved all our data to 'redisgeek'
    • We disconnected 'redisgeek' from the network
    • Because we can't communicate with 'redisgeek' (the primary replica), our data is inaccessible

  23. Step #4: Replicate data
    // Configure 3 replicas
    r.table('data')
      .reconfigure({
        shards: 1,
        replicas: 3
      })

  24. Step #5: Check replicas

  25. Step #6: Disconnect primary

  26. Step #7: Query data
    // Query table
    r.table('data') // We have data!

  27. Step #8: Insert data
    // Insert data
    r.table('data').insert({ hello: 'world' })

  28. What happened?
    • A secondary replica gets promoted to primary replica

  29. Step #9: Reconnect primary
    • 'redisgeek' comes back as primary

  30. Automatic Failover
    Replication and failover #2

  31. Step #1: Make main primary
    // Make `main` the primary replica
    r.table('data')
      .reconfigure({
        shards: 1,
        replicas: {
          'main': 1,
          'redisgeek': 1,
          'pishark': 1
        },
        primaryReplicaTag: 'main'
      })

  32. Step #2: Disconnect secondaries

  33. Step #3: Query data
    // Query table
    r.table('data').count() // 25

  34. Step #4: Attempt Insert
    r.table('data')
    .insert({ hello: 'world' }) // Error

  35. What happened?
    • Because the primary replica is still connected, data can be read
    • Because a majority of replicas are disconnected, data can't be written
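The read/write split in this demo can be modeled in a few lines (an illustrative sketch, not RethinkDB's API; all names here are made up):

```javascript
// Illustrative model: reads only need a reachable primary,
// writes additionally need a majority of voting replicas.
function availability(totalReplicas, reachableReplicas, primaryReachable) {
  const quorum = Math.floor(totalReplicas / 2) + 1;
  return {
    canRead: primaryReachable,
    canWrite: primaryReachable && reachableReplicas >= quorum
  };
}

// Demo #2: 3 replicas, both secondaries disconnected, primary still up
console.log(availability(3, 1, true)); // { canRead: true, canWrite: false }
```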

  36. Questions
    • RethinkDB website: http://rethinkdb.com
    • New failover documentation: http://docs.rethinkdb.com/2.1/docs/failover/
    • Email me: [email protected]
    • Tweet: @thejsj, @rethinkdb
