Automatic Failover in RethinkDB

Jorge Silva

July 27, 2015

  1. Automatic Failover
    Using Raspberry Pis to
    understand automatic
    failover
    RethinkDB Meetup
    San Francisco, California
    July 27, 2015

  2. Jorge Silva
    @thejsj
    Developer Evangelist @ RethinkDB

  3. Distributed Systems
    What makes RethinkDB
    distributed?

  4. What is RethinkDB?
    • Open source database for building
    realtime web applications
    • NoSQL database that
    stores schemaless JSON documents
    • Distributed database that is easy to
    scale

  5. What makes it distributed?
    • Allows simple sharding and
    replication of tables
    • Allows you to easily connect
    nodes to a cluster using `--join`

  6. The problem
    • Before RethinkDB 2.1, when one of
    your nodes went down, you had to
    decide manually how to respond

  7. Automatic Failover
    RethinkDB 2.1

  8. What's new in 2.1
    • RethinkDB 2.1 introduces
    automatic failover
    • It uses Raft as the consensus
    algorithm

  9. Replicas
    • Primary replicas serve as the
    authoritative copy of the data
    • Secondary replicas serve as a
    mirror of the primary replica

  10. Automatic Failover
    • In RethinkDB, automatic failover
    promotes a secondary replica to
    primary when the primary replica
    becomes unavailable
    • The cluster picks new primaries by
    voting: a new primary needs a
    majority vote
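    The majority rule behind this vote can be sketched with a toy function
    (illustrative only; the names and logic below are an assumption for
    teaching purposes, not RethinkDB's actual implementation):

    ```javascript
    // Toy sketch of the strict-majority ("quorum") rule used in
    // Raft-style elections. Not RethinkDB source code.
    function hasMajority(votesReceived, totalServers) {
      // A strict majority: more than half of ALL servers,
      // not just the ones currently reachable.
      return votesReceived > Math.floor(totalServers / 2);
    }

    console.log(hasMajority(2, 3)); // true  — 2 of 3 servers agree
    console.log(hasMajority(1, 3)); // false — 1 of 3 is not a majority
    console.log(hasMajority(2, 4)); // false — a tie is not a strict majority
    ```

    Note that the denominator is always the full server count: this is why
    a partition containing only a minority of servers can never elect a
    primary, no matter how many votes it gathers internally.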

  11. Automatic Failover
    Cluster with Raspberry Pis

  12. Step #1: Check the main server
    // Check that RethinkDB is running
    $ ssh [email protected]

  13. Step #2: SSH into the Raspberry Pis
    // Find devices on the network
    $ nmap -sn *.*.*.0/24
    // SSH into the Raspberry Pis
    $ ssh [email protected]
    $ ssh [email protected]

  14. Step #3: Start RethinkDB
    // Start RethinkDB on both Raspberry Pis
    [email protected] ~ $ rethinkdb \
    -n redisgeek \
    -t pi -t redisgeek \
    --bind all \
    --join 104.236.171.225

  15. Step #4: Check servers
    r.db('rethinkdb').table('server_config')

  16. Step #5: Insert test data
    // Insert data into the table
    r.table('data')
      .insert(
        // Insert data from Reddit
        r.http('reddit.com/r/rethinkdb.json')
          ('data')('children').map(r.row('data'))
      )
    // Query the data
    r.table('data')

  17. Step #6: Check replica

  18. Automatic Failover
    Demo #1

  19. Step #1: Move data
    // Move all data to `redisgeek`
    r.table('data')
      .reconfigure({
        shards: 1,
        replicas: { 'redisgeek': 1 },
        primaryReplicaTag: 'redisgeek'
      })

  20. Step #2: Disconnect primary

  21. Step #3: Query data
    // Query table
    r.table('data') // Returns Error

  22. What happened?
    • We moved all our data to 'redisgeek'
    • We disconnected 'redisgeek' from
    the network
    • Because we can't communicate
    with 'redisgeek' (the primary replica),
    our data is inaccessible

  23. Step #4: Replicate data
    // Configure 3 replicas
    r.table('data')
      .reconfigure({
        shards: 1,
        replicas: 3
      })

  24. Step #5: Check replicas

  25. Step #6: Disconnect primary

  26. Step #7: Query data
    // Query table
    r.table('data') // We have data!

  27. Step #8: Insert data
    // Insert data
    r.table('data').insert({ hello: 'world' })

  28. What happened?
    • A secondary replica gets promoted
    to primary, because a majority of
    replicas can still vote

  29. Step #9: Reconnect primary
    • 'redisgeek' comes back as primary

  30. Automatic Failover
    Replication and failover #2

  31. Step #1: Make `main` the primary
    // Make `main` the primary replica
    r.table('data')
      .reconfigure({
        shards: 1,
        replicas: {
          'main': 1,
          'redisgeek': 1,
          'pishark': 1
        },
        primaryReplicaTag: 'main'
      })

  32. Step #2: Disconnect secondaries

  33. Step #3: Query data
    // Query table
    r.table('data').count() // 25

  34. Step #4: Attempt Insert
    r.table('data')
    .insert({ hello: 'world' }) // Error

  35. What happened?
    • Because the primary replica is still
    connected, data can be read
    • Because a majority of replicas are
    disconnected, data can't be
    written
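    This split behavior can be sketched with a toy model (illustrative
    only; these functions are an assumption for teaching, not RethinkDB's
    code): reads only need a reachable primary, while writes additionally
    need a majority of all replicas.

    ```javascript
    // Toy model of availability during a network partition.
    // Not RethinkDB source code.
    function canRead(primaryReachable) {
      return primaryReachable; // the primary alone can serve reads
    }

    function canWrite(primaryReachable, reachableReplicas, totalReplicas) {
      // Writes must be acknowledged by a strict majority of all replicas.
      return primaryReachable && reachableReplicas > totalReplicas / 2;
    }

    // 3 replicas, primary up, both secondaries disconnected:
    console.log(canRead(true));        // true  — reads still work
    console.log(canWrite(true, 1, 3)); // false — 1 of 3 is not a majority
    ```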

  36. Questions
    • RethinkDB website:
    http://rethinkdb.com
    • New failover documentation:
    http://docs.rethinkdb.com/2.1/docs/failover/
    • Email me: [email protected]
    • Tweet: @thejsj, @rethinkdb
