Riak Enterprise Revisited (RICON East 2013)

Presented by Chris Tilt at RICON East 2013.

Riak Enterprise has undergone an overhaul since its 1.2 days, mostly around Multi Data Center Replication. We'll talk about the "Brave New World" of replication in depth: how it manages concurrent TCP/IP connections, Realtime Sync, and the technology preview of Active Anti-Entropy Fullsync. Finally, we'll peek over the horizon at new features such as chaining of Realtime Sync messages across multiple clusters.

About Chris

Chris has 25 years in the high technology industry as a software developer, CTO, designer, and startup co-founder. He discovered Erlang indirectly through development of telecommunications test equipment at Tektronix, which launched a new passion in functional programming. During the Dot Com days, he co-founded a startup using OCaml as the core language for graph-theoretic analysis of web sites. A fear of compilers led to an intense study and an eventual job as the lead on a Java-to-native assembly language compiler for a Massively Parallel Processor Array at Ambric, also written in OCaml. Thinking about how to scale software concurrently led right back to Erlang and Basho, where he works on the Enterprise project team. Chris develops iPhone and Android applications as a hobby. He and his son, Geordie, co-designed a concurrent programming language called 'G' which compiles to C, and a LEGO-sized underwater ROV - both targeted for Arduino. When not programming, he enjoys rocket stoves, cob structures, remodeling, Minecraft, and Kendo.

Basho Technologies

May 13, 2013

Transcript

  1. Enterprise Reloaded: Replication in Record Time. Chris Tilt (ctilt@basho.com), Basho Technologies, RICON East 2013

  2. Talk • Riak Enterprise Overview • Focus: the “Brave New World” of replication • New Features in 1.3 (Released) • What’s coming in 1.4 • Futures

  3. Riak Enterprise • Built upon open-source Riak • 24 x 7 Legendary Basho Technical Support • Closed-source • Extended Monitoring • Multi Data Center Replication*

  4. Multi Data Center Replication Cluster A Cluster B

  5. Multi Data Center Replication Cluster A Cluster B source sink

  6. Multi Data Center Replication Cluster A Cluster B riak objects source sink

  7. Use Cases • Primary cluster with hot failover • Availability Zones: active-active clusters create data locality and reduce latency • Reporting/Analytics

  8. Primary with failover Cluster A Cluster B Uni-directional sync DNS Director

  9. Primary with failover Cluster A Cluster B Uni-directional sync DNS Director

  10. Primary with failover Cluster A Cluster B DNS Director

  11. Availability Zones DNS Director Cluster Cluster Cluster Cluster Cluster Cluster

  12. Availability Zones DNS Director Cluster Cluster Cluster Cluster Cluster Cluster

  13. Reporting/Analytics Cluster A Cluster B Uni-directional sync Report Generator

  14. Replication Protocols Cluster A Cluster B riak objects source sink Realtime

  15. Replication Protocols Cluster A Cluster B riak objects source sink Realtime Fullsync

  16. Replication Protocols Cluster A Cluster B riak objects source sink Realtime Fullsync Proxy GET (for Riak CS)

  17. The Old Way Cluster A Cluster B Realtime Fullsync Proxy GET ... ... single, multiplexed connection

  18. The Old Way Cluster A Cluster B Realtime Fullsync Proxy GET ... ... “listeners”

  19. The Old Way Cluster A Cluster B Realtime Fullsync Proxy GET ... ... “sites”

  20. Lessons Learned • A single shared TCP/IP connection, for all replication, doesn’t scale well.

  21. Lessons Learned • A single shared TCP/IP connection, for all replication, doesn’t scale well. • Make it easy to configure!

  22. Lessons Learned • A single shared TCP/IP connection, for all replication, doesn’t scale well. • Make it easy to configure! • Dropping realtime objects during intermittent connectivity is not OK.

  23. Lessons Learned • A single shared TCP/IP connection, for all replication, doesn’t scale well. • Make it easy to configure! • Dropping realtime objects during intermittent connectivity is not OK. • Networks are unreliable, even in high-end data centers. Connectivity is hard.

  24. Lessons Learned • A single shared TCP/IP connection, for all replication, doesn’t scale well. • Make it easy to configure! • Dropping realtime objects during intermittent connectivity is not OK. • Networks are unreliable, even in high-end data centers. Connectivity is hard. • Load balancing is critical for fullsync.

  25. Big Pain Motivates Big Ideas.

  26. The Brave New World: Riak 1.3 • Ground-up re-write of replication • Node to Node connections • All connections start at a single IP:Port • Each protocol has its own channel

  27. The Brave New World: Riak 1.3 • Realtime queues separate from connections • Fullsync coordinator controls workload • Connection Manager (now moved to Core) handles backoff and retry • Much simpler command and configuration

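For concreteness, the knobs behind this live in the riak_repl section of app.config. A minimal sketch, with key names as documented for Riak EE replication and purely illustrative values:

    %% app.config (illustrative values)
    {riak_repl, [
        {data_root, "/var/lib/riak/riak_repl"},
        %% bound on the realtime queue; oldest objects are dropped when full
        {rtq_max_bytes, 104857600},    %% 100 MB
        {fullsync_on_connect, true}    %% start a fullsync when a sink connects
    ]}
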
  28. Praise • Andy Gross (the epoch) • Andrew Thomson • Chris Tilt • Dave Parfitt • Jon Meredith • Micah Warren

  29. Node to Node Connections Cluster A Cluster B source sink Realtime 1:1

  30. Realtime Write Cluster A Cluster B source sink Client PUT

  31. Realtime Write Cluster A Cluster B source sink Client PUT

  32. Realtime Write Cluster A Cluster B source sink Client PUT

  33. Realtime Write Cluster A Cluster B source sink Client PUT

  34. Realtime Queues • Hook on post-commit • Push objects to ETS queue • RT connection pulls from queue • Bounded, drop objects when full • On shutdown, proxy objects to peer node

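A toy sketch of that bounded queue (hypothetical code, not riak_repl's actual implementation), assuming the drop-when-full policy discards the oldest entry:

    %% rt_queue_sketch.erl: bounded ETS-backed FIFO (illustrative only)
    -module(rt_queue_sketch).
    -export([new/1, push/2, pull/1]).

    new(Max) ->
        Tab = ets:new(rtq, [ordered_set, private]),
        {Tab, Max, 0}.                                  %% {table, bound, next seq}

    push({Tab, Max, Seq}, Obj) ->
        ets:insert(Tab, {Seq, Obj}),
        case ets:info(Tab, size) > Max of
            true  -> ets:delete(Tab, ets:first(Tab));   %% full: drop oldest
            false -> true
        end,
        {Tab, Max, Seq + 1}.

    pull({Tab, _Max, _Seq}) ->
        case ets:first(Tab) of
            '$end_of_table' -> empty;
            Key ->
                [{_, Obj}] = ets:lookup(Tab, Key),      %% dequeue oldest
                ets:delete(Tab, Key),
                Obj
        end.
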
  35. Fullsync to the Max = N:M Cluster A Cluster B source sink e.g. Fullsync 2:2

  36. Fullsync Coordinator workload balancing • Schedules each partition on its vnode • Reservation system on each sink node • Respects max_fssource_node and a “busy” response from a sink node • Caps connections at max_fssource_cluster

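These caps are app.config settings too; a hedged example (names per the Riak EE docs; max_fssink_node is the sink-side counterpart, not named on the slide; values illustrative):

    {riak_repl, [
        {max_fssource_cluster, 5},   %% total source-side fullsync workers per cluster
        {max_fssource_node, 1},      %% concurrent fullsync workers per source node
        {max_fssink_node, 1}         %% concurrent fullsync workers accepted per sink node
    ]}
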
  37. Fullsync to the Max = N:M Cluster A Cluster B source sink Fullsync 2:2, max_fssource_cluster = 10

  38. Fullsync Coordinator Cluster A Cluster B source sink Fullsync 2:2, max_fssource_cluster = 5

  39. Fullsync Write Cluster A P Cluster B P source sink

  40. Riak CS Replication (blocks not replicated) Cluster A Cluster B source sink Client GET block

  41. Riak CS Replication Cluster A Cluster B source sink Client GET block of manifest

  42. Riak CS Replication Cluster A Cluster B source sink Proxy GET Client GET block

  43. Riak CS Replication Cluster A Cluster B source sink Proxy GET Client GET block

  44. Riak CS Replication Cluster A Cluster B source sink Client GET block

  45. Riak CS Replication Cluster A Cluster B source sink Client GET block

  46. Riak Cloud Storage See Reid Draper’s Riak CS talk tomorrow!

  47. Cluster Manager For Easier Configuration • riak-repl clustername A • riak-repl connect 192.168.1.100 • riak-repl realtime enable B • riak-repl realtime start B

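The slide shows only the realtime pair; assuming the same pattern, the fullsync protocol is driven by the analogous commands from the Riak EE docs of the period:

    riak-repl fullsync enable B
    riak-repl fullsync start B
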
  48. Coming in 1.4 • Secure Sockets Layer • Network Address Translation • Proxied GET for Riak CS • Fine-tuned fullsync concurrency controls

  49. Coming in 1.4 • Better per-connection stats for RT and FS • Realtime chaining amongst multiple clusters • Technology preview of AAE Fullsync

  50. Realtime Chaining across Multiple Data Centers Cluster A Cluster B Cluster C

  51. Realtime Write Cluster A Cluster B Cluster C Client PUT

  52. Realtime Write Cluster A Cluster B Cluster C Client PUT

  53. Realtime Write Cluster A Cluster B Cluster C Client

  54. Realtime Write Cluster A Cluster B Cluster C Client

  55. Realtime Write Cluster A Cluster B Cluster C Client Oops!

  56. Realtime Chaining Cluster A Cluster B Cluster C Client

  57. Realtime Chaining Cluster A Cluster B Cluster C Client

  58. Realtime Chaining Cluster A Cluster B Cluster C Client

  59. Faster Fullsync • Current Keylist method is slow • Riak 1.3 has Active Anti-Entropy • Technology Preview: AAE Fullsync!

  60. Keylist Compare • Source and sink each create a keylist file • For all keys, write hash of key and object to file • Send the keylist file from sink to source • Source side compares its file to sink’s file

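A toy illustration of that compare (hypothetical code, not riak_repl's): each side builds a sorted list of {Key, Hash} pairs, and a single merge pass yields the keys needing repair; erlang:phash2/1 stands in for the real hash function:

    %% build a sorted keylist from {Key, Object} pairs
    keylist(Objects) ->
        lists:sort([{K, erlang:phash2({K, Obj})} || {K, Obj} <- Objects]).

    %% merge-compare two sorted keylists; returns keys that differ or exist
    %% on only one side
    diff(Src, Snk) -> diff(Src, Snk, []).

    diff([], Snk, Acc) -> lists:reverse(Acc, [K || {K, _} <- Snk]);
    diff(Src, [], Acc) -> lists:reverse(Acc, [K || {K, _} <- Src]);
    diff([{K, H} | Src], [{K, H} | Snk], Acc) -> diff(Src, Snk, Acc);         %% match
    diff([{K, _} | Src], [{K, _} | Snk], Acc) -> diff(Src, Snk, [K | Acc]);   %% differ
    diff([{K1, _} | _] = Src, [{K2, _} | Snk], Acc) when K1 > K2 ->
        diff(Src, Snk, [K2 | Acc]);                                           %% sink only
    diff([{K1, _} | Src], Snk, Acc) ->
        diff(Src, Snk, [K1 | Acc]).                                           %% source only

Building the two lists already touches every key on both sides, which is exactly the linear cost the next slide calls out.
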
  61. Keylist Compare • Each cluster has to fold over the entire key space • Time is linear with the number of keys, K • Network traffic is also linear with K • A fullsync with 0% updates still costs K. Ouch. • Discourages frequent synchronizing. Boo.

  62. Active Anti-Entropy • AAE maintains a hash tree • real-time updates • persistent • non-blocking

  63.–70. (Diagram slides: stepping through an AAE hash exchange between source and sink hash trees.)

  71. Fullsync AAE • Implement the AAE exchange over TCP/IP • Expect compare time to be linear with % differences • No additional read load from a fold :-)

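The reasoning behind that expectation (a gloss, not stated on the slides): an exchange only descends into subtrees whose hashes disagree, so for D differing keys in a tree over N segments with branching factor b, the compare cost is roughly O(D x log_b(N)) rather than the keylist's O(K), and a 0% update reduces to comparing a few root hashes.
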
  72. Fullsync Benchmarks: measure key compare time. Cluster A (source, 1M keys) Cluster B (sink, 1M − (%missing × 1M) keys)

  73. Keylist vs AAE (chart): key compare time in seconds plotted against % missing keys, for the Keylist and AAE strategies.

  74. What’s UP?

  75. What’s Up? • Fast local cloning of a data center • Per-bucket replication between multiple data centers • Support AAE fullsync over clusters of differing ring sizes • Replication of CRDTs across clusters • Strong consistency

  76. Thanks! ctilt@basho.com