Papers_We_Love
July 27, 2016

# PWL Mini w/ Wes Chow on Tiered Replication: A Cost-effective Alternative to Full Cluster Geo-replication

Tiered Replication, by Cidon et al., revisits the data replication problem first introduced in the Copysets paper, which won the Best Student Paper award at USENIX ATC 2013. Copysets introduced a randomized algorithm for the NP-hard problem of placing data in a distributed filesystem under redundancy and load-balancing constraints. Tiered Replication proposes a greedy algorithm for the same problem and adds the ability to bake in real-world constraints such as rack awareness. Wes will summarize the problem Copysets posed, show Tiered Replication’s solution, and examine a real-world deployment of the algorithm at Chartbeat.

## Transcript

1. ### Copysets + Tiered Replication - Cidon et al.

Wes Chow / CTO, Chartbeat / @weschow
2. ### Data Replication

• Want to store 2 copies of data for redundancy and performance reasons.
• Which nodes do we use?
• Random replication: pick 2 nodes completely at random, or deterministically (pseudo-randomly).
3. ### Random Assignment

{A B} {A C} {A D} {A E} {A F} {B C} …

• N = 6, R = 2
• 6 choose 2 = 15 combinations.
• Failure of any 2 nodes results in data loss.
• 1/15 of the data on each set.

5. ### Facebook (Riak) Replication

{A B} {B C} {C D} {D E} {E F} {F A}

• ⅙ of the data on each set.
• Random two nodes fail, p(loss) = 6/15 = 40%
6. ### Simple Assignment

{A B} {C D} {E F}

• p(loss) = 20%
• ⅓ of the data on each set
7. ### Terminology

• N = number of nodes
• R = replication factor (# of copies of data)
• S = scatter width

What is scatter width?
8. ### S = 4

{A B C} {B C D} {C D E} {D E F} {E F A} {F A B}

• A restores from B, C, E, F
• Each set = 17% of data
• P(F) = 6 / (6 choose 3) = 6/20 = 30%
9. ### S = 2

{A B C} {D E F}

• A restores from B, C
• Each set = 50% of data
• P(F) = 2 / (6 choose 3) = 2/20 = 10%
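
The two R = 3 layouts on slides 8 and 9 can be checked with a short script. This is a minimal sketch under two assumptions: that a node's scatter width is the number of distinct other nodes it shares a copyset with, and that the failure probability on the slides means the chance that R = 3 simultaneous, uniformly random node failures take out one entire copyset. The function names `scatter_width` and `p_loss` are illustrative, not from the talk.

```python
from math import comb

def scatter_width(node, copysets):
    """Number of distinct other nodes that share a copyset with `node`."""
    peers = set()
    for cs in copysets:
        if node in cs:
            peers.update(cs)
    peers.discard(node)
    return len(peers)

def p_loss(copysets, n, r):
    """Chance that r random failures among n nodes cover a whole copyset.

    Assumes every copyset has exactly r members, so data is lost only
    when the failed set equals one of the copysets.
    """
    distinct = {frozenset(cs) for cs in copysets}
    return len(distinct) / comb(n, r)

nodes = "ABCDEF"
# Slide 8: sliding window of three consecutive nodes, wrapping around.
ring = [{nodes[i], nodes[(i + 1) % 6], nodes[(i + 2) % 6]} for i in range(6)]
# Slide 9: two disjoint groups of three.
groups = [set("ABC"), set("DEF")]

for name, layout in (("S = 4", ring), ("S = 2", groups)):
    print(name, scatter_width("A", layout), p_loss(layout, 6, 3))
```

Node A ends up with 4 peers in the sliding-window layout and 2 in the disjoint one, and the loss probabilities come out as 6/20 and 2/20, matching the slides.
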
10. ### Random Assignment

{A B} {A C} {A D} {A E} {A F} {B C} …

• p(loss) = 100%
• 1/15 of the data on each set
• Scatter width = 5
• E(loss) = 100% * 1/15 = 6.7%
11. ### Facebook (Riak) Replication

{A B} {B C} {C D} {D E} {E F} {F A}

• p(loss) = 40%
• ⅙ of the data on each set
• Scatter width = 2
• E(loss) = 40% * ⅙ = 6.7%
12. ### Simple Assignment

{A B} {C D} {E F}

• p(loss) = 20%
• ⅓ of the data on each set
• Scatter width = 1
• E(loss) = 20% * ⅓ = 6.7%
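
Slides 10 to 12 arrive at the same expected loss for all three R = 2 schemes. The sketch below recomputes those numbers by brute force, assuming data is spread evenly across copysets and that exactly two of the six nodes fail, chosen uniformly at random. The scheme labels and the `loss_stats` helper are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations

NODES = "ABCDEF"

schemes = {
    # Slide 10: every pair of nodes is a copyset.
    "random": [frozenset(p) for p in combinations(NODES, 2)],
    # Slide 11: each node pairs with its successor, wrapping around.
    "ring": [frozenset((NODES[i], NODES[(i + 1) % 6])) for i in range(6)],
    # Slide 12: three disjoint pairs.
    "simple": [frozenset("AB"), frozenset("CD"), frozenset("EF")],
}

def loss_stats(copysets, nodes=NODES, r=2):
    """Return (p(loss), share of data per copyset, expected fraction lost)."""
    known = set(copysets)
    failures = [frozenset(f) for f in combinations(nodes, r)]
    hits = sum(1 for f in failures if f in known)
    p_loss = Fraction(hits, len(failures))
    share = Fraction(1, len(known))
    return p_loss, share, p_loss * share

for name, copysets in schemes.items():
    p_loss, share, expected = loss_stats(copysets)
    print(f"{name}: p(loss)={p_loss} share={share} E(loss)={expected}")
```

Each scheme yields E(loss) = 1/15, about 6.7%. The layout changes how often loss happens and how much is lost per event, but not the average.
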
13. ### The Importance of S

• Affects p(loss).
• Affects speed of restoring single node.
• Low S = low p(loss), high damage, slow restore
• High S = high p(loss), low damage, fast restore
14. ### The Fixed Cost of Failure

• Admitting failure on Twitter has high fixed cost. Failing for 50% of customers not much worse than 5%.
• Going to tape has high fixed cost. Restoring 1 TB not much worse than restoring 1 GB.

16. ### Tiered Replication

To construct a copyset:

   1. Order nodes from smallest to largest scatter width.
   2. Pick first R nodes.

Repeat until all nodes have SW >= S.
17. ### TR With Constraints

To construct a copyset:

   1. Order nodes from smallest to largest scatter width.
   2. Pick first R nodes satisfying constraints.

Repeat until all nodes have SW >= S.
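
The greedy loop from the two slides above can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact algorithm and not the `trepl` implementation. The `allowed` hook, the `build_copysets` name, and the rack layout are hypothetical. The sketch also adds one thing the slides do not spell out: it prefers nodes that have not yet shared a copyset with the members already chosen. Without that, re-sorting could keep producing the same copyset and the loop would never finish.

```python
from itertools import combinations

def build_copysets(nodes, r, s, allowed=lambda copyset: True):
    """Greedy copyset construction; `allowed` is a hypothetical constraint hook."""
    peers = {n: set() for n in nodes}  # scatter width of n is len(peers[n])
    copysets = []
    while min(len(p) for p in peers.values()) < s:
        # Step 1: order nodes from smallest to largest scatter width.
        order = sorted(nodes, key=lambda n: len(peers[n]))
        copyset = [order[0]]
        # Step 2: pick the first R nodes that satisfy the constraints.
        # Assumption: the first pass only takes nodes new to every member,
        # the second pass fills any remaining slots in scatter-width order.
        for require_new in (True, False):
            for cand in order[1:]:
                if len(copyset) == r:
                    break
                if cand in copyset or not allowed(copyset + [cand]):
                    continue
                if require_new and any(cand in peers[m] for m in copyset):
                    continue
                copyset.append(cand)
        gained = any(b not in peers[a] for a, b in combinations(copyset, 2))
        if len(copyset) < r or not gained:
            raise ValueError("target scatter width unreachable under these constraints")
        for a, b in combinations(copyset, 2):
            peers[a].add(b)
            peers[b].add(a)
        copysets.append(tuple(copyset))
    return copysets

# Unconstrained: 6 nodes, R = 3, S = 4.
plain = build_copysets("ABCDEF", r=3, s=4)

# Rack-aware: hypothetical layout with two nodes per rack, R = 2, S = 2.
rack = {"a1": "a", "a2": "a", "b1": "b", "b2": "b", "c1": "c", "c2": "c"}

def one_per_rack(copyset):
    """Constraint: no two replicas on the same rack."""
    racks = [rack[n] for n in copyset]
    return len(racks) == len(set(racks))

rack_aware = build_copysets(list(rack), r=2, s=2, allowed=one_per_rack)
print(plain)
print(rack_aware)
```

Because constraints are just a predicate over a candidate copyset, rack awareness, node resource classes, or storage tiers can all be expressed without changing the loop. If the constraints make the target scatter width unreachable, the sketch raises an error rather than looping forever.
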
18. ### Possible Constraints

• Rack awareness.
• Resource differences in nodes.
• Tiered storage. What is that?

20. ### Apache Kafka

• High throughput message broker.
• Topics broken into K partitions.
• Each partition handled by primary/secondaries.
• Classic master/slave replication.
• Consumers subscribe to subset of partitions.
• Trepl (https://pypi.python.org/pypi/trepl)
21. ### Chartbeat Pings

• Browser sends beacon to our servers.
• 275,000 / sec into “pings” topic.
• “pings” topic broken into 144 partitions.
• 6 brokers.
• R = 2 (cost reduction from 3)
• AZ aware assignment
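
To make the link between copysets and Kafka concrete, here is a rough sketch of how a list of copysets could be turned into replica assignments for the 144 partitions. It is not Chartbeat's actual assignment and does not use `trepl`. The broker IDs, the two-AZ split, the copysets themselves, and the `reassignment_plan` helper are all assumptions made for illustration. The output follows the JSON shape that Kafka's partition reassignment tool reads (`version`, then a list of `topic` / `partition` / `replicas` entries).

```python
import json

def reassignment_plan(topic, num_partitions, copysets):
    """Spread partitions round-robin over copysets, rotating the first replica."""
    partitions = []
    for p in range(num_partitions):
        copyset = copysets[p % len(copysets)]
        # Rotate so each member takes a turn as the preferred leader.
        turn = (p // len(copysets)) % len(copyset)
        replicas = list(copyset[turn:]) + list(copyset[:turn])
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}

# Hypothetical: brokers 1-3 in one AZ, brokers 4-6 in another,
# so every copyset below spans both AZs.
copysets = [(1, 4), (2, 5), (3, 6)]

plan = reassignment_plan("pings", 144, copysets)
print(json.dumps(plan["partitions"][:4], indent=2))
```

With these made-up copysets every broker holds 48 of the 144 partitions and is first replica for 24 of them. A layout with a larger scatter width would simply pass in more copysets.
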
22. ### Notes

• Load balancing.
• Copysets is NP-Hard in general.
• Combinatorial design literature.
• Tradeoffs. Embrace or reduce catastrophe?

Copysets: https://www.usenix.org/conference/atc13/technical-sessions/presentation/cidon
TR: https://www.usenix.org/system/files/conference/atc15/atc15-paper-cidon.pdf