D2-I4 Dan Reif - Massively Distributed Backups at Facebook Scale

Ignite Talk

MySQL is at the core of Facebook’s persistent storage. The graph itself, including every like, comment, post and status, is stored in MySQL, along with many other things. This data is the company’s most important asset, and we take great care to make sure everything is properly backed up. Yes, even the lolcats and puppy picture posts. Everything!

As you can imagine, backing up this behemoth of a dataset is quite a challenge. The backup system Facebook runs for MySQL is multi-tiered and massively distributed. We employ binary log, full, and differential backups and clever hacks to balance speed, space and reliability.

In this talk you’ll learn how we back up Facebook, every single day. We’ll go over the design, engineering and operational challenges we’ve had to overcome, and wrap up with some fun war stories.

DevOps Relevance: At the core of DevOps is monitoring and orchestration. The talk is not built as a polished set-piece, but instead as a series of improvements (and snags we hit along the way). The overall theme is one of managing complexity via code rather than with more humans.

DevOpsDays Zurich

May 09, 2017

Transcript

  1. Graph API example: graph.facebook.com/me returns { "id": "59520506051", "name": "Dan Reif" }

    "ID"!? The fbid is a 64-bit integer.
    map(fbid) ⊢ shard ID
    Art based on https://www.flickr.com/photos/vanf/6548427219 (CC-BY)
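
The map(fbid) ⊢ shard ID step on this slide can be illustrated with a minimal sketch. The slide does not say how the mapping is computed, so the modulo scheme, the shard count and the name `shard_for_fbid` are all assumptions made for illustration, not Facebook's actual method.

```python
# Hypothetical shard count; the talk does not state how many shards exist.
NUM_SHARDS = 4096


def shard_for_fbid(fbid: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a 64-bit fbid to a shard ID.

    Assumption: a plain modulo stands in for the real, unspecified mapping.
    """
    if not 0 <= fbid < 2**64:
        raise ValueError("fbid must fit in an unsigned 64-bit integer")
    return fbid % num_shards


# The fbid from the slide always lands on the same shard.
shard = shard_for_fbid(59520506051)
print(f"fbid 59520506051 lives on shard {shard}")
```

The point of the sketch is only that the mapping is deterministic: given an fbid, any tier of the system can work out which shard (and therefore which backup) holds the object.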
  3. Replica Set

    [Diagram: three server instances, labeled Master and Slaves, each holding shard #1, shard #2, shard #3 and shard #4]
  4. 100

  5. We use mysqldump. Why?

                                 Logical     Physical
    External Tools               Yes         No
    Size                         Small       Large
    Single Table Restore         Easy        Difficult
    Debug Corruption             Easy        Difficult
    Compressibility              Excellent   Meh
    Backup / Restore Duration    Long        Short
  6. Differential Backup

    [Chart: % of space taken by differential backups, Day 1 to Day 4]
    [Chart: Relative backup space usage, Day 1 to Day 4, Full Backup vs. Differential Backup]

    Diff Format:
    Inserted/Updated Rows: INSERT INTO t1 VALUES ( (2, ‘Santa Clara’), (400, ‘Los Angeles’),
    Deleted/Replaced Rows: INSERT INTO t1 VALUES ( (2, ‘OakLand’), (3, ‘Menlo Park’),
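
To make the diff format on this slide concrete, here is a minimal sketch of how the two row sets could be derived by comparing the last full backup with the current table contents. Tables are modeled as plain dictionaries keyed by primary key; the function name `differential` and the unchanged row 7 are hypothetical, and the real tooling works on mysqldump output rather than in-memory dictionaries.

```python
from typing import Dict, Tuple

Table = Dict[int, str]  # primary key -> row value (simplified to one column)


def differential(full: Table, current: Table) -> Tuple[Table, Table]:
    """Return (inserted_or_updated, deleted_or_replaced) relative to `full`."""
    # Rows that are new, or whose value changed since the full backup.
    inserted_or_updated = {
        pk: value
        for pk, value in current.items()
        if pk not in full or full[pk] != value
    }
    # Old versions of rows that were deleted or overwritten.
    deleted_or_replaced = {
        pk: value
        for pk, value in full.items()
        if pk not in current or current[pk] != value
    }
    return inserted_or_updated, deleted_or_replaced


# Rows 2, 3 and 400 mirror the slide; row 7 is a made-up unchanged row.
full_backup = {2: "OakLand", 3: "Menlo Park", 7: "Palo Alto"}
today = {2: "Santa Clara", 7: "Palo Alto", 400: "Los Angeles"}

written, removed = differential(full_backup, today)
print("Inserted/Updated Rows:", written)
print("Deleted/Replaced Rows:", removed)
```

Unchanged rows appear in neither set, which is why the differential backups on the chart take only a few percent of the space of a full backup.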
  7. What, When, Where

    • Dump Everything, Every Day
    • Backup Binary Logs
    • Point in time
    • GTID
    • Multiple stages
    • HDFS
    • Offsite

    Backup Schedule: Full 1, Diff 2, Diff 3, Diff 4, Full 5, Diff 6, Diff 7, Diff 8
  8. Previous Generation Backup System

    [Diagram: FRC, with FRC1 and FRC3, and FRC1:01, FRC1:02, FRC3:01 and FRC3:02; paired DB and HDFS labels show 85%, 50%, 25% and 75%]
  9. Breaking Assumptions

    • Network too slow?
    • Fastest hardware only in one datacenter?
    • Scheduling is hard?
    • Shouldn’t back up from slaves?
  11. • Hash shards into 1000 buckets
    • Allocate buckets to HDFS clusters, proportional to size

    [Diagram: five HDFS clusters (HDFS 1 to HDFS 5) with sizes 1 PB, 3 PB, 2 PB, 2.5 and 1.5, and bucket ranges 1-100, 101-400, 401-600, 601-850 and 851-1000; example buckets shown: 20, 30, 200, 400, 500, 650, 700, 800, 900]
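
The two steps on this slide can be sketched as follows. The hash function (SHA-256), the function names and the assignment of sizes to cluster names are assumptions for illustration; the slide only gives the bucket count, the cluster sizes and the resulting ranges.

```python
import hashlib
from typing import Dict, Tuple

NUM_BUCKETS = 1000  # from the slide


def bucket_for_shard(shard_id: int) -> int:
    """Hash a shard ID into a bucket numbered 1..NUM_BUCKETS (assumed hash)."""
    digest = hashlib.sha256(str(shard_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS + 1


def allocate_buckets(sizes_pb: Dict[str, float]) -> Dict[str, Tuple[int, int]]:
    """Give each cluster a contiguous bucket range proportional to its size."""
    total = sum(sizes_pb.values())
    ranges = {}
    start, cumulative = 1, 0.0
    for name, size in sizes_pb.items():
        cumulative += size
        end = round(NUM_BUCKETS * cumulative / total)
        ranges[name] = (start, end)
        start = end + 1
    return ranges


def cluster_for_bucket(bucket: int, ranges: Dict[str, Tuple[int, int]]) -> str:
    """Look up which cluster owns a bucket."""
    for name, (low, high) in ranges.items():
        if low <= bucket <= high:
            return name
    raise ValueError(f"bucket {bucket} is outside 1..{NUM_BUCKETS}")


# Sizes from the slide; which cluster has which size is an assumption.
sizes = {"HDFS 1": 1.0, "HDFS 2": 3.0, "HDFS 3": 2.0, "HDFS 4": 2.5, "HDFS 5": 1.5}
ranges = allocate_buckets(sizes)
for name, (low, high) in ranges.items():
    print(f"{name}: buckets {low}-{high}")
```

Because shards map to buckets and only buckets map to clusters, rebalancing after a capacity change means moving bucket boundaries rather than re-planning every shard individually.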
  12.
              Δ     priority
    C, ɣ      0     0
    A, α      0     1
    B, α      10    1
    B, ɣ      15    0
    A, ɣ      20    0
    C, α      20    1
  13. The fuutuuure: Row-Based Binary Logs

    • Records previous row value
    • Binary Logs + Binary Logs => Diff
    • Full + Diff => Full
    • In theory, run full backup only once!
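
The idea on this slide can be sketched in a few lines. Because row-based binary logs record the previous row value, a run of log events can be folded into one diff, and that diff can be applied to the last full backup to produce a new one. The event tuple layout, the function names and the intermediate value 'San Jose' are hypothetical; real binary log events carry far more detail.

```python
from typing import Dict, List, Optional, Tuple

Table = Dict[int, str]
# (primary key, value before, value after); None means the row did not exist.
Event = Tuple[int, Optional[str], Optional[str]]


def events_to_diff(events: List[Event]) -> Tuple[Table, Table]:
    """Binary Logs + Binary Logs => Diff: fold events into (removed, written)."""
    first_before: Dict[int, Optional[str]] = {}
    last_after: Dict[int, Optional[str]] = {}
    for pk, before, after in events:
        first_before.setdefault(pk, before)  # keep only the oldest "before"
        last_after[pk] = after               # keep only the newest "after"
    removed: Table = {}
    written: Table = {}
    for pk, old in first_before.items():
        new = last_after[pk]
        if old == new:
            continue  # net effect is no change
        if old is not None:
            removed[pk] = old
        if new is not None:
            written[pk] = new
    return removed, written


def apply_diff(full: Table, removed: Table, written: Table) -> Table:
    """Full + Diff => Full: drop removed rows, then add the written ones."""
    new_full = {pk: v for pk, v in full.items() if pk not in removed}
    new_full.update(written)
    return new_full


full_backup = {2: "OakLand", 3: "Menlo Park"}
events = [
    (2, "OakLand", "San Jose"),      # update
    (2, "San Jose", "Santa Clara"),  # second update to the same row
    (400, None, "Los Angeles"),      # insert
    (3, "Menlo Park", None),         # delete
]
removed, written = events_to_diff(events)
print(apply_diff(full_backup, removed, written))
```

Under these assumptions the full dump never has to be taken again: each new "full" is synthesized from the previous one plus the logs.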
  14. Dan Reif ([email protected])
    P.S. We’re hiring! London, Dublin, California, NYC...

    Image: https://www.flickr.com/photos/seanandlauren/4659827304, CC-BY