An Overview of the Facebook Cache (Yannick Gingras)


PyCon Canada

August 13, 2013


  2. The Facebook Cache Infrastructure: Scaling in-memory data stores
    Yannick Gingras <yannick@fb.com>
    PyCon Canada – 2013-08-11
  3. What is Cache? Why do we cache?

  4. Reasons for caching
    ▪ Almost 1000:1 read/write ratio
    ▪ Extreme data dependency
  5. Memcache: a key value store

  6. Memcache
    Memcache is deployed as a demand-filled look-aside key-value cache.
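
A minimal sketch of what "demand-filled look-aside" means in practice. The class and the plain dictionaries are hypothetical stand-ins for illustration, not Facebook's client code: the application checks the cache itself, fills it only on a miss, and deletes the cached entry on a write (deleting rather than updating matches the "Delete from memcache" step shown on the remote markers slide).

```python
class LookAsideCache:
    """Demand-filled look-aside cache in front of a slower backing store."""

    def __init__(self, database):
        self.database = database  # assumption: a plain dict stands in for MySQL
        self.cache = {}           # assumption: a plain dict stands in for memcache

    def read(self, key):
        # The client looks in the cache first; only a miss reaches the database.
        if key in self.cache:
            return self.cache[key]
        value = self.database[key]
        self.cache[key] = value   # demand fill: populated only when someone asks
        return value

    def write(self, key, value):
        # The write goes to the database; the cached copy is deleted, not updated,
        # so the next read refills it from the source of truth.
        self.database[key] = value
        self.cache.pop(key, None)
```

Nothing sits between the client and the database here: the cache is "aside", and it is the client code that decides when to consult and fill it.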

  7. Memcache
    Data is shared across multiple servers using consistent hashing. Failed hosts are replaced with hot spares.
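
As an illustration only, consistent hashing with hot-spare replacement could look like the sketch below. The slot names, the MD5 hash, and the virtual-node count are assumptions made for the example. The sketch also assumes that a spare takes over the failed host's position on the ring, so no other key changes owner.

```python
import bisect
import hashlib


def _point(label):
    """Map a string to a position on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(label.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, slots, vnodes=100):
        # slots maps a stable slot name to the host currently serving it.
        self.slots = dict(slots)
        # Each slot owns several points on the ring to even out the load.
        self.ring = sorted(
            (_point("%s#%d" % (slot, i)), slot)
            for slot in self.slots
            for i in range(vnodes)
        )
        self.points = [point for point, _ in self.ring]

    def host_for(self, key):
        # First ring point clockwise from the key's hash, wrapping at the end.
        index = bisect.bisect(self.points, _point(key)) % len(self.ring)
        return self.slots[self.ring[index][1]]

    def replace_failed(self, slot, spare_host):
        # The hot spare inherits the slot; the ring itself is unchanged.
        self.slots[slot] = spare_host
```

With this layout, replacing a host moves only the keys that the failed host used to serve; the spare starts cold and is refilled on demand.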
  8. Speed of light is slow
    [Map figure, labeled "140 ms"]
    Note: not the actual data center presence map
  9. Our Memcache syncs via MySQL replication

  10. Memcache
    Some numbers
    ▪ Thousands of servers
    ▪ >1G ops/s
    ▪ >1T items
    ▪ 98.1% hit rate in “wildcard”
    ▪ ~90% hit rate in “regional”
    ▪ <50% hit rate in “pyk”
  11. Python for Memcache scaling

  12. Python for Memcache
    mcconf
    ▪ Short deployment cycle
    ▪ Pool management: allocation, resizing
    ▪ Spare selection based on hardware requirements
    ▪ Template-based region bootstrapping
    ▪ Cluster maintenance and decommission
  13. Python for Memcache
    Adaptive deployments: mcroll and mcpush
    ▪ Software upgrades
    ▪ Cold rolls / cache flushing
    ▪ Rates are adaptive based on health metrics
    ▪ Global parallelism logic
  14. The social graph: nodes and edges

  15. The graph data model

  16. TAO: Associations and Objects

  17. TAO
    TAO is a two-level read-through, write-through cache. TAO is aware of graph semantics and supports structured queries.
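
To make "two-level read-through, write-through" concrete, here is a small sketch. All names are hypothetical, and it deliberately omits graph-aware queries and the invalidation of other caches. Unlike a look-aside cache, the client only ever talks to the first tier, which forwards misses and writes upstream.

```python
class Database:
    """Assumption: an in-memory dict stands in for the persistent store."""

    def __init__(self):
        self.rows = {}

    def get(self, key):
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value


class CacheTier:
    """One cache level; `upstream` is the next tier or the database."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.cache = {}

    def get(self, key):
        # Read-through: on a miss, the tier itself fetches from upstream.
        if key not in self.cache:
            self.cache[key] = self.upstream.get(key)
        return self.cache[key]

    def put(self, key, value):
        # Write-through: the write is forwarded upstream first, then cached.
        self.upstream.put(key, value)
        self.cache[key] = value


# Two levels: clients use `first_tier`, which sits in front of `second_tier`.
database = Database()
second_tier = CacheTier(database)
first_tier = CacheTier(second_tier)
```

A write through `first_tier` lands in both tiers and in the database, so the writing client immediately reads back its own value.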
  18. TAO
    Some numbers
    ▪ Thousands of machines
    ▪ >1G ops/s
    ▪ 97.5% hit rate on followers
  19. Python for TAO

  20. TAO – a fun story

  21. TAO – a fun story

  22. Python for TAO
    Shard splitting / replication
    ▪ Extension of consistent hashing
    ▪ Based on client machine ID
    ▪ Wired with the invalidation pipeline
    Shard placement
    ▪ Two-level load distribution
    ▪ Hash table of hot shards mapped to cold servers
    ▪ Falls back to consistent hashing if shards are not placed
    ▪ Candidate shards and destinations are identified by Python services
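
The shard placement lookup described on this slide can be sketched roughly as follows. The function names and the data shapes are assumptions, and the fallback uses a simple modulo hash as a stand-in for the real consistent hashing scheme.

```python
import hashlib


def hashed_server(shard_id, servers):
    """Stand-in for consistent hashing: a plain modulo hash over the server list."""
    digest = hashlib.md5(str(shard_id).encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


def locate_shard(shard_id, placements, servers):
    """Two-level lookup: explicit placement table first, hashing second.

    placements: dict mapping hot shard IDs to the cold server chosen for them
                (hypothetically produced by the Python placement services).
    """
    server = placements.get(shard_id)
    if server is not None:
        return server
    # The shard was never explicitly placed: fall back to hashing.
    return hashed_server(shard_id, servers)
```

For example, with `placements = {42: "tao-cold-07"}`, shard 42 resolves to `tao-cold-07` while every other shard keeps its hashed location.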
  23. TAO – another story
    PHP Arrays
    “An array in PHP is actually an ordered map. A map is a type that associates values to keys.”
    – http://php.net/manual/en/language.types.array.php
    Python dictionaries
    “Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable type; strings and numbers can always be keys. […] It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique (within one dictionary).”
    – http://docs.python.org/2/tutorial/datastructures.html
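
The contrast between the two quotations can be seen in a few lines of Python. This is only an illustration of the data-structure difference, not a reconstruction of the story told in the talk. Note that the quoted tutorial is for Python 2; since Python 3.7 a plain `dict` preserves insertion order, but order still plays no part in dictionary equality, whereas `collections.OrderedDict` treats order as part of the value, which is closer to a PHP array.

```python
from collections import OrderedDict

pairs = [("first", 1), ("second", 2)]

# Same key/value pairs, inserted in opposite orders.
forward = OrderedDict(pairs)
backward = OrderedDict(reversed(pairs))

# An ordered map distinguishes the two; a plain dict does not.
ordered_maps_equal = forward == backward
plain_dicts_equal = dict(forward) == dict(backward)
```

Here `ordered_maps_equal` is `False` and `plain_dicts_equal` is `True`.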
  24. More Python in the Cache infrastructure
    ▪ FBAR – auto-remediation engine
    ▪ Tupperware – job engine used for invalidation pipeline
    ▪ thrift – language-agnostic service layer, enables many Python clients (http://thrift.apache.org/)
    ▪ Dataswarm – Python frontend to our data warehouse
    ▪ fbdeploy – job supervisor and BitTorrent deployment
  25. The Facebook Cache: wrapping up

  26. Further reading
    ▪ Memcache public page: https://www.facebook.com/MemcacheAtFacebook
    ▪ Memcache paper: http://bit.ly/fb-memcache-paper
    ▪ TAO public note: http://bit.ly/tao-blog-post
    ▪ TAO paper: http://bit.ly/fb-tao-paper
  27. (c) 2009 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved. 1.0
  28. Memcache: advanced topics

  29. Memcache
    Thundering herd problem – leases
    [Diagram: three web servers, Memcache, and the database]
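
The slide gives only the name of the technique. A rough sketch of the lease idea, loosely following the description in the Memcache paper listed under further reading, is shown below. The class, method names, and return conventions are hypothetical, and lease expiry is left out. On a miss, only the first client receives a token and goes to the database; the others are told to retry, which keeps a popular key from stampeding the database.

```python
import itertools


class LeasingCache:
    def __init__(self):
        self.data = {}
        self.leases = {}                  # key -> outstanding lease token
        self._tokens = itertools.count(1)

    def get(self, key):
        """Return (value, token); a token means 'you refill this key'."""
        if key in self.data:
            return self.data[key], None
        if key in self.leases:
            # Another client is already refilling: retry shortly.
            return None, None
        token = next(self._tokens)
        self.leases[key] = token
        return None, token

    def set(self, key, value, token):
        """Store the value only if the caller still holds the lease."""
        if token is None or self.leases.get(key) != token:
            return False
        del self.leases[key]
        self.data[key] = value
        return True

    def delete(self, key):
        # Invalidation also revokes the lease, so a slow refill
        # cannot write back a stale value afterwards.
        self.data.pop(key, None)
        self.leases.pop(key, None)
```

A client that gets `(None, None)` back waits briefly and calls `get` again; by then the lease holder has usually filled the key.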
  30. Memcache
    Read-after-write semantics – remote markers
    [Diagram: web server, Memcache, master DB, replica DB]
    1. Set remote marker
    2. Write to master
    3. Delete from memcache
    4. MySQL replication
    5. Delete remote marker
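
The five numbered steps on this slide can be walked through with a toy model. Everything here is a hypothetical stand-in (dicts for the databases and memcache, a list for the replication stream). The read path, which goes to the master while a marker is set, follows the Memcache paper rather than the slide itself; not filling the cache from that read is an assumption of this sketch.

```python
class ReplicaRegion:
    """Toy model of a non-master region using remote markers."""

    def __init__(self):
        self.master_db = {}        # lives in the master region
        self.replica_db = {}       # local, lags behind the master
        self.memcache = {}
        self.remote_markers = set()
        self.replication_log = []  # writes not yet applied locally

    def write(self, key, value):
        self.remote_markers.add(key)               # 1. set remote marker
        self.master_db[key] = value                # 2. write to master
        self.replication_log.append((key, value))
        self.memcache.pop(key, None)               # 3. delete from memcache

    def replicate(self):
        for key, value in self.replication_log:    # 4. MySQL replication
            self.replica_db[key] = value
            self.remote_markers.discard(key)       # 5. delete remote marker
        self.replication_log = []

    def read(self, key):
        if key in self.memcache:
            return self.memcache[key]
        if key in self.remote_markers:
            # The local replica may be stale: pay the cross-region latency.
            return self.master_db[key]
        value = self.replica_db[key]
        self.memcache[key] = value
        return value
```

Between steps 3 and 5 a reader in the same region sees its own write, at the cost of a slower read from the master.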
  31. Memcache
    Aggregated deletes
    ▪ Reduce packet rate by 18x
    [Diagram: databases with Aqueduct, Memcache routers, and MC servers]
  32. TAO: advanced topics

  33. Other TAO topics
    ▪ TACO
    ▪ CCW via failover
    ▪ Two-level cache provides read-after-write semantics
    ▪ Two-level cache shields against thundering herd