An Overview of the Facebook Cache (Yannick Gingras)

The Facebook Cache Infrastructure: Scaling in-memory data stores
Yannick Gingras
PyCon Canada – 2013-08-11

What is Cache? Why do we cache?

Reasons for caching
▪ Almost 1000:1 read/write ratio
▪ Extreme data dependency

Memcache: a key value store

Memcache
Memcache is deployed as a demand-filled look-aside key-value cache.
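A minimal sketch of what "demand-filled look-aside" means in practice: the client checks the cache first, falls back to the database on a miss, and fills the cache itself. The class and method names are hypothetical, and plain dicts stand in for both the memcache pool and the database. Writes delete the cached entry rather than updating it.

```python
class LookAsideCache:
    """Toy look-aside cache: the client, not the cache, talks to the database."""

    def __init__(self, database):
        self.database = database  # stand-in for the backing store (a dict)
        self.cache = {}           # stand-in for a memcache pool
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Miss: the client reads the database and fills the cache on demand.
        self.misses += 1
        value = self.database[key]
        self.cache[key] = value
        return value

    def write(self, key, value):
        self.database[key] = value
        # Invalidate instead of updating; the next read refills the entry.
        self.cache.pop(key, None)
```

The hit and miss counters are only there to make the behaviour observable; a hit rate like the ones quoted in this deck is hits / (hits + misses).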

Memcache
Data is shared across multiple servers using consistent hashing. Failed hosts are replaced with hot spares.
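A minimal sketch of consistent hashing with hot-spare replacement, under the assumption that a spare simply takes over the ring positions of the failed host. The slot indirection, host names, and the choice of MD5 are illustrative and not taken from the talk.

```python
import bisect
import hashlib


def _point(label):
    """Map a string to a position on the ring."""
    return int(hashlib.md5(label.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, hosts, points_per_slot=100):
        # Each logical slot owns many ring points and is served by one host.
        self.slot_host = dict(enumerate(hosts))
        points = []
        for slot in self.slot_host:
            for i in range(points_per_slot):
                points.append((_point("%d:%d" % (slot, i)), slot))
        points.sort()
        self._points = [p for p, _ in points]
        self._slots = [s for _, s in points]

    def host_for(self, key):
        # First ring point clockwise from the key's position, wrapping around.
        idx = bisect.bisect(self._points, _point(key)) % len(self._points)
        return self.slot_host[self._slots[idx]]

    def replace_failed(self, failed_host, spare_host):
        # The spare inherits the failed host's slots, so no other key moves.
        for slot, host in list(self.slot_host.items()):
            if host == failed_host:
                self.slot_host[slot] = spare_host
```

With this layout, swapping in a spare only affects keys that lived on the failed host; every other key keeps its server.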

Speed of light is slow
[Figure: world map labeled 140 ms. Note: not the actual data center presence map.]

Our Memcache syncs via MySQL replication

Memcache
Some numbers
▪ Thousands of servers
▪ >1G ops/s
▪ >1T items
▪ 98.1% hit rate in “wildcard”
▪ ~90% hit rate in “regional”
▪ <50% hit rate in “pyk”

Python for Memcache scaling

Python for Memcache
mcconf
▪ Short deployment cycle
▪ Pool management: allocation, resizing
▪ Spare selection based on hardware requirements
▪ Template-based region bootstrapping
▪ Cluster maintenance and decommission

Python for Memcache
Adaptive deployments: mcroll and mcpush
▪ Software upgrades
▪ Cold rolls / cache flushing
▪ Rates are adaptive based on health metrics
▪ Global parallelism logic

The social graph: nodes and edges

The graph data model

TAO: Associations and Objects

TAO
TAO is a two-level read-through, write-through cache. TAO is aware of graph semantics and supports structured queries.
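A minimal sketch of a two-level read-through, write-through cache. Unlike a look-aside cache, the client only talks to the top tier, and each tier forwards misses and writes to the level below it. The class names and the "leader"/"follower" variable names in the usage note are assumptions for illustration; the sketch ignores graph semantics and structured queries entirely.

```python
class Store:
    """Bottom of the stack: a dict standing in for the database."""

    def __init__(self):
        self.data = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value


class CacheTier:
    """Read-through, write-through cache in front of `backend`."""

    def __init__(self, backend):
        self.backend = backend  # another CacheTier or the Store
        self.cache = {}

    def get(self, key):
        if key not in self.cache:
            # Read-through: this tier fetches from the level below on a miss.
            self.cache[key] = self.backend.get(key)
        return self.cache[key]

    def put(self, key, value):
        # Write-through: push the write down first, then update this tier.
        self.backend.put(key, value)
        self.cache[key] = value
```

Stacking two tiers, for example `leader = CacheTier(Store())` and `follower = CacheTier(leader)`, gives the two-level shape: a second follower that misses is served by the shared leader without touching the store.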

TAO
Some numbers
▪ Thousands of machines
▪ >1G ops/s
▪ 97.5% hit rate on followers

Python for TAO

TAO – a fun story

Python for TAO
Shard splitting / replication
▪ Extension of consistent hashing
▪ Based on client machine ID
▪ Wired with the invalidation pipeline
Shard placement
▪ Two-level load distribution
▪ Hash table of hot shards mapped to cold servers
▪ Falls back to consistent hashing if shards are not placed
▪ Candidate shards and destinations are identified by Python services
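A minimal sketch of the shard placement lookup described on this slide: an explicit table of placed (hot) shards is consulted first, and anything not in the table falls back to consistent hashing. The class, method names, and server names are hypothetical, and how candidate shards and destinations get chosen is out of scope here.

```python
import bisect
import hashlib


def _h(label):
    """Map a string to a position on the ring."""
    return int(hashlib.md5(label.encode("utf-8")).hexdigest(), 16)


class ShardRouter:
    def __init__(self, servers, points_per_server=50):
        ring = sorted(
            (_h("%s#%d" % (server, i)), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._points = [p for p, _ in ring]
        self._owners = [s for _, s in ring]
        self.placed = {}  # hot shard id -> explicitly chosen (cold) server

    def _hashed(self, shard_id):
        # Default level: consistent hashing over the server ring.
        idx = bisect.bisect(self._points, _h("shard:%d" % shard_id))
        return self._owners[idx % len(self._points)]

    def place(self, shard_id, server):
        # In the talk this decision comes from Python services; here it is manual.
        self.placed[shard_id] = server

    def server_for(self, shard_id):
        # Placement table first, consistent hashing as the fallback.
        if shard_id in self.placed:
            return self.placed[shard_id]
        return self._hashed(shard_id)
```

Keeping the table small (hot shards only) means the common case still needs no coordination beyond the ring itself.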

TAO – another story
PHP arrays: “An array in PHP is actually an ordered map. A map is a type that associates values to keys.” – http://php.net/manual/en/language.types.array.php
Python dictionaries: “Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable type; strings and numbers can always be keys. […] It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique (within one dictionary).” – http://docs.python.org/2/tutorial/datastructures.html
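The two quotes contrast an ordered map with an unordered one. The transcript does not spell out the story itself, so the following is only a sketch of the general pitfall, assuming the concern is key order surviving a hand-off to Python. The payload is made up. Note that plain `dict` preserves insertion order since Python 3.7, but the Python 2 documentation quoted here made no such promise, and `OrderedDict` remains the explicit way to say that order matters.

```python
import json
from collections import OrderedDict

# Hypothetical payload; the key order is meaningful to the producer.
payload = '{"zeta": 1, "alpha": 2, "mid": 3}'

# object_pairs_hook makes the decoder build an OrderedDict from the
# key/value pairs in document order.
ordered = json.loads(payload, object_pairs_hook=OrderedDict)
key_order = list(ordered)

# Plain dicts compare equal regardless of order; OrderedDicts do not.
a = OrderedDict([("x", 1), ("y", 2)])
b = OrderedDict([("y", 2), ("x", 1)])
same_as_dicts = dict(a) == dict(b)
same_as_ordered = a == b
```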

More Python in the Cache infrastructure
▪ FBAR – auto-remediation engine
▪ Tupperware – job engine used for invalidation pipeline
▪ thrift – language-agnostic service layer, enables many Python clients http://thrift.apache.org/
▪ Dataswarm – Python frontend to our data warehouse
▪ fbdeploy – job supervisor and BitTorrent deployment

The Facebook Cache: wrapping up

Further reading
▪ Memcache public page: https://www.facebook.com/MemcacheAtFacebook
▪ Memcache paper: http://bit.ly/fb-memcache-paper
▪ TAO public note: http://bit.ly/tao-blog-post
▪ TAO paper: http://bit.ly/fb-tao-paper

Memcache: advanced topics

Memcache
Thundering herd problem – leases
[Diagram: Database, Memcache, and three Web Servers]
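A minimal sketch of how a lease could tame a thundering herd: on a miss, only one client receives a token and is allowed to refill the key, while the others are told to retry instead of all hitting the database. The class, the method names, and the retry policy are assumptions for illustration, not the memcache API.

```python
import itertools


class LeasingCache:
    def __init__(self):
        self.data = {}
        self.leases = {}  # key -> outstanding lease token
        self._tokens = itertools.count(1)

    def get(self, key):
        """Return (value, token); a non-None token means 'you fill the cache'."""
        if key in self.data:
            return self.data[key], None
        if key in self.leases:
            # Someone else is already refilling this key; caller should retry.
            return None, None
        token = next(self._tokens)
        self.leases[key] = token
        return None, token

    def set_with_lease(self, key, value, token):
        if token is None or self.leases.get(key) != token:
            return False  # lease was invalidated or never granted
        del self.leases[key]
        self.data[key] = value
        return True

    def delete(self, key):
        self.data.pop(key, None)
        # A pending fill would now be stale, so its lease is revoked too.
        self.leases.pop(key, None)
```

Revoking the lease on delete also keeps a slow client from writing a stale value back after an invalidation.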

Memcache
Read-after-write semantics – remote markers
[Diagram: Web Server, Memcache, Replica DB, Master DB]
1. Set remote marker
2. Write to master
3. Delete from memcache
4. MySQL replication
5. Delete remote marker
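A minimal sketch of the five numbered steps on this slide, under the assumption that a reader who misses in the cache consults the marker to decide between the local replica and the master. The class and its in-memory dicts are hypothetical stand-ins, and replication is triggered by hand so that the lag is visible.

```python
class Region:
    def __init__(self):
        self.master = {}    # master DB (in another region)
        self.replica = {}   # local replica, lags behind the master
        self.cache = {}     # local memcache
        self.markers = set()
        self._pending = []  # replication stream not yet applied locally

    def write(self, key, value):
        self.markers.add(key)               # 1. set remote marker
        self.master[key] = value            # 2. write to master
        self.cache.pop(key, None)           # 3. delete from memcache
        self._pending.append((key, value))  # queued for step 4

    def replicate(self):
        for key, value in self._pending:
            self.replica[key] = value       # 4. replication catches up
            self.markers.discard(key)       # 5. delete remote marker
        self._pending = []

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        # On a miss, a marker means the replica may be stale: ask the master.
        source = self.master if key in self.markers else self.replica
        value = source.get(key)
        self.cache[key] = value
        return value
```

The marker trades some extra cross-region reads for the guarantee that a reader does not see the pre-write value from a lagging replica.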

Memcache
Aggregated deletes
▪ Reduce packet rate by 18x.
[Diagram: three Aqueduct/DB pairs, four groups of Memcache Routers, and MC servers]
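A minimal sketch of the idea behind aggregated deletes, assuming the saving comes from batching many invalidations bound for the same destination into one packet. The function name, the (destination, key) tuple shape, and the batch size are hypothetical; the 18x figure on the slide is a measured result and is not derived from this code.

```python
from collections import defaultdict


def batch_deletes(deletes, batch_size):
    """Group (destination, key) deletes into per-destination packets."""
    by_destination = defaultdict(list)
    for destination, key in deletes:
        by_destination[destination].append(key)

    packets = []
    for destination, keys in by_destination.items():
        # Split each destination's keys into chunks of at most batch_size.
        for start in range(0, len(keys), batch_size):
            packets.append((destination, keys[start:start + batch_size]))
    return packets
```

Sent one by one, N deletes cost N packets; batched, the count drops to roughly N divided by the batch size per destination.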

TAO: advanced topics

Other TAO topics
▪ TACO
▪ CCW via failover
▪ Two-level cache provides read-after-write semantics
▪ Two-level cache shields against thundering herd
