Slide 1

Slide 1 text

Smarter Caching With Pequod
Yandong Mao, Neha Narula, Robert Morris, Bryan Kate, Michael Kester, Eddie Kohler
Neha Narula, May 13, 2013, @neha

Slide 2

Slide 2 text

academia

Slide 3

Slide 3 text

Application-level Caching Is Useful

Slide 4

Slide 4 text


Slide 5

Slide 5 text

[Diagram: App, Cache, DB; arrows labeled "Cache reads", "Reads", "Writes"]

Slide 6

Slide 6 text

Application Computation

show_timeline(user):
  get user-timeline
  if present
    return timeline
  else
    get user-following list
    if not present
      query db user-following list
      put user-following list in cache
    for each f in user-following list
      get f-posts
      if not present
        query db f-posts
        put f-posts in cache
      posts = posts + f-posts
    user-timeline = sort posts
    put user-timeline into cache
    return user-timeline

post(poster, tweet):
  insert tweet into database
  append tweet to poster-posts
  get poster-followers
  if not present
    query db
    put followers in cache
  for each f in followers:
    get f-timeline
    if present
      update timeline
      lock
      put new timeline into cache
      unlock
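
The pseudocode above can be made concrete with a small sketch. Everything here is an assumption for illustration: plain dicts stand in for the cache and the database, the key names (`timeline:…`, `posts:…`) are invented, and the lock step is left out because the sketch is single-threaded. It is meant to show how much bookkeeping the application carries when it manages the cache itself, not how any particular site does it.

```python
# Hypothetical stand-ins: plain dicts play the cache and the database.
cache = {}
db = {
    "following": {"neha": ["argv0", "basho"]},
    "followers": {"argv0": ["neha"], "basho": ["neha"]},
    "posts": {"argv0": [(101, "hi")], "basho": [(102, "yo")]},
}

def cached(key, load):
    """Return cache[key], filling it from load() on a miss."""
    if key not in cache:
        cache[key] = load()
    return cache[key]

def show_timeline(user):
    key = f"timeline:{user}"
    if key in cache:
        return cache[key]
    following = cached(f"following:{user}",
                       lambda: list(db["following"].get(user, [])))
    posts = []
    for f in following:
        # f=f pins the loop variable for the loader
        posts += cached(f"posts:{f}",
                        lambda f=f: list(db["posts"].get(f, [])))
    cache[key] = sorted(posts)
    return cache[key]

def post(poster, ts, text):
    db["posts"].setdefault(poster, []).append((ts, text))
    # Keep the cached per-poster list in step with the database.
    if f"posts:{poster}" in cache:
        cache[f"posts:{poster}"].append((ts, text))
    followers = cached(f"followers:{poster}",
                       lambda: list(db["followers"].get(poster, [])))
    for f in followers:
        key = f"timeline:{f}"
        if key in cache:  # only already-cached timelines are updated
            cache[key] = sorted(cache[key] + [(ts, text)])

before = show_timeline("neha")
post("basho", 215, "At RICON East WOOO!")
after = show_timeline("neha")
```

Every cached object (following list, follower list, per-poster posts, per-user timeline) has to be named, filled, and updated by hand in application code.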

Slide 7

Slide 7 text

cache joins

Slide 8

Slide 8 text

[Diagram: App, Cache, DB; arrows labeled "Cache reads", "Reads", "Writes"]

Slide 9

Slide 9 text

[Diagram: App, Pequod, DB; arrows labeled "Cache reads", "Writes", "Cache Joins"]

Slide 10

Slide 10 text

Pequod
• In-memory key/value range store
    scan(k0, k1)
    get(k)
    put(k, v)
    install_join(cache_join)
• App developer specifies cache joins
• Computation on demand (or not)
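
To make the range-store part of this interface concrete, here is a toy sketch of `get`, `put`, and `scan` over sorted string keys. It is not Pequod's implementation: the class name `RangeStore`, the half-open `[k0, k1)` scan semantics, and the `END` marker are assumptions. The slides write a range's upper bound as `t|neha|+`; the sketch assumes that means "past every key with that prefix" and uses `"\uffff"` for it. `install_join` is deliberately left out.

```python
import bisect

class RangeStore:
    """Toy ordered key/value store: a sorted key list plus a dict."""

    def __init__(self):
        self._keys = []   # all keys, kept sorted
        self._vals = {}   # key -> value

    def put(self, k, v):
        if k not in self._vals:
            bisect.insort(self._keys, k)
        self._vals[k] = v

    def get(self, k):
        return self._vals.get(k)

    def scan(self, k0, k1):
        """Return (key, value) pairs with k0 <= key < k1, in key order."""
        lo = bisect.bisect_left(self._keys, k0)
        hi = bisect.bisect_left(self._keys, k1)
        return [(k, self._vals[k]) for k in self._keys[lo:hi]]

END = "\uffff"  # assumed stand-in for the slides' "+" upper-bound marker

store = RangeStore()
store.put("t|olive|99", "tweet")
store.put("t|neha|104|argv0", "tweet")
store.put("t|neha|101|argv0", "tweet")
store.put("t|max|201", "tweet")

neha = store.scan("t|neha|100", "t|neha|" + END)
```

Because keys sort as strings, numeric columns such as timestamps only order correctly when they have the same width; the examples here use three-digit values.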

Slide 11

Slide 11 text

What Pequod Can Do With Cache Joins
compute results automatically
subscribe to updates in a range of data
keep cached results fresh
easily use different caching strategies
interleave different types of data

Slide 12

Slide 12 text

Compute Twitter Timeline
[Diagram: Posts by argv0, basho, justinbieber, and ladygaga along a Time axis; neha's Subscriptions; neha's Timeline]
SELECT * FROM posts, subscriptions WHERE posts.poster = subscriptions.poster AND subscriptions.user = 'neha' AND ts < posts.timestamp ORDER BY timestamp DESC

Slide 13

Slide 13 text

Tell Pequod
timelines = posts X subscriptions

Slide 14

Slide 14 text

All Timelines in Pequod
[Diagram: max's timeline, neha's timeline, olive's timeline, peter's timeline]

Slide 15

Slide 15 text

Pequod keys embed table names and column values
[ table | column | column … ]
posts | |
subscriptions | |

Slide 16

Slide 16 text

Post Keys
posts | |
p|justinbieber|207    Do you belieb?

Slide 17

Slide 17 text

Subscription Key Range
subscriptions | |
s|neha|basho
s|neha|justinbieber
s|neha|ladygaga

Slide 18

Slide 18 text

Timeline Keys
timeline | | |
t|neha|207|justinbieber    Do you belieb?
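
The key layouts on these three slides can be sketched as a pair of helpers. The column names appear to have been lost from the slide text, so the roles used here (poster and time for posts, user and poster for subscriptions, user, time, and poster for timelines) are inferred from the example keys, and `make_key` / `split_key` are hypothetical names, not part of Pequod.

```python
SEP = "|"

def make_key(table, *columns):
    """Join a table prefix and column values into one flat key."""
    return SEP.join((table,) + tuple(str(c) for c in columns))

def split_key(key):
    """Split a key back into (table, [columns])."""
    table, *columns = key.split(SEP)
    return table, columns

# Assumed column order, inferred from the example keys on the slides.
post_key = make_key("p", "justinbieber", 207)
sub_key = make_key("s", "neha", "justinbieber")
timeline_key = make_key("t", "neha", 207, "justinbieber")

# Keys that share leading columns sort next to each other, which is
# what makes "all of neha's subscriptions" a single contiguous range.
keys = sorted([make_key("s", "olive", "basho"),
               make_key("s", "neha", "ladygaga"),
               make_key("s", "max", "argv0"),
               make_key("s", "neha", "basho")])
neha_subs = [k for k in keys if k.startswith("s|neha|")]
```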

Slide 19

Slide 19 text

Twitter Timeline Cache Join
timelines = posts X subscriptions
Pequod.install_join("t||| = copy p|| using s||")

Slide 20

Slide 20 text

Twitter Timeline Cache Join
t||| = copy p|| using s||
Sink: t|neha
Primary source: p|argv0|101, p|basho|104, p|justinbieber|123, p|ladygaga|198
Secondary source: s|neha|argv0, s|neha|basho, s|neha|justinbieber, s|neha|ladygaga

Slide 21

Slide 21 text

Compute results automatically

Slide 22

Slide 22 text

Ask For A Range, Any Range
scan("t|neha|100", "t|neha|+")
[Diagram: key space holding t|max|201 and t|olive|99, with a gap marked "Where neha's timeline should be"]

Slide 23

Slide 23 text

Use Lookup Source
using s|neha|
neha's subscriptions:
s|neha|argv0
s|neha|basho
s|neha|justinbieber
s|neha|ladygaga

Slide 24

Slide 24 text

Find Primary Source
using s|neha|argv0
Subscriptions: s|neha|argv0, s|neha|basho, s|neha|justinbieber, s|neha|ladygaga
argv0's posts:
p|argv0|101    tweet
p|argv0|104    tweet
p|argv0|115    tweet

Slide 25

Slide 25 text

Copy Primary Source
copy p|argv0|101
neha's timeline:
t|neha|101|argv0    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet

Slide 26

Slide 26 text

Use Lookup Source
using s|neha|basho
Subscriptions: s|neha|argv0, s|neha|basho, s|neha|justinbieber, s|neha|ladygaga
basho's posts:
p|basho|102    tweet

Slide 27

Slide 27 text

Copy Primary Source
copy p|basho|102
neha's timeline:
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet

Slide 28

Slide 28 text

Use Lookup Source
using s|neha|justinbieber
Subscriptions: s|neha|argv0, s|neha|basho, s|neha|justinbieber, s|neha|ladygaga
justinbieber's posts:
p|justinbieber|207    tweet

Slide 29

Slide 29 text

Copy Primary Source
copy p|justinbieber|207
neha's timeline:
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet
t|neha|207|justinbieber    tweet

Slide 30

Slide 30 text

Use Lookup Source
using s|neha|ladygaga
Subscriptions: s|neha|argv0, s|neha|basho, s|neha|justinbieber, s|neha|ladygaga
ladygaga's posts:
p|ladygaga|209    tweet

Slide 31

Slide 31 text

Copy Primary Source
copy p|ladygaga|209
neha's timeline:
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet
t|neha|207|justinbieber    tweet
t|neha|209|ladygaga    tweet
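
The walk-through on the preceding slides (look up each subscription, find that poster's posts, copy them into the timeline range) can be sketched in a few lines. This is an illustration under assumptions, not Pequod's code: a plain dict stands in for the range store, prefix lookups are linear filters rather than ordered range scans, and `prefix_range` and `fill_timeline` are hypothetical helper names.

```python
def prefix_range(store, prefix):
    """All (key, value) pairs whose key starts with prefix, in key order."""
    return sorted((k, v) for k, v in store.items() if k.startswith(prefix))

def fill_timeline(store, user):
    """Evaluate the timeline join for one user: copy p| keys using s| keys."""
    for sub_key, _ in prefix_range(store, f"s|{user}|"):      # lookup source
        poster = sub_key.split("|")[2]
        for post_key, tweet in prefix_range(store, f"p|{poster}|"):
            ts = post_key.split("|")[2]
            store[f"t|{user}|{ts}|{poster}"] = tweet          # write to sink

store = {
    "s|neha|argv0": None, "s|neha|basho": None,
    "s|neha|justinbieber": None, "s|neha|ladygaga": None,
    "p|argv0|101": "tweet", "p|argv0|104": "tweet", "p|argv0|115": "tweet",
    "p|basho|102": "tweet",
    "p|justinbieber|207": "tweet",
    "p|ladygaga|209": "tweet",
}

fill_timeline(store, "neha")
timeline = [k for k, _ in prefix_range(store, "t|neha|")]
```

After the call, `timeline` holds the same six keys as the final slide of the walk-through, in key order.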

Slide 32

Slide 32 text

Pequod uses cache joins to automatically create cached objects.

Slide 33

Slide 33 text

New Tweets

Slide 34

Slide 34 text

Subscribe to range updates

Slide 35

Slide 35 text

Cached Timeline
neha's timeline:
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet
t|neha|207|justinbieber    tweet
t|neha|209|ladygaga    tweet

Slide 36

Slide 36 text

Update, New Post
put("p|basho|215", "At RICON East WOOO!")
basho's posts:
p|basho|102    tweet
p|basho|215    At RICON East WOOO!
Update t|neha

Slide 37

Slide 37 text

Copy New Post
t||| = copy p|| using s||
Update t|neha
neha's timeline:
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet
t|neha|207|justinbieber    tweet
t|neha|209|ladygaga    tweet
t|neha|215|basho    At RICON East WOOO!

Slide 38

Slide 38 text

Update, New Subscription
put("s|neha|xexd", None)
neha's subscriptions:
s|neha|argv0
s|neha|basho
s|neha|justinbieber
s|neha|ladygaga
s|neha|xexd
Invalidate t|neha

Slide 39

Slide 39 text

Invalidate Sink
Invalidate t|neha
neha's timeline:
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet
t|neha|207|justinbieber    tweet
t|neha|209|ladygaga    tweet
t|neha|215|basho    At RICON East WOOO!

Slide 40

Slide 40 text

Invalidate Sink
s|neha|xexd
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet
t|neha|207|justinbieber    tweet
t|neha|209|ladygaga    tweet
t|neha|215|basho    At RICON East WOOO!

Slide 41

Slide 41 text

scan("t|neha|209", "t|neha|+")
s|neha|xexd
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet
t|neha|207|justinbieber    tweet
t|neha|209|ladygaga    tweet
t|neha|213|xexd    tweet
t|neha|215|basho    At RICON East WOOO!

Slide 42

Slide 42 text

scan("t|neha|209", "t|neha|+")
s|neha|xexd
scan:
t|neha|209|ladygaga    tweet
t|neha|213|xexd    tweet
t|neha|215|basho    At RICON East WOOO!
Rest of neha's timeline:
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet
t|neha|207|justinbieber    tweet

Slide 43

Slide 43 text

Updaters
• On put(), the updater gets the source key, new value, and old value
• Updaters on primary sources immediately update the relevant sink keys
• Updaters on secondary sources log the update on the sink and invalidate sink keys, to be fixed up on the next scan
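
The two updater behaviours in the bullets above can be sketched for the timeline join. This is a simplified model under stated assumptions, not Pequod's implementation: `TimelineCache`, `computed`, and `pending` are hypothetical names, a dict with linear prefix filters stands in for the range store, and "invalidate" is modelled as a per-user list of logged subscriptions that the next scan applies.

```python
class TimelineCache:
    def __init__(self):
        self.store = {}
        self.computed = set()  # users whose t|user| range was materialized
        self.pending = {}      # user -> posters logged since the last scan

    def _prefix(self, prefix):
        return sorted(k for k in self.store if k.startswith(prefix))

    def put(self, key, value):
        old = self.store.get(key)
        self.store[key] = value
        table = key.split("|")[0]
        if table == "p":
            self._post_updater(key, value, old)
        elif table == "s":
            self._subscription_updater(key, value, old)

    def _post_updater(self, key, new, old):
        # Primary source: copy into every materialized timeline right away.
        _, poster, ts = key.split("|")
        for user in self.computed:
            if f"s|{user}|{poster}" in self.store:
                self.store[f"t|{user}|{ts}|{poster}"] = new

    def _subscription_updater(self, key, new, old):
        # Secondary source: only log the change; the next scan fixes it up.
        _, user, poster = key.split("|")
        if user in self.computed:
            self.pending.setdefault(user, []).append(poster)

    def scan_timeline(self, user, since=""):
        if user not in self.computed:      # first request: run the full join
            posters = [k.split("|")[2] for k in self._prefix(f"s|{user}|")]
            self.computed.add(user)
        else:                              # later requests: apply the log
            posters = self.pending.pop(user, [])
        for poster in posters:
            for pk in self._prefix(f"p|{poster}|"):
                ts = pk.split("|")[2]
                self.store[f"t|{user}|{ts}|{poster}"] = self.store[pk]
        lo = f"t|{user}|{since}"
        return [(k, self.store[k])
                for k in self._prefix(f"t|{user}|") if k >= lo]

cache = TimelineCache()
for poster in ("argv0", "basho", "justinbieber", "ladygaga"):
    cache.put(f"s|neha|{poster}", None)
for key in ("p|argv0|101", "p|argv0|104", "p|argv0|115", "p|basho|102",
            "p|justinbieber|207", "p|ladygaga|209", "p|xexd|213"):
    cache.put(key, "tweet")

first = cache.scan_timeline("neha")               # materializes t|neha|...
cache.put("p|basho|215", "At RICON East WOOO!")   # copied eagerly
cache.put("s|neha|xexd", None)                    # only logged
stale = "t|neha|213|xexd" in cache.store          # not fixed up yet
fresh = cache.scan_timeline("neha", since="209")
```

The sketch keeps the `old` value in the updater signature to mirror the slide even though this particular join does not need it.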

Slide 44

Slide 44 text

Clients can use Pequod to subscribe to updates in a range.

Slide 45

Slide 45 text

38 MILLION WRITES!

Slide 46

Slide 46 text

Use different caching strategies

Slide 47

Slide 47 text

Twitter Timeline Cache Join
Pequod.install_join("t||| = copy p|| using s||")
pull: for celebrity posts and timelines of users who aren't logged in

Slide 48

Slide 48 text

Pull Celebrities Each Time
neha's timeline:
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet
Other keys shown: t|max|201, t|olive|99

Slide 49

Slide 49 text

Pull Celebrities Each Time
scan("t|neha|100", "t|neha|+")
t|neha|101|argv0    tweet
t|neha|102|basho    tweet
t|neha|104|argv0    tweet
t|neha|115|argv0    tweet
p|justinbieber|207    tweet
p|ladygaga|209    tweet
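
One way to read this slide: ordinary posts are pushed into the stored timeline, while celebrity posts stay in their `p|` ranges and are merged in at scan time. The sketch below illustrates that reading only. The function name `scan_hybrid`, the explicit `celebrities` set, and the dict-based store are assumptions; the slides do not show how Pequod decides which posters are pulled.

```python
def scan_hybrid(store, user, celebrities, since):
    """Merge the stored (pushed) timeline with celebrity posts pulled now."""
    lo = f"t|{user}|{since}"
    prefix = f"t|{user}|"
    pushed = [(k, v) for k, v in store.items()
              if k.startswith(prefix) and k >= lo]
    pulled = []
    for celeb in celebrities:
        if f"s|{user}|{celeb}" not in store:
            continue  # user does not follow this celebrity
        for k, v in store.items():
            if k.startswith(f"p|{celeb}|"):
                ts = k.split("|")[2]
                tkey = f"t|{user}|{ts}|{celeb}"
                if tkey >= lo:
                    pulled.append((tkey, v))  # computed, never stored
    return sorted(pushed + pulled)

store = {
    "s|neha|argv0": None, "s|neha|basho": None,
    "s|neha|justinbieber": None, "s|neha|ladygaga": None,
    "t|neha|101|argv0": "tweet", "t|neha|102|basho": "tweet",
    "t|neha|104|argv0": "tweet", "t|neha|115|argv0": "tweet",
    "p|justinbieber|207": "tweet", "p|ladygaga|209": "tweet",
}

result = scan_hybrid(store, "neha", {"justinbieber", "ladygaga"}, "100")
result_keys = [k for k, _ in result]
```

The trade-off being illustrated: a celebrity's post is written once instead of once per follower, at the cost of extra work on each timeline read.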

Slide 50

Slide 50 text

Pequod makes it easy to switch between caching strategies.

Slide 51

Slide 51 text

Keep cached results fresh

Slide 52

Slide 52 text

User Karma

Slide 53

Slide 53 text

User Karma
karma| = count votes||
sum, max, min
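
A count join like this one can be kept current incrementally: each new vote bumps the stored count instead of triggering a recount. The sketch below assumes vote keys of the form `votes|<author>|<voter>` and karma keys of the form `karma|<author>`; the column names are missing from the slide text, so those roles are guesses, and `put_vote` and `recount` are hypothetical helpers.

```python
def put_vote(store, author, voter):
    """Record a vote and incrementally maintain karma|<author>."""
    key = f"votes|{author}|{voter}"
    if key in store:
        return  # overwriting an existing vote does not change the count
    store[key] = 1
    karma_key = f"karma|{author}"
    store[karma_key] = store.get(karma_key, 0) + 1

def recount(store, author):
    """Full recomputation, used here only to check the incremental count."""
    return sum(1 for k in store if k.startswith(f"votes|{author}|"))

store = {}
put_vote(store, "neha", "max")
put_vote(store, "neha", "olive")
put_vote(store, "neha", "max")     # duplicate key, ignored
put_vote(store, "basho", "neha")

neha_karma = store["karma|neha"]
```

Operators such as sum, max, and min would follow the same pattern, although max and min need more care when a source key is removed or lowered.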

Slide 54

Slide 54 text

Automatically Update
[Diagram: a new vote arrives in Votes; the Cache Join updates Karma]

Slide 55

Slide 55 text

Pequod supports incremental operations to keep results so fresh (and so clean clean).

Slide 56

Slide 56 text

Interleave different types of data

Slide 57

Slide 57 text


Slide 58

Slide 58 text

Many Requests
[Diagram: the App requests Articles, Comments, and Votes separately to assemble the Entire Cached Page]

Slide 59

Slide 59 text

Inline Cache Joins
page||article = copy articles|
page||comments| = copy comments||
page||votes = count votes|

Slide 60

Slide 60 text

Interleaved Data
[Diagram: Articles, Comments, and Votes each feed a Cache Join into the Entire Cached Page]
One scan!
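
The point of the three joins is that their outputs share one key prefix, so a single range scan returns the article, its comments, and its vote count together. A minimal sketch follows, assuming source keys `articles|<id>`, `comments|<id>|<comment>`, and `votes|<id>|<voter>`. The slide text has lost its column names, so these layouts, and the helper `build_page`, are illustrative guesses rather than the talk's actual schema.

```python
def build_page(store, article_id):
    """Materialize the three page|<id>|... joins for one article."""
    store[f"page|{article_id}|article"] = store[f"articles|{article_id}"]
    for k, v in list(store.items()):          # copy each comment
        if k.startswith(f"comments|{article_id}|"):
            cid = k.split("|")[2]
            store[f"page|{article_id}|comments|{cid}"] = v
    votes = sum(1 for k in store if k.startswith(f"votes|{article_id}|"))
    store[f"page|{article_id}|votes"] = votes  # count join

store = {
    "articles|a1": "Pequod talk",
    "articles|a2": "Other article",
    "comments|a1|c1": "nice",
    "comments|a1|c2": "+1",
    "votes|a1|max": 1,
    "votes|a1|neha": 1,
    "votes|a2|olive": 1,
}

build_page(store, "a1")

# One prefix scan returns every piece of the page, already interleaved.
page = sorted((k, v) for k, v in store.items() if k.startswith("page|a1|"))
```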

Slide 61

Slide 61 text

Pequod makes it easy to interleave different types of data.

Slide 62

Slide 62 text

Implementation
• C++ single-threaded, event-driven server
• Range store: trie of red-black trees
[Diagram: posts, subscriptions]

Slide 63

Slide 63 text

Optimization: Sink Hints
[Diagram: Hint, timelines]
40%

Slide 64

Slide 64 text

Optimization: Value Sharing
p|basho|215    At RICON East WOOO!
t|argv0|215|basho    ptr
t|ladygaga|215|basho    ptr
t|neha|215|basho    ptr
17%
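
The idea on this slide, as far as the text shows, is that copied timeline entries point at the original post's value instead of holding their own copy. In Python the same effect falls out of object references, which makes for a compact illustration. `push_post` is a hypothetical helper; Pequod itself is a C++ server, so this shows only the concept, not its mechanism.

```python
def push_post(store, post_key, subscribers):
    """Copy a post into each subscriber's timeline, sharing one value."""
    _, poster, ts = post_key.split("|")
    value = store[post_key]
    for user in subscribers:
        # Binds the same object; no per-timeline copy of the text is made.
        store[f"t|{user}|{ts}|{poster}"] = value

store = {"p|basho|215": "At RICON East WOOO!"}
push_post(store, "p|basho|215", ["argv0", "ladygaga", "neha"])

shared = all(store[k] is store["p|basho|215"]
             for k in store if k.startswith("t|"))
```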

Slide 65

Slide 65 text

Evaluation
• QPS compared to other systems
• Twitter caching strategies
• Pequod compared to client-managed caching
• Benefit of optimizations

Slide 66

Slide 66 text

Setup
• 12-core machine (two 3.47 GHz Intel Xeon X5960 chips with 2 hyperthreads per core)
• Linux 3.2.0, 96 GB memory
• Redis 2.4.14, PostgreSQL 9.1.8

Slide 67

Slide 67 text

Workloads
• Twitter
  – microbenchmark: 2K users, each timeline request returns ~20 new tweets. 1M posts, 1M reads
  – ~real: 1.8M users, 72M relationships. (see paper)
• News Site
  – 100K articles, 50K users, 1M comments, 2M votes
  – 4M requests, 1% comment rate, varying vote rates

Slide 68

Slide 68 text

QPS Comparison
[Chart: queries per second (thousands, axis 0 to 120) for PostgreSQL, Redis, and Pequod on the News Site and Twitter workloads; annotation "CPU Utilization at 19X"]

Slide 69

Slide 69 text

Twitter Hybrid Push/Pull
[Chart: total runtime (seconds, axis 0 to 30) against percentage of active users (1 to 100) for Pull, Push, and Hybrid]

Slide 70

Slide 70 text

Client Managed vs. Pequod
[Chart: total runtime (seconds, axis 0 to 70) for Client-managed and Pequod, split into Timeline, Post, and Other; annotation "RPC Overheads"]

Slide 71

Slide 71 text

Client Managed vs. Pequod
[Chart: total runtime (seconds, axis 0 to 70) for Client-managed and Pequod, split into Timeline, Post, and Other; annotation "Inserting New Posts"]

Slide 72

Slide 72 text

Benefit of Sink Hints and Value Sharing
[Chart: total runtime (seconds, axis 0 to 45) for Unoptimized Pequod and Pequod, split into Timeline, Post, and Other]

Slide 73

Slide 73 text

Related Work
• Invalidating the cache
  – DUP, TxCache, Scaling Memcached at Facebook
• Materialized Views
  – Maintenance, Dynamic Materialized Views from Microsoft
• Push/pull
  – Twitter timeline REST service

Slide 74

Slide 74 text

Open Questions
• Eviction
• Cross-server cache joins
• Create efficient cache joins automagically
• Support more computation
• Non-ordered store
• Full-fledged datastore

Slide 75

Slide 75 text

Pequod Cache Joins
Features of a database,
Performance of a cache
@neha
[email protected]