Slide 1

Slide 1 text

Chubby: a Mike Burrows Joint
Camille Fournier, CTO, Rent the Runway
for Papers! We! Love!
@skamille

Slide 2

Slide 2 text

What is Chubby?
Chubby is a self-described lock service
● Allow clients to synchronize their activities and agree on basic information about their environment
Help developers deal with coarse-grained sync, in particular leader election
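
For a concrete feel of the leader-election use case, here is a minimal sketch of electing a primary through a lock service. The client API names (open, try_acquire, set_contents) and the path are invented for illustration; they are not Chubby's actual interface.

```python
import time

def run_when_elected(chubby, do_primary_work):
    # Open (or create) the well-known lock file for this service.
    handle = chubby.open("/ls/cell/service/primary-lock", create=True)
    while True:
        if handle.try_acquire():                     # exclusive, advisory lock
            handle.set_contents(b"host-1234:5678")   # advertise the primary's identity
            do_primary_work()                        # act as primary while the lock is held
        else:
            time.sleep(5)                            # someone else is primary; try again later
```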

Slide 3

Slide 3 text

How does it work?
Solve distributed consensus using asynchronous communication
At its core, Paxos

Slide 4

Slide 4 text

Interesting not due to deep fundamental algorithms, but due to how you take those fundamental concepts and create a system usable by many apps, devs, etc

Slide 5

Slide 5 text

System Structure

Slide 6

Slide 6 text

System Structure
Definitions:
Chubby Cell
● small set of servers (typically 5) known as replicas
Master
● the replica that handles all writes and reads

Slide 7

Slide 7 text

Data Model: Files, Directories, Handles
Exports a file system interface
/ls/foo/wombat/pouch
ls: the Chubby common prefix, stands for “lock service”
foo: the name of the Chubby cell (resolves to one or more servers via DNS lookup)
/wombat/pouch: interpreted within named cell

Slide 8

Slide 8 text

Namespace details
Only files and directories (collectively, nodes)
Each node has only one name (no links)
Nodes may be permanent or ephemeral
● Ephemeral deleted if no client has them open
ACLs inherited from parent on creation

Slide 9

Slide 9 text

More miscellanea
Per-node metadata for bookkeeping:
  instance number
  content generation number
  lock generation number
  ACL generation number
Handles
● Obtained when a node is opened. Include a sequence number that tells the master which master generation created the handle, and mode information so the handle can be recreated if an old master restarts

Slide 10

Slide 10 text

Locks
Advisory rather than mandatory
Potential lock problems in distributed systems:
  A holds a lock L, issues request W, then fails
  B acquires L (because A failed), performs actions
  W arrives (out-of-order) after B’s actions
Solution #1: backward compatible
  Lock server will prevent other clients from getting the lock if a lock becomes inaccessible or the holder has failed
  “Draining the queue” of unprocessed events before someone else can acquire the lock
Solution #2: sequencer
  A lock holder can obtain a sequencer from Chubby
  It attaches the sequencer to any requests that it sends to other servers
  The other servers can verify the sequencer information
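
A rough sketch of the sequencer idea (Solution #2), with invented names: the lock holder attaches an opaque token describing the lock and its generation to each request, and the receiving server rejects requests whose token is stale.

```python
class StaleLockError(Exception):
    pass

def send_write(lock_handle, fileserver, request):
    # The lock holder asks Chubby for a sequencer (lock name, mode, generation)
    # and attaches it to the request it sends to the file server.
    sequencer = lock_handle.get_sequencer()
    fileserver.write(request, sequencer=sequencer)

def handle_write(request, sequencer, chubby_cache):
    # The receiving server checks the sequencer before acting, so a delayed
    # request W from failed holder A is rejected once B has taken the lock.
    if not chubby_cache.sequencer_is_valid(sequencer):
        raise StaleLockError("sender no longer holds the lock")
    apply_write(request)

def apply_write(request):
    ...  # application-specific
```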

Slide 11

Slide 11 text

Something confusing...
“The validity of a sequencer can be checked against the server’s Chubby cache or, if the server does not wish to maintain a session with Chubby, the most recent sequencer that the server has observed”
Wha?

Slide 12

Slide 12 text

Remember: Locks are advisory
All we guarantee is that locks conflict only with other attempts to acquire the same lock. They do NOT make locked objects inaccessible to clients not holding their locks.

Slide 13

Slide 13 text

Events
When you create a handle, you can subscribe to events!
  File modified
  Child node changed
  Master failover
  Handle invalid
Delivered after corresponding action has taken place
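
A small sketch of what subscribing at handle-creation time might look like; the event names and callback style here are invented for illustration.

```python
def watch_config(chubby):
    # Subscribe to events when the handle is created; the callback fires only
    # after the corresponding change has actually taken place, so a read done
    # from the callback is guaranteed to see the new state.
    return chubby.open(
        "/ls/cell/service/config",
        events=["FILE_MODIFIED", "MASTER_FAILED_OVER"],
        on_event=lambda event: print("config event:", event),
    )
```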

Slide 14

Slide 14 text

Caching
Clients cache file data and node meta-data via in-memory write-through cache
When node is changed, modification is blocked while master invalidates data in all caches
During invalidation, master treats node as uncachable
Caching protocol invalidates cached data on a change, never updates it
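
Roughly what the master-side invalidation path looks like, as a hedged sketch with invented names: the write is held until every cache that might hold the node has been invalidated (or its lease has expired).

```python
def modify_node(node, new_contents, sessions):
    node.cachable = False                       # reads go to the master meanwhile
    for session in sessions:
        if session.has_cached(node):
            session.send_invalidation(node)     # piggybacked on the next KeepAlive reply
    wait_for_acks_or_lease_expiry(sessions, node)
    node.contents = new_contents                # only now does the modification proceed
    node.cachable = True

def wait_for_acks_or_lease_expiry(sessions, node):
    ...  # block until every cache acknowledged the invalidation or its lease lapsed
```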

Slide 15

Slide 15 text

Sessions and KeepAlives
Session: Relationship between Chubby client and Chubby cell, maintained by KeepAlives
Created on first contact with Chubby master
Ended when terminated explicitly, or left idle with no open handles and no calls

Slide 16

Slide 16 text

KeepAlives
Not quite heartbeats...
Special RPC handled by blocking the response until the client’s lease is close to expiring, then allowing it to return to the client with the new lease
Client initiates new KeepAlive immediately upon receiving previous reply
Also used to transmit events! This ensures that clients can’t maintain a session without acknowledging cache invalidation
Handling behavior during what might be a “disconnect” from the master is done via a grace period
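
A client-side sketch of the KeepAlive loop and the grace period, assuming an invented RPC stub; the numbers are illustrative, not Chubby's actual timeouts.

```python
import time

GRACE_PERIOD = 45  # seconds; illustrative only

class SessionExpired(Exception):
    pass

def keepalive_loop(master, session):
    lease_deadline = time.monotonic()
    while True:
        try:
            # The master blocks this RPC until the lease is nearly expired,
            # then replies with a new lease; the next KeepAlive is sent at once.
            reply = master.keep_alive(session, timeout=60)
            lease_deadline = time.monotonic() + reply.lease_timeout
            deliver_events_and_invalidations(reply)   # events ride on the reply
        except TimeoutError:
            # Possible master fail-over: the session is in jeopardy. Block new
            # work and keep retrying; only give up after the grace period.
            if time.monotonic() > lease_deadline + GRACE_PERIOD:
                raise SessionExpired()

def deliver_events_and_invalidations(reply):
    ...  # up-call events and acknowledge cache invalidations
```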

Slide 17

Slide 17 text

Master Fail-over

Slide 18

Slide 18 text

Basically...
We don’t want to expire all our clients when the master fails over, because re-establishing sessions and redoing all the things the clients do on reconnect is a pain in the ass
So the client can’t do NEW work, but it doesn’t close its session, either

Slide 19

Slide 19 text

“Readers will be unsurprised to learn that the fail-over code, which is exercised far less often than other parts of the system, has been a rich source of interesting bugs.”
Indeed.

Slide 20

Slide 20 text

Now on to the interesting part
Design Rationale

Slide 21

Slide 21 text

Two Key Design Decisions
1. Lock service, as opposed to library or service for consensus
2. Serves small files to permit using the service to share data such as advertisement and config

Slide 22

Slide 22 text

Why not libPaxos?
A client Paxos library would depend on no other services..., and would provide a standard framework for programmers, assuming their services can be implemented as state machines.

Slide 23

Slide 23 text

Hell is Other Programmers

Slide 24

Slide 24 text

Service Advantages: Part 1
Devs don’t plan for HA
Code needs to be specially structured for use with consensus protocols
Service enables code to have correct distributed locking without having to rewrite the whole damn thing

Slide 25

Slide 25 text

Service Advantages 2, Electric Boogaloo
When you are electing a primary or partitioning data dynamically, you often need to advertise what the state is
Supporting the storage and fetching of small quantities of data is useful!
You can do it with DNS but DNS TTL is kind of a pain in the ass
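
The consumer side of that advertisement might look like this sketch (invented client API, matching the earlier leader-election sketch): instead of a DNS lookup bounded by a TTL, a client reads the small file the elected primary wrote and subscribes to modification events rather than polling.

```python
def find_primary(chubby):
    handle = chubby.open(
        "/ls/cell/service/primary-lock",
        events=["FILE_MODIFIED"],                      # get told when the primary changes
        on_event=lambda event: print("primary changed"),
    )
    contents, _metadata = handle.get_contents_and_stat()
    return contents.decode()                           # e.g. "host-1234:5678"
```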

Slide 26

Slide 26 text

Service Advantages III
Programmers understand lock-based interfaces
  Sort of
    Not really
But hey, a familiar interface makes them use something that works vs some hack that they threw together!

Slide 27

Slide 27 text

Service 4dvantages
Distributed consensus algos use quorums to make decisions, which means they have to have replicas, which means HA
Having HA in the service means the client can make safe decisions even when it does not have its own majority!

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

Coarse-grained Locking
Lock-acquisition rate only weakly related to transaction rate of client apps
Locks acquired rarely
Lower load on the system

Slide 30

Slide 30 text

Coarse-grained Locking
Coarse-grained locks tend to protect things that require costly recovery procedures
Coarse-grained locks should survive system failure
If you want fine-grained locking, implement your own lock service, using Chubby to allocate groups of locks to your lock servers
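
One way to read that last suggestion, as a hedged sketch with invented names: each application-level lock server holds a single coarse Chubby lock for a whole group of fine-grained locks, then serves those fine-grained locks from memory.

```python
class FineGrainedLockServer:
    def __init__(self, chubby, group_id):
        # One coarse Chubby lock per group makes this process the sole
        # authority for every fine-grained lock in the group.
        self.group_lock = chubby.open(f"/ls/cell/app/lock-group-{group_id}", create=True)
        self.group_lock.acquire()
        self.held = set()           # fine-grained locks, tracked in memory only

    def try_lock(self, name):
        if name in self.held:
            return False
        self.held.add(name)
        return True

    def unlock(self, name):
        self.held.discard(name)
```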

Slide 31

Slide 31 text

Learnings
As a product manager might call them...

Slide 32

Slide 32 text

How did people use this?
Naming!
Most traffic is session KeepAlives
Some reads (from cache misses), few writes

Slide 33

Slide 33 text

Outages and Data Loss
The network (maintenance, issues) causes outages
Database software errors and operator error cause data loss

Slide 34

Slide 34 text

Performance sensitivity
Clients rarely care about latency to Chubby provided sessions don’t drop
Extremely sensitive to performance of local Chubby cache
Server overloads above 90,000 sessions or due to client spam
Scaling depends on reducing communication

Slide 35

Slide 35 text

If ya like it then you shoulda put a proxy in front of it
Java compatibility? PROXY! (ok not exactly but close enough)
Name service? PROXY!
Proxy:
● Trusted process that passes requests from other clients to a Chubby cell
Layer of indirection, allows different langs, different constraints, more load per cell

Slide 36

Slide 36 text

Most people want a Name Service
DNS is hard to scale via TTL
● 3000 servers communicating with each other with a 60s TTL requires 150K lookups per second (arithmetic below)
Chubby can handle more, but name resolution also doesn’t need Chubby-level precision
● Add a proxy designed for name lookups!
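
Where the 150K figure comes from, using the slide’s own numbers:

```python
servers = 3000        # each server keeps DNS entries for the other ~3000 fresh
ttl_seconds = 60      # every cached entry must be re-looked-up once per TTL
lookups_per_second = servers * servers / ttl_seconds   # 3000 * 3000 / 60 = 150,000
```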

Slide 37

Slide 37 text

Did I mention the problem with other programmers?
They will write loops that constantly retry failed commands
They will try to use this as a data storage system
They think that a lock server makes for good pub/sub messaging

Slide 38

Slide 38 text

More difficulties with developers
They rarely consider availability.
They don’t think about failure probabilities.
They don’t understand distributed systems.
They blindly follow APIs, don’t read the documentation carefully.
They write bugs.
They don’t predict the future very well.

Slide 39

Slide 39 text

Mitigating the impacts of developers
Review all their code
Review the way they want to use their system
Entirely control the client and make bad behavior painful
Aggressively cache

Slide 40

Slide 40 text

In conclusion...
Centralized service: Useful for many reasons
Creating shared core architecture is hard
Developers can and will fsck everything up
Having fundamental insights and making decisions up front about what you are and are not building helps you to create something great

Slide 41

Slide 41 text

fin
@skamille