Slide 1

Slide 1 text

No content

Slide 2

Slide 2 text

Michael Laing
Architect
New York Times

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

Oregon Dublin Tokyo São Paulo

Slide 8

Slide 8 text

Oregon Dublin Tokyo São Paulo
[Diagram labels under the regions: a, b, c]

Slide 9

Slide 9 text

A Global Mesh with a Memory
Message-based: WebSocket, AMQP, SockJS
If in doubt:
• Resend
• Reconnect
• Reread
Idempotent:
• Replicating
• Racy
• Resolving
Classes of service:
• Gold: replicate/race/resolve
• Silver: prioritize
• Bronze: queueable
Millions of users
Event-driven: async using libev
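The "resend if in doubt" rule is only safe when receivers are idempotent. A minimal sketch of that idea, assuming each message carries a unique ID that survives resends; the class and method names are hypothetical and are not taken from the talk:

```python
import uuid


class IdempotentReceiver:
    """Applies each message at most once, keyed by its ID (hypothetical helper)."""

    def __init__(self):
        self.seen = set()    # IDs already processed
        self.applied = []    # bodies applied, in arrival order

    def receive(self, message_id, body):
        # A resent or replicated copy carries the same ID and is dropped.
        if message_id in self.seen:
            return False
        self.seen.add(message_id)
        self.applied.append(body)
        return True


receiver = IdempotentReceiver()
msg_id = uuid.uuid1()
first = receiver.receive(msg_id, "breaking news")
second = receiver.receive(msg_id, "breaking news")  # resend after a timeout
```

With this shape, a sender that is unsure whether a message arrived can simply send it again; the second copy changes nothing.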

Slide 10

Slide 10 text

Message: an event with data
§ Envelope: Routing while in motion & Locating when at rest
§ Metadata
§ Body (opaque to us)
[Diagram: a Message made up of Envelope, Metadata, and Body (may be absent)]

Slide 11

Slide 11 text

Message: an event with data

            RabbitMQ               WebSocket            S3 / CloudFront   Cassandra
Envelope    Routing Key            Gateway Connection   UUID              “Path” & UUID
Metadata    Headers: Map / Array   JSON                 HTTP Headers      JSON
Body        Blob                   Blob                 Blob              Blob
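The three-part message can be sketched as a small data structure. This is an illustration only: the class names, field names, and the "priority" metadata key are assumptions, not the actual nyt⨍aбrik types.

```python
from dataclasses import dataclass, field
from typing import Optional
import json
import uuid


@dataclass
class Envelope:
    path: str                # e.g. "feeds.breaking-news.56"
    message_id: uuid.UUID    # version 1 UUID identifying the message


@dataclass
class Message:
    envelope: Envelope
    metadata: dict = field(default_factory=dict)
    body: Optional[bytes] = None   # opaque blob; may be absent

    def metadata_json(self) -> str:
        # JSON form, as the table lists for WebSocket and Cassandra
        return json.dumps(self.metadata, sort_keys=True)


msg = Message(
    envelope=Envelope("feeds.breaking-news.56", uuid.uuid1()),
    metadata={"priority": "gold"},   # hypothetical metadata key
)
```

The body stays untyped bytes because the slides describe it as opaque to the system.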

Slide 12

Slide 12 text

Publish
[Diagram: Message, Core, Cassandra, S3 / CloudFront, Gateway, Device; link labels: Init, AMQP, CQL, WebSocket, HTTP, sync]

Slide 13

Slide 13 text

Subscribe
[Diagram: Message, Core, Cassandra, S3 / CloudFront, Gateway, Device; link labels: Init, AMQP, CQL, WebSocket, HTTP]

Slide 14

Slide 14 text

Dismiss
[Diagram: Message, Core, Cassandra, Gateway, Device; link labels: Init, AMQP, CQL, WebSocket; Core, Gateway, Device, and Cassandra appear twice]

Slide 15

Slide 15 text

Connect
[Diagram: many Core, Gateway, Device, Message, Cassandra, and S3 / CloudFront nodes; scale labels: several, dozens, millions]

Slide 16

Slide 16 text

Envelope – 2 forms of addressing
§ “Path”: 1) Routing a message to a user 2) Finding a message for a user
[Diagram: Message, nyt⨍aбrik]

Slide 17

Slide 17 text

Envelope – 2 forms of addressing
§ “Path”: 1) Routing a message to a user 2) Finding a message for a user
§ “Postoffice”: Routing a message internally in the nyt⨍aбrik
[Diagram: Message, nyt⨍aбrik, Core, Gateway, Core, Gateway]

Slide 18

Slide 18 text

The Path hierarchy
§ Path elements are text (UTF-8, but “.” is reserved); the 1st element is the “category”
“category”: “feeds”, “2nd element”: “breaking-news”, “3rd element”: “0012345”

Slide 19

Slide 19 text

The Path hierarchy
§ Path elements are text (UTF-8, but “.” is reserved); the 1st element is the “category”
“category”: “feeds”, “2nd element”: “breaking-news”, “3rd element”: “0012345”
§ The elements are joined by “.” for routing
“path”: “feeds.breaking-news.0012345”
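The joining rule can be sketched in a few lines. The function name `join_path` is hypothetical; the only rules taken from the slide are the "." separator and the ban on "." inside an element.

```python
def join_path(*elements):
    """Join path elements with "." after checking none contains the reserved dot."""
    for element in elements:
        if not element:
            raise ValueError("path elements must be non-empty")
        if "." in element:
            raise ValueError(f'"." is reserved: {element!r}')
    return ".".join(elements)


path = join_path("feeds", "breaking-news", "0012345")
```

Rejecting the dot at construction time keeps the element count of a path unambiguous, which matters once wildcards match whole elements.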

Slide 20

Slide 20 text

Deeper into the Path hierarchy
§ For persistence, the path denotes a sorted “folder” containing messages in reverse datetime order (using the timestamp from the version 1 UUID uniquely identifying each message)
“feeds.breaking-news.56”/bd1961f5-1062-11e4-a630-406c8f1838fa
“feeds.breaking-news.56”/b94e8b45-1062-11e4-900d-406c8f1838fa
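The ordering within a "folder" can be reproduced with the standard library, since a version 1 UUID embeds a 60-bit timestamp. A minimal sketch using the two UUIDs from the slide; the function name `newest_first` is hypothetical, and in the real system the data store does the sorting:

```python
import uuid


def newest_first(message_ids):
    """Sort version 1 UUID strings by their embedded timestamp, newest first."""
    return sorted(message_ids, key=lambda s: uuid.UUID(s).time, reverse=True)


folder = [
    "b94e8b45-1062-11e4-900d-406c8f1838fa",
    "bd1961f5-1062-11e4-a630-406c8f1838fa",
]
ordered = newest_first(folder)
```

Sorting the raw UUID strings would not be reliable in general, because the low bits of the timestamp come first in the textual form; the `time` attribute reassembles the full value.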

Slide 21

Slide 21 text

Deeper into the Path hierarchy
§ For persistence, the path denotes a sorted “folder” containing messages in reverse datetime order (using the timestamp from the version 1 UUID uniquely identifying each message)
“feeds.breaking-news.56”/bd1961f5-1062-11e4-a630-406c8f1838fa
“feeds.breaking-news.56”/b94e8b45-1062-11e4-900d-406c8f1838fa
§ Subscribing to a path is done by “binding”, typically with wildcards: “*” matches any one element, “#” matches any sequence of elements
All breaking-news messages: “feeds.breaking-news.#”
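The wildcard rules can be illustrated with a small matcher. This is a sketch, not the trie-based matching a broker performs; it assumes, following RabbitMQ topic-exchange semantics, that "#" may also match zero elements. The function name `matches` is hypothetical.

```python
def matches(binding, path):
    """Return True if a dot-separated binding pattern matches a path.

    "*" matches exactly one element; "#" matches zero or more elements.
    """
    def walk(pattern, words):
        if not pattern:
            return not words
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # Let "#" absorb 0, 1, 2, ... elements and try the remainder each time.
            return any(walk(rest, words[i:]) for i in range(len(words) + 1))
        if not words:
            return False
        return (head == "*" or head == words[0]) and walk(rest, words[1:])

    return walk(binding.split("."), path.split("."))


all_breaking = matches("feeds.breaking-news.#", "feeds.breaking-news.56")
one_element = matches("feeds.*", "feeds.breaking-news.56")
```

Here `all_breaking` is true, while `one_element` is false because "*" covers only a single element and the path has two after "feeds".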

Slide 22

Slide 22 text

More on subscribing & retrieving
§ Retrieving from persistent storage can be done by path, e.g. the “latest” breaking-news messages for item 56:
“feeds.breaking-news.56”

Slide 23

Slide 23 text

More on subscribing & retrieving
§ Retrieving from persistent storage can be done by path, e.g. the “latest” breaking-news messages for item 56:
“feeds.breaking-news.56”
§ But retrieval can also be done using trailing wildcards:
“feeds.breaking-news.#” will return the “latest” breaking-news messages for all “current” items

Slide 24

Slide 24 text

More on subscribing & retrieving
§ Retrieving from persistent storage can be done by path, e.g. the “latest” breaking-news messages for item 56:
“feeds.breaking-news.56”
§ But retrieval can also be done using trailing wildcards:
“feeds.breaking-news.#” will return the “latest” breaking-news messages for all “current” items
§ The Cassandra data store is designed to answer hierarchical queries with a single request and in the desired order

Slide 25

Slide 25 text

A notable simplification:
§ Paths for subscribing to messages and paths for retrieving persisted messages, including the use of wildcards, are the same, e.g.:

Slide 26

Slide 26 text

A notable simplification:
§ Paths for subscribing to messages and paths for retrieving persisted messages, including the use of wildcards, are the same, e.g.:
When a user logs in she is “subscribed” using her ID; messages “published” to her will be received while “persisted” messages and subscription preferences are retrieved (which takes a few tens of milliseconds)

Slide 27

Slide 27 text

A notable simplification:
§ Paths for subscribing to messages and paths for retrieving persisted messages, including the use of wildcards, are the same, e.g.:
When a user logs in she is “subscribed” using her ID; messages “published” to her will be received while “persisted” messages and subscription preferences are retrieved (which takes a few tens of milliseconds)
Once subscription preferences arrive, she will be “subscribed” to them and any corresponding “persisted” messages retrieved
The same paths are used for subscription and retrieval

Slide 28

Slide 28 text

Special Paths for individual routing
§ Our subscribers (millions of them) have numeric IDs; using those IDs directly for routing, specifically for the “binding” function, would be inefficient
“id.prefs.09067832” (namespace of 3rd element is too large)

Slide 29

Slide 29 text

Special Paths for individual routing
§ Our subscribers (millions of them) have numeric IDs; using those IDs directly for routing, specifically for the “binding” function, would be inefficient
“id.prefs.09067832” (namespace of 3rd element is too large)
§ Instead we convert the ID to base62 elements and take advantage of the Patricia trie search structures built into RabbitMQ and our gateway
“id.prefs.c.2.x.M” (equivalent to the above, used for routing)
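The conversion can be sketched as follows. The slide does not state the base62 alphabet; the ordering assumed here (digits, then uppercase, then lowercase) is the one that reproduces the slide's own example. The function names are hypothetical, and the leading zero of the ID is dropped when it is parsed as an integer.

```python
import string

# Assumed alphabet: 0-9, then A-Z, then a-z (reproduces the slide's example).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def to_base62_elements(number):
    """Return the base62 digits of a non-negative integer, most significant first."""
    if number < 0:
        raise ValueError("IDs must be non-negative")
    if number == 0:
        return [ALPHABET[0]]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return digits[::-1]


def routing_path(category, second, subscriber_id):
    """Build a routing path with one path element per base62 digit of the ID."""
    return ".".join([category, second, *to_base62_elements(int(subscriber_id))])


path = routing_path("id", "prefs", "09067832")
```

Each element now has at most 62 possible values, so a trie keyed on elements branches narrowly at every level instead of holding millions of siblings under one node.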

Slide 30

Slide 30 text

Postoffice addressing
§ The “postoffice” is a logical “bus” that connects all the services in all the nyt⨍aбrik instances globally
[Diagram: postoffice logical view connecting Gateway, Core, Gateway, Gateway, Core, Gateway]

Slide 31

Slide 31 text

Postoffice addressing
§ The “postoffice” is a logical “bus” that connects all the services in all the nyt⨍aбrik instances globally
§ It is physically segmented and the segments are connected using RabbitMQ “federation”
[Diagram: postoffice logical view connecting Gateway, Core, Gateway, Gateway, Core, Gateway]

Slide 32

Slide 32 text

Postoffice address elements
§ Each nyt⨍aбrik service has 3 basic uniquifying elements:
“region”: “us-west-2”, “instance”: “i-123”, “pid”: “12”

Slide 33

Slide 33 text

Postoffice address elements
§ Each nyt⨍aбrik service has 3 basic uniquifying elements:
“region”: “us-west-2”, “instance”: “i-123”, “pid”: “12”
§ And some additional qualifiers:
“product”: “search”, “service”: “route”

Slide 34

Slide 34 text

Postoffice routing key
§ Each routing key has a “from” address embedded in it:
“region”: “us-west-2”, “instance”: “i-123”, “pid”: “12”, “product”: “search”, “service”: “resolve”

Slide 35

Slide 35 text

Postoffice routing key
§ Each routing key has a “from” address embedded in it:
“region”: “us-west-2”, “instance”: “i-123”, “pid”: “12”, “product”: “search”, “service”: “resolve”
§ And a “to” address:
“region”: “us-west-2”, “instance”: “-”, “pid”: “-”, “product”: “search”, “service”: “route” (the “-” means “any”)

Slide 36

Slide 36 text

Postoffice routing key
§ Each routing key has a “from” address embedded in it:
“region”: “us-west-2”, “instance”: “i-123”, “pid”: “12”, “product”: “search”, “service”: “resolve”
§ And a “to” address:
“region”: “us-west-2”, “instance”: “-”, “pid”: “-”, “product”: “search”, “service”: “route” (the “-” means “any”)
§ And an “action”: “action”: “route”

Slide 37

Slide 37 text

Postoffice routing key detail
§ And they are put together as an ordered sequence like this:
<action>.<from>.<to>

Slide 38

Slide 38 text

Postoffice routing key detail
§ And they are put together as an ordered sequence like this:
<action>.<from>.<to>
“route.\
us-west-2.search.resolve.i-123.12.\
us-west-2.search.route.-.-”

Slide 39

Slide 39 text

Postoffice routing key detail
§ And they are put together as an ordered sequence like this:
<action>.<from>.<to>
“route.\
us-west-2.search.resolve.i-123.12.\
us-west-2.search.route.-.-”
§ Meaning: This is a request for a “route” action from a specific invocation of the “search” product “resolve” service, addressed to any “search” product “route” service in region “us-west-2”
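Assembling such a key can be sketched as below. The element order inside each address (region, product, service, instance, pid) is read off the slide's example; the `Address` class and `routing_key` function are hypothetical stand-ins for the common methods the talk mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    region: str
    product: str
    service: str
    instance: str = "-"   # "-" means "any"
    pid: str = "-"

    def key(self):
        # Order taken from the slide's example routing key
        return ".".join([self.region, self.product, self.service,
                         self.instance, self.pid])


def routing_key(action, sender, recipient):
    """Join action, "from" address, and "to" address into one routing key."""
    return ".".join([action, sender.key(), recipient.key()])


sender = Address("us-west-2", "search", "resolve", "i-123", "12")
recipient = Address("us-west-2", "search", "route")
key = routing_key("route", sender, recipient)
```

Defaulting `instance` and `pid` to "-" makes a request to "any" instance of a service the shortest thing to write.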

Slide 40

Slide 40 text

Postoffice binding
§ Each service invocation “binds” (subscribes) to the postoffice using its unique address to get messages specifically directed to it, e.g. asynchronous RPC responses
<action>.<from>.<to>
“*.\
*.*.*.*.*.\
us-west-2.search.route.i-123.12”

Slide 41

Slide 41 text

Postoffice binding for services
§ Each service invocation also “binds” to the postoffice using addresses that will select messages appropriate for its service
<action>.<from>.<to>
“route.\
us-west-2.*.*.*.*.\
*.*.route.*.*”

Slide 42

Slide 42 text

Postoffice binding for services
§ Each service invocation also “binds” to the postoffice using addresses that will select messages appropriate for its service
<action>.<from>.<to>
“route.\
us-west-2.*.*.*.*.\
*.*.route.*.*”
§ All this address manipulation is handled by common methods in the nyt⨍aбrik
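How the two kinds of binding select messages can be checked with a small matcher. Because these bindings use only "*", a position-by-position comparison is enough; the function name `binds` is hypothetical, and a broker would do this inside its topic exchange rather than in application code.

```python
def binds(binding, routing_key):
    """Match a binding that uses only "*" wildcards against a routing key."""
    pattern = binding.split(".")
    words = routing_key.split(".")
    if len(pattern) != len(words):
        return False
    return all(p == "*" or p == w for p, w in zip(pattern, words))


# Bindings and key from the slides, with the line continuations joined.
service_binding = "route.us-west-2.*.*.*.*.*.*.route.*.*"
direct_binding = "*.*.*.*.*.*.us-west-2.search.route.i-123.12"
key = "route.us-west-2.search.resolve.i-123.12.us-west-2.search.route.-.-"

via_service = binds(service_binding, key)
via_direct = binds(direct_binding, key)
```

The request addressed to "any" route service is picked up through the service binding, while the direct binding stays reserved for keys that name instance "i-123" and pid "12" explicitly.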

Slide 43

Slide 43 text

Routing in the Core
§ For load balancing on entry to the nyt⨍aбrik Core
[Diagram: Message to Core or Core]

Slide 44

Slide 44 text

Routing in the Core
§ For replication of important (gold service) messages
[Diagram: Message to Core and Core]

Slide 45

Slide 45 text

Routing in the Core
§ For distribution to all consumers
[Diagram: Core, Core, Gateway, Device, Gateway, Device]

Slide 46

Slide 46 text

Problems with Core instances
§ Complex connectivity: N(N-1) federation + clustering + …

Slide 47

Slide 47 text

Problems with Core instances
§ Complex connectivity: N(N-1) federation + clustering + …
§ Many services: input, process, resolve, reject, cache_push, …

Slide 48

Slide 48 text

Problems with Core instances
§ Complex connectivity: N(N-1) federation + clustering + …
§ Many services: input, process, resolve, reject, cache_push, …
§ Hence, problematic to manage

Slide 49

Slide 49 text

Problems with Core instances
§ Complex connectivity: N(N-1) federation + clustering + …
§ Many services: input, process, resolve, reject, cache_push, …
§ Hence, problematic to manage
§ And difficult to autoscale
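To see why N(N-1) connectivity becomes hard to manage, the count can be tabulated. This sketch reads N(N-1) as one one-way federation link per ordered pair of Core instances, which is an interpretation of the slide; the function name is hypothetical.

```python
def federation_links(n):
    """One-way links needed to connect n Core instances pairwise: N(N-1)."""
    return n * (n - 1)


counts = {n: federation_links(n) for n in (2, 4, 8, 16)}
```

Doubling the number of instances roughly quadruples the link count, so every instance added by an autoscaler must also be wired to all existing ones.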

Slide 50

Slide 50 text

Possible solution: refactor and simplify
§ A new component, the Rabbit Router, to focus on connectivity and routing

Slide 51

Slide 51 text

Possible solution: refactor and simplify
§ A new component, the Rabbit Router, to focus on connectivity and routing
§ A New Core, with a focus on services

Slide 52

Slide 52 text

Possible solution: refactor and simplify
§ A new component, the Rabbit Router, to focus on connectivity and routing
§ A New Core, with a focus on services
§ Everything connected to a Rabbit Router

Slide 53

Slide 53 text

Possible solution: refactor and simplify
§ A new component, the Rabbit Router, to focus on connectivity and routing
§ A New Core, with a focus on services
§ Everything connected to a Rabbit Router
§ The “bus” becomes a “star”

Slide 54

Slide 54 text

No content