Slide 1

Slide 1 text

Dude, Where Are My Messages?
Devika Chawla – Director of Engineering
George Abraham – Software Engineer

Slide 2

Slide 2 text

Agenda
1. Overview
2. The road to Elasticsearch
3. Examples
4. Q&A

Slide 3

Slide 3 text

Netflix Ecosystem

Slide 4

Slide 4 text

Outbound Messaging Channels
Emails, Push Notifications and SMS
Rejoin, Acquisition, Account, Engagement

Slide 5

Slide 5 text

In App Messaging Channels
Notifications and Alerts

Slide 6

Slide 6 text

Messaging Platform (diagram)
• Platform services: Event Consumer, Message Processor, Scheduler, In App Message Service, Device Token Service, Feedback Processor, Message & Event Metadata Service
• Other services shown: Algorithms, Customer Service, Billing, Account, PR, Partner, Marketing, Subscriber Service, Top Titles, Video Metadata, AB Test Service, Video Ranker
• External providers: APNS (Apple), GCM (Google), Amazon SES, Twilio
• Labels: Events, Messages, Push Notification, Email, SMS/Voice, In-App Message

Slide 7

Slide 7 text

Easy Insights: Real Time Rates or Delayed Aggregates

Slide 8

Slide 8 text

Questions to be answered: Real Time
1. Price Change Email status?
2. Customer’s Password Request message?
3. OITNB Push Notifications delivered?
4. Global distribution of phone verification?

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

The Road To Elasticsearch

Slide 11

Slide 11 text

Ability to answer questions in real time
• Leveraging messaging operational data
  – Customer ID
  – Country
  – Message Type

Slide 12

Slide 12 text

Familiar Story …
1. Log parsing
2. Try and leverage existing solutions
3. Build custom solutions
4. Elasticsearch to the rescue
Distributed Grep
• Specific patterns to get a “feel” for issues
• Not enough confidence since it is basically low tech sampling
Atlas – Netflix’s monitoring system
• Great for trends and rates
• Not meant for tracing messages
Relational DB
• Could aggregate limited set of dimensions
• Still couldn’t trace individual message

Slide 13

Slide 13 text

Elasticsearch helps us answer questions

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

#netflixeverywhere
I. Customer Growth
II. Increases in the types of messages
III. Addition of channels
IV. Platform growth to accommodate innovation

Slide 16

Slide 16 text

Messaging Platform (diagram)
• Platform services: Event Consumer, Message Processor, Scheduler, In App Message Service, Device Token Service, Feedback Processor, Message & Event Management Service
• Other services shown: Algo, Customer Service, Billing, Account, PR, Partner, Marketing, Subscriber Service, Top Titles, Video Metadata, AB Test Service, Video Ranker
• External providers: APNS (Apple), GCM (Google), Amazon SES, Twilio
• Labels: Events, Messages, Push, Email, SMS/Voice, In-App Message

Slide 17

Slide 17 text

Messaging Platform Evolution (diagram)
• Platform services: Event Consumer, Message Processor, Scheduler, In App Message Service, Device Token Service, Feedback Processor, Message & Event Metadata Service
• Other services shown: Algo, Customer Service, Billing, Account, PR, Partner, Marketing, Subscriber Service, Top Titles, Video Metadata, AB Test Service, Video Ranker
• External providers: APNS (Apple), GCM (Google), Amazon SES, Twilio
• Labels: Events, Messages, Push, Email, SMS/Voice, In-App Message

Slide 18

Slide 18 text

Clusters of application nodes
(Diagram: the Messaging Platform boxes repeated as several clusters: Event Consumer, Message Processor, Scheduler, In App Message Service, Device Token Service, Feedback Processor and the Message & Event Management/Metadata Service, alongside Algo, Customer Service, Billing, Account, PR, Partner, Marketing, Subscriber Service, Top Titles, Video Metadata, AB Test Service, Video Ranker, and the providers APNS (Apple), GCM (Google), Amazon SES and Twilio.)

Slide 19

Slide 19 text

Across AWS Regions
• Regions: us-east-1, us-west-2, eu-west-1
• External providers: APNS (Apple), GCM (Google), Amazon SES, Twilio

Slide 20

Slide 20 text

Tracking nightmare

Slide 21

Slide 21 text

Message Lifecycle

Slide 22

Slide 22 text

Message Lifecycle
I. Way to trace an event through the platform
II. Each component beacons records to Elasticsearch (ES) as it is processing an event
III. GUIDs are used to identify the entire lifecycle
IV. Complete visibility into the platform
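The beaconing idea on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the platform's actual code: the `LifecycleBeacon` class, the field names (`guid`, `component`, `stage`, `customer_id`, `country`, `message_type`) and the in-memory list standing in for Elasticsearch are all hypothetical.

```python
import json
import time
import uuid


class LifecycleBeacon:
    """Collects lifecycle records in memory (hypothetical stand-in for an ES client)."""

    def __init__(self):
        # In a real deployment each record would be indexed into Elasticsearch.
        self.records = []

    def emit(self, guid, component, stage, **fields):
        """Record that `component` reached `stage` while handling event `guid`."""
        record = {
            "guid": guid,
            "component": component,
            "stage": stage,
            "timestamp_ms": int(time.time() * 1000),
        }
        record.update(fields)
        self.records.append(record)
        return record

    def trace(self, guid):
        """Return every record beaconed for one GUID, in emission order."""
        return [r for r in self.records if r["guid"] == guid]


def new_guid():
    """One GUID is created per event and reused by every component."""
    return str(uuid.uuid4())


beacon = LifecycleBeacon()
guid = new_guid()
# Assumed operational fields, mirroring Customer ID / Country / Message Type.
common = {"customer_id": "12345", "country": "US", "message_type": "PASSWORD_RESET"}

beacon.emit(guid, "EventConsumer", "started", **common)
beacon.emit(guid, "EventConsumer", "done", **common)
beacon.emit(guid, "MessageProcessor", "started", **common)
beacon.emit(guid, "MessageProcessor", "done", **common)

for record in beacon.trace(guid):
    print(json.dumps(record, sort_keys=True))
```

Because every component attaches the same GUID, filtering on that one field is enough to reconstruct the whole lifecycle, and a new component only has to call `emit` to become visible.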

Slide 23

Slide 23 text

Message lifecycle (diagram): the Messaging Platform components (Event Consumer, Message Processor, Scheduler, In App Message Service, Device Token Service, Feedback Processor, Message & Event Metadata Service) send records labelled Started, Done and Processed to Elasticsearch. Other labels: Events, Messages, Push, Email, SMS/Voice; providers APNS (Apple), GCM (Google), Amazon SES, Twilio.

Slide 24

Slide 24 text

Tracking a password reset SMS message

Slide 25

Slide 25 text

Query By Customer ID
• EC (Event Consumer) Started
• EC Processed
• EC Processed
• EC Completed
• MP (Message Processor) Started
• MP Completed
• Feedback #1: Twilio Status = queued. The message was queued to be sent out by Twilio.
• Feedback #2: Twilio Status = sent. The message was accepted by the nearest upstream carrier.
• Feedback #3: Twilio Status = delivered. The carrier has acknowledged that the message was delivered to the handset.
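A lookup like the one on this slide could be expressed as a search request along these lines. The sketch only builds the request body and formats some invented sample hits. The field names (`customer_id`, `timestamp_ms`, `twilio_status`) and both helper functions are assumptions for illustration, not the schema used in the talk.

```python
import json


def lifecycle_query(customer_id, size=100):
    """Build a search body for one customer's lifecycle records, oldest first."""
    return {
        "query": {"term": {"customer_id": customer_id}},
        "sort": [{"timestamp_ms": {"order": "asc"}}],
        "size": size,
    }


def timeline(hits):
    """Turn lifecycle records into readable steps ordered by timestamp."""
    steps = []
    for hit in sorted(hits, key=lambda h: h["timestamp_ms"]):
        step = "{} {}".format(hit["component"], hit["stage"])
        status = hit.get("twilio_status")
        if status:
            step += " (Twilio status = {})".format(status)
        steps.append(step)
    return steps


# Invented sample records, deliberately out of order.
sample_hits = [
    {"component": "Feedback", "stage": "received", "twilio_status": "sent", "timestamp_ms": 5},
    {"component": "EC", "stage": "started", "timestamp_ms": 1},
    {"component": "MP", "stage": "completed", "timestamp_ms": 3},
    {"component": "Feedback", "stage": "received", "twilio_status": "queued", "timestamp_ms": 4},
    {"component": "MP", "stage": "started", "timestamp_ms": 2},
    {"component": "Feedback", "stage": "received", "twilio_status": "delivered", "timestamp_ms": 6},
]

print(json.dumps(lifecycle_query("12345"), indent=2))
for step in timeline(sample_hits):
    print(step)
```

Sorting by timestamp keeps the carrier feedback (queued, sent, delivered) in the order it arrived, which is what makes a single customer's message readable end to end.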

Slide 26

Slide 26 text

Ability to Investigate
• Easily extendable as more components or stages are added
• Numerous insights into the lifecycle of a message

Slide 27

Slide 27 text

Size on disk

Slide 28

Slide 28 text

Jun 12, 2015

Slide 29

Slide 29 text

What title was being messaged on June 12, 2015?

Slide 30

Slide 30 text

Tracking messaging for a title

Slide 31

Slide 31 text

Monitoring a New Arrival Title
Dashboard callouts:
• Query
• Add filters to drill down
• Histogram showing count
• Breakdown by status
• Breakdown of unsent
• Country heat-map
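The panels called out on this dashboard correspond roughly to a filtered search with a few aggregations. The sketch below builds such a request body by hand. Kibana generates its own queries, so this is only an approximation, and the field names (`title_id`, `status`, `country`, `timestamp`) and the function name are assumptions. The `interval` parameter follows the query DSL of the 1.x era; later Elasticsearch releases name it differently.

```python
import json


def title_dashboard_query(title_id, country=None, interval="hour"):
    """Build a request body: filter on a title, then aggregate for each panel."""
    must = [{"term": {"title_id": title_id}}]
    if country is not None:
        # Optional drill-down filter, like adding a filter in the dashboard.
        must.append({"term": {"country": country}})
    return {
        "size": 0,  # only the aggregations are needed, not individual hits
        "query": {"bool": {"must": must}},
        "aggs": {
            "by_status": {"terms": {"field": "status"}},
            "by_country": {"terms": {"field": "country"}},
            "over_time": {
                "date_histogram": {"field": "timestamp", "interval": interval}
            },
        },
    }


body = title_dashboard_query("title-123")
drill_down = title_dashboard_query("title-123", country="BR")
print(json.dumps(drill_down, indent=2))
```

Each aggregation feeds one panel: `by_status` for the status breakdown, `by_country` for the heat-map, and `over_time` for the count histogram.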

Slide 32

Slide 32 text

Phone number verification insights

Slide 33

Slide 33 text

What are the top countries where customers are trying to verify their phones

Slide 34

Slide 34 text

What are the top countries where customers are trying to verify their phones

Slide 35

Slide 35 text

How successful is the verify phone SMS in those countries?
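The question on this slide amounts to grouping verification SMS records by country and comparing delivered messages with attempts. The sketch below does that in memory on invented placeholder data. The record shape (`country`, `status`), the placeholder country codes and the function name are assumptions, and in practice the grouping would be done by Elasticsearch aggregations rather than in application code.

```python
from collections import Counter


def success_rate_by_country(records, top_n=3):
    """Return (country, attempts, delivered / attempts) for the busiest countries."""
    attempts = Counter()
    delivered = Counter()
    for record in records:
        attempts[record["country"]] += 1
        if record["status"] == "delivered":
            delivered[record["country"]] += 1
    return [
        (country, count, delivered[country] / count)
        for country, count in attempts.most_common(top_n)
    ]


# Invented sample data with placeholder country codes; not real figures.
sample = (
    [{"country": "AA", "status": "delivered"}]
    + [{"country": "AA", "status": "failed"}] * 3
    + [{"country": "BB", "status": "delivered"}] * 3
    + [{"country": "CC", "status": "delivered"}, {"country": "CC", "status": "failed"}]
)

for country, count, rate in success_rate_by_country(sample):
    print("{}: {} attempts, {:.0%} delivered".format(country, count, rate))
```

Ranking by attempts first and then showing the rate keeps a low-volume country from dominating the view because of a handful of failures.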

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

How successful is the verify phone SMS?

Slide 38

Slide 38 text

What is happening in Brazil?

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

What are the top errors for SMS delivery in BR?

Slide 41

Slide 41 text

Any guesses why Brazil has an unusually high number of SMS delivery failures?

Slide 42

Slide 42 text

Questions?

Slide 43

Slide 43 text

Integration with Elasticsearch API

Slide 44

Slide 44 text

Wings Integration

Slide 45

Slide 45 text

Reporting Tab
• Query Builder
• Time period
• Metrics on various dimensions in the context of this message

Slide 46

Slide 46 text

Query and Metrics

Slide 47

Slide 47 text

Elasticsearch Data

Slide 48

Slide 48 text

Payload details

Slide 49

Slide 49 text

Preview with real time enrichment

Slide 50

Slide 50 text

Questions?

Slide 51

Slide 51 text

Backend (us-east-1)
• Node types shown: Master Node, Tribe Node, Data Node
• Instance counts: 6 (r3.xlarge), 6 (m3.xlarge), 66 (i2.xlarge)

Slide 52

Slide 52 text

• us-east-1: 6 (r3.xlarge), 3 (m3.xlarge), 66 (i2.xlarge)
• us-west-2: 3 (m3.xlarge), 24 (i2.xlarge)
• eu-west-1: 3 (m3.xlarge), 24 (i2.xlarge)

Slide 53

Slide 53 text

ES Backend Details
1. ES version 1.5.2
2. Kibana 4.0.1
3. Time-based rotating daily indices
   – 14 day retention
4. Clusters are sized so that data nodes have about 40% free space
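Daily rotating indices with a 14 day retention can be sketched as follows. The index prefix `messages-`, the date format in the index name and both function names are assumptions, and the slide does not say whether the cutoff day itself is kept. The sketch only computes which index names would be expired and does not call Elasticsearch.

```python
from datetime import date, datetime, timedelta

INDEX_PREFIX = "messages-"  # hypothetical prefix, not from the talk
DATE_FORMAT = "%Y.%m.%d"    # assumed naming scheme, one index per day


def index_for(day):
    """Name of the daily index that records for `day` would be written to."""
    return INDEX_PREFIX + day.strftime(DATE_FORMAT)


def expired_indices(existing, today, retention_days=14):
    """Return daily indices older than the retention window, sorted by name."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in existing:
        if not name.startswith(INDEX_PREFIX):
            continue  # ignore unrelated indices
        try:
            day = datetime.strptime(name[len(INDEX_PREFIX):], DATE_FORMAT).date()
        except ValueError:
            continue  # prefix matched but the suffix is not a date
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


today = date(2015, 6, 12)
existing = [index_for(today - timedelta(days=n)) for n in (0, 13, 14, 15, 20)]
existing.append(".kibana")

print(index_for(today))
print(expired_indices(existing, today))
```

With one index per day, expiring old data means deleting whole indices rather than individual documents, which keeps retention cheap and makes disk usage predictable.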

Slide 54

Slide 54 text

Size on disk

Slide 55

Slide 55 text

Before Elasticsearch

Slide 56

Slide 56 text

Home Grown Tools – Distributed Grep 2

Slide 57

Slide 57 text

Home Grown Tools – Distributed Grep 2

Slide 58

Slide 58 text

EMPUI – The first attempt 2

Slide 59

Slide 59 text

EMPUI – The first attempt 2

Slide 60

Slide 60 text

Atlas Metrics