Slide 1

Slide 1 text

ANALYTICS

Slide 2

Slide 2 text

• Local social network
• Web, iOS and Android
• 1.3M users in the UK
• 1.5B events

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

Analytics are core to our business

Slide 9

Slide 9 text

We track millions of events a day

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

Subject: Here's what your neighbours are talking about...
To: [email protected]

Slide 15

Slide 15 text

Subject: Here's what your neighbours are talking about...
To: [email protected]

Slide 16

Slide 16 text

Subject: Here's what your neighbours are talking about...
To: [email protected]

Slide 17

Slide 17 text

Susan P.

Slide 18

Slide 18 text

Susan P.

Slide 19

Slide 19 text

Susan P.

Slide 20

Slide 20 text

We currently use the batch pipeline
Redshift

Slide 21

Slide 21 text

We gather insights once a day
Redshift (SQL): user_stats, message_stats

Slide 22

Slide 22 text

last_visit           last_email_open
2016/01/15 12:33:54  2016/07/12 22:13:10
2016/07/10 04:52:01  2016/03/11 21:01:03
2016/07/11 05:34:45  2016/07/12 14:30:21

Slide 23

Slide 23 text

day         id  impressions  engagments  u_impressions  u_engagments
2016/01/15  1   345          122         299            99
2016/01/15  2   123          99          94             84
2016/01/15  3   934          845         843            789
2016/01/14  1   899          777         744            645
2016/01/14  2   754          543         634            433
2016/01/14  3   103          99          91             77
2016/01/13  1   499          301         382            235
2016/01/13  2   1893         1400        991            1099

Slide 24

Slide 24 text

With this data we give businesses insights
"Your message got 890 impressions and 350 engagements."
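
A per-message summary like this one can be read as a roll-up of daily rows such as those in message_stats. The following is a minimal sketch under stated assumptions: the rows are taken from slide 23 as (day, id, impressions, engagements), which is an assumed reading of the column order, and `summarise` and `insight` are hypothetical helpers rather than the SQL the team actually runs.

```python
from collections import defaultdict

# Sample daily rows: (day, message id, impressions, engagements).
# The column order is an assumption about how the slide's table reads.
ROWS = [
    ("2016/01/15", 1, 345, 122),
    ("2016/01/14", 1, 899, 777),
    ("2016/01/13", 1, 499, 301),
    ("2016/01/15", 2, 123, 99),
    ("2016/01/14", 2, 754, 543),
    ("2016/01/13", 2, 1893, 1400),
]


def summarise(rows):
    """Sum impressions and engagements per message id across all days."""
    totals = defaultdict(lambda: {"impressions": 0, "engagements": 0})
    for _day, message_id, impressions, engagements in rows:
        totals[message_id]["impressions"] += impressions
        totals[message_id]["engagements"] += engagements
    return dict(totals)


def insight(message_id, totals):
    """Render a one-line insight for a message (wording borrowed from the slide)."""
    template = "Your message got {impressions} impressions and {engagements} engagements."
    return template.format(**totals[message_id])


TOTALS = summarise(ROWS)
print(insight(1, TOTALS))
```

In a batch setting the same roll-up would run once a day over the whole table, which is why the figure a business sees can be hours old.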

Slide 25

Slide 25 text

We tune each message's reach dynamically
Target: 2000
Current: 1200
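
The slide does not say how reach is tuned, so the following is only an illustrative sketch of one possible rule: close the gap between the target and the current count, but by no more than a fixed step per cycle. The function name, the `max_step` cap and the rule itself are hypothetical.

```python
def extra_reach(target, current, max_step=500):
    """Return how many more impressions to aim for in the next cycle.

    Hypothetical rule: close the gap to the target, but by no more than
    `max_step` per cycle so that reach grows gradually.
    """
    shortfall = max(target - current, 0)
    return min(shortfall, max_step)


# With the slide's numbers the shortfall is 800, which the cap limits to 500.
print(extra_reach(2000, 1200))
```

The fresher the `current` value, the less a rule like this overshoots, which is the motivation for moving away from once-a-day data.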

Slide 26

Slide 26 text

It works great, but the data has several hours of lag
event ingest Load SQL View S3 Load

Slide 27

Slide 27 text

We are going to start using the realtime pipeline
realtime analytics
realtime content router

Slide 28

Slide 28 text

realtime pipeline Redshift Kinesis λ

Slide 29

Slide 29 text

Kinesis λ λ DynamoDB Lambda Lambda

Slide 30

Slide 30 text

The latency between an event happening and us acting on it would be nearly zero
event ingest Kinesis Lambda DynamoDB Load
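
The Kinesis to Lambda to DynamoDB path can be sketched as a handler that decodes each batch and applies atomic counter increments. This is a minimal sketch, not the team's code: the payload fields `message_id` and `type`, the table key, and the injectable `table` argument are assumptions. The only AWS-specific facts relied on are that Kinesis record data arrives base64-encoded under `Records[i].kinesis.data`, and that a boto3 DynamoDB Table resource exposes `update_item`.

```python
import base64
import json
from collections import Counter


def decode_records(event):
    """Yield the JSON payload of each Kinesis record in a Lambda event."""
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        yield json.loads(raw)


def count_events(payloads):
    """Count events per (message_id, type); the field names are assumptions."""
    counts = Counter()
    for payload in payloads:
        counts[(payload["message_id"], payload["type"])] += 1
    return counts


def handler(event, context, table=None):
    """Lambda entry point: aggregate the batch, then increment counters.

    `table` should behave like a boto3 DynamoDB Table resource. It is
    injectable so the function can be exercised without AWS access.
    """
    counts = count_events(decode_records(event))
    if table is not None:
        for (message_id, event_type), n in counts.items():
            table.update_item(
                Key={"message_id": message_id},
                UpdateExpression="ADD #c :n",
                ExpressionAttributeNames={"#c": event_type},
                ExpressionAttributeValues={":n": n},
            )
    return {"processed": sum(counts.values())}


def make_event(payloads):
    """Build a Kinesis-shaped event for local experiments (hypothetical helper)."""
    return {
        "Records": [
            {"kinesis": {"data": base64.b64encode(json.dumps(p).encode()).decode()}}
            for p in payloads
        ]
    }
```

Aggregating within the batch before writing keeps the number of DynamoDB calls proportional to the number of distinct counters rather than the number of events.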

Slide 31

Slide 31 text

AWS Lambdas: 20,000-foot view
.py .js .java DEPENDENCIES ZIP AWS S3 λ
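
The packaging step in this picture (source files plus dependencies bundled into a ZIP) can be sketched with the standard library alone. The file names and contents below are hypothetical, and uploading the archive to S3 or Lambda is deliberately left out.

```python
import io
import zipfile


def build_bundle(files):
    """Return the bytes of a ZIP archive built from `files` (name -> source text)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, source in files.items():
            archive.writestr(name, source)
    return buffer.getvalue()


# A handler module plus one vendored dependency, both made up for illustration.
BUNDLE = build_bundle({
    "handler.py": "def handler(event, context):\n    return 'ok'\n",
    "vendor/dep.py": "VALUE = 1\n",
})
print(len(BUNDLE) > 0)
```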

Slide 32

Slide 32 text

AWS Lambdas: 20,000-foot view
• Zero-administration compute platform
• Connect Lambdas with AWS services
  • Kinesis, DynamoDB, API Gateway, S3, CloudWatch, Cron…
• Pricing based on usage:
  • ~$0.0000002 per run
  • ~$0.000000208 per 100 ms
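
To get a feel for what usage-based pricing means at "millions of events a day", here is a rough estimate using the two prices quoted on the slide. The workload figures (5 million invocations a day at 120 ms each) are hypothetical, and the sketch assumes duration is billed in 100 ms increments rounded up, with no free tier.

```python
import math

PRICE_PER_REQUEST = 0.0000002    # USD per run, from the slide
PRICE_PER_100MS = 0.000000208    # USD per 100 ms, from the slide


def lambda_cost(invocations, avg_duration_ms):
    """Estimate cost in USD, ignoring any free tier.

    Assumes duration is billed in 100 ms increments, rounded up.
    """
    billed_units = math.ceil(avg_duration_ms / 100)
    request_cost = invocations * PRICE_PER_REQUEST
    compute_cost = invocations * billed_units * PRICE_PER_100MS
    return request_cost + compute_cost


# Hypothetical workload: 5 million invocations a day, 120 ms each.
DAILY = lambda_cost(invocations=5_000_000, avg_duration_ms=120)
print(round(DAILY, 2))
```

Under these assumptions the request charge is about 1 USD and the compute charge about 2.08 USD per day, so duration dominates as soon as a function runs past the first 100 ms.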

Slide 33

Slide 33 text

There is a ‘but’
Deploying and managing them is a PITA

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

• Tool to create, wire and deploy AWS Lambdas using CloudFormation
• Python/JavaScript/Java/Golang/Scala runtimes…
• Supported integrations
  • API Gateway, CloudWatch, DynamoDB, Kinesis, S3

Slide 36

Slide 36 text

+ snowplow events simulator λ DynamoDB Kinesis

Slide 37

Slide 37 text

We are hiring! Thank you!
Software Engineer (data team)
Software Engineer (core team)
Infrastructure Engineer
Data Scientist