Slide 1

Building a recommender from a big behavior graph over Cassandra

Arthur Grava, Technical Lead - Big Data
@arthur_grava | [email protected]

Gleicon Moraes, Director of Data Engineering
https://luc.id | [email protected]

Slide 2

• 786 stores
• 8 distribution centers
• +18k employees
• +40 million clients
• 16 million unique visitors per month

Slide 3

Recommender Systems

Slide 4

Proof of Concept

Slide 5

POC

Slide 11

POC - Environment

Slide 12

Why a graph database?
• Intuitive schema modeling
• Abstraction over customer and product relations
• Easy to iterate over entities and their relations through the Gremlin DSL (see the sketch below)
• Simple way to compute common customer behaviours
• No need for complex matrix calculations
• Stack: Cassandra + Titan + Rexster + Python
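
A minimal sketch of how this stack can be queried from Python, assuming Titan is exposed through Rexster's Gremlin extension over HTTP; the endpoint, graph name, "sku" property, and "viewed" edge label are illustrative assumptions, not the talk's actual code:

    import requests

    # Assumed Rexster endpoint for the Titan graph
    REXSTER = "http://localhost:8182/graphs/titan"

    def gremlin(script):
        # Rexster's Gremlin extension evaluates a Groovy script and returns JSON
        r = requests.get(REXSTER + "/tp/gremlin", params={"script": script})
        r.raise_for_status()
        return r.json()["results"]

    # SKUs viewed by the customers who also viewed "tv_1"
    print(gremlin("g.V('sku','tv_1').in('viewed').out('viewed').sku"))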

Slide 13

Schema

Slide 14

Graph Language

Slide 19

Graph Language

SKU    COUNT
tv_2   1
tv_3   2
tv_4   3
tv_5   1

Slide 20

Graph Language

SKU    COUNT
tv_4   3
tv_3   2
tv_2   1
tv_5   1
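
A sorted table like this is what a Gremlin groupCount traversal yields: start at a product, hop to the customers who viewed it, hop back out to everything else they viewed, and count. A sketch, again going through Rexster from Python; the seed SKU "tv_1" and the property and edge names are illustrative assumptions:

    import requests

    REXSTER = "http://localhost:8182/graphs/titan"   # assumed endpoint

    # Count co-viewed SKUs on the graph side and return them sorted,
    # which is exactly the shape of the table above
    script = """
        m = [:]
        g.V('sku','tv_1').in('viewed').out('viewed')
         .hasNot('sku','tv_1').groupCount(m){it.sku}.iterate()
        m.sort { -it.value }
    """
    r = requests.get(REXSTER + "/tp/gremlin", params={"script": script})
    print(r.json()["results"])   # e.g. [{"tv_4": 3, "tv_3": 2, "tv_2": 1, "tv_5": 1}]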

Slide 21

Results
• Running an A/B test against the current solution, using WVAV ("who viewed also viewed") recommendations
• 30% increase in sales

Slide 22

Limitations
• Unnecessary Python layer
  • This layer was significantly increasing response time
• Recommendations were calculated directly on the graph
  • High computational cost when doing several traversals on the graph at once (more on that later)
  • Supernodes
• Hard to add tags or non-graph attributes (multiple email addresses referring to a single customer) without significantly increasing the graph size
• Events were collected server-side
  • Implemented a tracker (1x1 pixel) and an async pipeline (sketched below)
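
A minimal sketch of the kind of 1x1-pixel tracker plus async pipeline the last point describes, assuming Flask and an in-process queue as a stand-in for the real pipeline; the route and event parameters are illustrative assumptions:

    import threading, queue
    from flask import Flask, Response, request

    app = Flask(__name__)
    events = queue.Queue()           # stand-in for the real async pipeline

    # Minimal transparent 1x1 GIF returned to the browser
    PIXEL = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
             b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01\x00"
             b"\x00\x02\x02D\x01\x00;")

    @app.route("/t.gif")
    def track():
        # The query string carries the client-side event, e.g.
        # /t.gif?customer=42&sku=tv_1&action=view
        events.put(dict(request.args))
        return Response(PIXEL, mimetype="image/gif")

    def consume():
        # In production this step would write events into the behavior graph
        while True:
            print("event:", events.get())

    threading.Thread(target=consume, daemon=True).start()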

Slide 23

Growing up

Slide 24

Production
• No more proxies, direct access to the graph (using Java)
• Implemented our own pixel tag to collect information directly from the browser
• Developed our own analytics system as well
• Recommendations now calculated outside the graph

Slide 27

Production environment

Slide 28

Results
• Average response time cut in half
• Reduced load on Cassandra, which enabled:
  • User-centric recommendations
  • Emails and push notifications
• 25% share on emails
• 23.8% of recommendations delivered as push notifications

Slide 29

Problems
• Too many responsibilities in the API module
• Hard to maintain the code
• Hitting disk too many times for personalized recommendations
• Email and push notification API went down at the most important times

Slide 30

Problems - data layout on Cassandra tables
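
The slide's actual table layout is not in the transcript; below is a hypothetical sketch of the kind of layout that addresses the disk-access problem above: one partition per customer, clustered by rank, so serving a personalized recommendation is a single-partition read. Keyspace, table, and column names are assumptions, using the cassandra-driver Python client:

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect()   # assumed local cluster

    # Assumes a "recsys" keyspace already exists
    session.execute("""
        CREATE TABLE IF NOT EXISTS recsys.user_recommendations (
            customer_id text,
            rank int,
            sku text,
            score double,
            PRIMARY KEY (customer_id, rank)
        )
    """)

    # One partition read returns a customer's top recommendations in order
    rows = session.execute(
        "SELECT sku, score FROM recsys.user_recommendations "
        "WHERE customer_id = %s LIMIT 10", ("customer_42",))
    for row in rows:
        print(row.sku, row.score)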

Slide 31

Scaling up for good

Slide 32

Microservices
• Separation of concerns: event collection and serving recommendations
• Scale and isolate these behaviours: we can serve recommendations even if we are not collecting events
• Small, simpler codebase: refactoring won't affect the overall system too much, and deploys are not huge switches
• Faster to add new features and try new algorithms
• Better application profiling

Slide 33

Microservices
• Moving recommendations from Cassandra to Elasticsearch
• Pre-calculated recommendations are stored in Elasticsearch, easy to rebuild and query (see the sketch below)
• Personalized recommendation time dropped from 400ms to 50ms - that matters for conversion and customer bail-out
• Fewer traversals on the graph, less overall load on the system
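
A minimal sketch of this pattern against Elasticsearch's REST API: each customer's pre-calculated list is stored as one document, so serving is a single lookup instead of a graph traversal. The index name, document ids, and fields are illustrative assumptions:

    import requests

    ES = "http://localhost:9200"   # assumed Elasticsearch endpoint

    # Rebuilding is just re-indexing the customer's document
    requests.put(ES + "/recommendations/_doc/customer_42", json={
        "skus": ["tv_4", "tv_3", "tv_2", "tv_5"],
    })

    # Serving: one document GET, no traversal involved
    doc = requests.get(ES + "/recommendations/_doc/customer_42").json()
    print(doc["_source"]["skus"])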

Slide 34

Business impact

From 6% to 15% share of sales
35% share of email sales
31.6% of recommendations as push notifications

Slide 35

Microservices environment

Slide 36

Questions?
