on customer and product relations
• Easy to iterate over entities and their relations through the Gremlin DSL
• Simple way to calculate common customer behaviours
• No complex matrix calculations
• Cassandra + Titan + Rexster + Python
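A minimal sketch of the kind of two-hop traversal these bullets describe (customer, to products bought, to other buyers, to their products), written in plain Python rather than Gremlin. The in-memory dictionaries stand in for the Titan graph, and the sample data, `build_graph`, and `recommend` are hypothetical names for illustration, not the original implementation.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (customer, product) "bought" edges.
PURCHASES = [
    ("alice", "shoes"), ("alice", "socks"),
    ("bob", "shoes"), ("bob", "socks"), ("bob", "hat"),
    ("carol", "shoes"), ("carol", "hat"),
    ("dave", "scarf"),
]


def build_graph(edges):
    """Index the bipartite graph in both directions."""
    bought = defaultdict(set)  # customer -> products
    buyers = defaultdict(set)  # product -> customers
    for customer, product in edges:
        bought[customer].add(product)
        buyers[product].add(customer)
    return bought, buyers


def recommend(customer, bought, buyers, limit=3):
    """Rank products bought by customers who share purchases with `customer`."""
    own = bought.get(customer, set())
    scores = Counter()
    for product in own:                      # hop 1: customer -> product
        for other in buyers[product]:        # hop 2: product -> other buyers
            if other == customer:
                continue
            for candidate in bought[other]:  # hop 3: other buyer -> products
                if candidate not in own:
                    scores[candidate] += 1
    # Highest count first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked[:limit]]


bought, buyers = build_graph(PURCHASES)
print(recommend("alice", bought, buyers))
```

Counting shared paths this way needs no matrix factorisation, but every request walks the neighbourhood of the customer, so a product with very many buyers makes the inner loops expensive.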
increasing response time
• Recommendations were calculated directly on the graph
• High computational cost when running several graph traversals at once (more on that later)
• Supernodes
• Hard to add tags or non-graph attributes (e.g. multiple email addresses referring to a single customer) without significantly increasing the graph size
• Events collected server side
• Implemented a tracker (1x1 pixel) and an async pipeline
(using Java)
• Implemented our own pixel tag to collect information directly from the browser
• Also developed our own analytics system
• Recommendations calculated outside the graph
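A rough sketch of the pixel-plus-async-pipeline idea, in Python for brevity (the slides say the real one was written in Java). The request handler only parses the query string, enqueues the event, and returns a 1x1 GIF; a background worker drains the queue. The URL, parameter names, and the functions `handle_pixel_request` and `consume` are assumptions made for illustration.

```python
import base64
import queue
import threading
from urllib.parse import parse_qs, urlsplit

# A transparent 1x1 GIF, returned as the response body of every tracking request.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

events = queue.Queue()  # buffer between request handling and processing
_STOP = object()        # sentinel that tells the worker to exit


def handle_pixel_request(url):
    """Turn the query string into an event, enqueue it, return the pixel."""
    params = parse_qs(urlsplit(url).query)
    event = {key: values[0] for key, values in params.items()}
    events.put(event)   # no processing on the request path
    return PIXEL_GIF


def consume(sink):
    """Worker loop: move events from the queue into `sink` until stopped."""
    while True:
        event = events.get()
        if event is _STOP:
            break
        sink.append(event)  # a real pipeline would write to storage here


collected = []
worker = threading.Thread(target=consume, args=(collected,))
worker.start()
handle_pixel_request(
    "https://tracker.example/p.gif?customer=42&event=view&product=shoes"
)
events.put(_STOP)
worker.join()
print(collected)
```

Keeping the handler this thin means a slow or failing consumer delays event processing without delaying the page that embeds the pixel.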
• Code was hard to maintain
• Personalised recommendations accessed the disk too many times
• Email and push notification APIs were down at the most important times
• Scale and isolate these behaviours: we can serve recommendations even if we are not collecting events
• Smaller and simpler codebase: refactoring won't affect the overall system too much, and deploys are not huge switches
• Faster to add new features and try new algorithms
• Better application profiling
recommendations are stored in Elasticsearch, easy to rebuild and query
• Personalised recommendation time dropped from 400 ms to 50 ms, which matters for conversion and customer bail-out
• Fewer traversals on the graph, less overall load on the system
Microservices
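The latency drop comes from moving the traversal off the request path: recommendations are computed ahead of time and a request becomes a single lookup. A minimal sketch of that split follows, with a plain dictionary standing in for the Elasticsearch index. `RecommendationStore`, its methods, and the fallback list are hypothetical; this is not the Elasticsearch client API.

```python
class RecommendationStore:
    """Precomputed recommendations keyed by customer id (stand-in for an index)."""

    def __init__(self, fallback):
        self._docs = {}
        self._fallback = list(fallback)  # served to customers with no entry

    def rebuild(self, computed):
        """Replace the whole index with freshly computed recommendations."""
        # Build the new mapping first, then swap it in with one assignment.
        self._docs = {cid: list(items) for cid, items in computed.items()}

    def get(self, customer_id, limit=3):
        """Serve a request with one lookup, with no graph traversal involved."""
        return self._docs.get(customer_id, self._fallback)[:limit]


store = RecommendationStore(fallback=["bestseller-1", "bestseller-2"])
store.rebuild({"alice": ["hat", "scarf", "gloves", "belt"]})
print(store.get("alice"))
print(store.get("unknown-customer"))
```

Because the store can be rebuilt from scratch at any time, the serving side keeps answering from the last build even while event collection or the graph job is unavailable.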