Recommender System Architecture
> The LINE App sends events (impressions, clicks, etc.) and item requests with a user ID to the Recommender System for Service.
> The Trainer consumes the event stream and updates the model parameters.
> The Ranker scores the candidate items with the current model parameters and returns the top-k items for each user.
Ranker
> Current expected scores of the candidates (example): Item A: 0.7, Item B: 0.4, Item C: 0.6, Item D: 0.1
> The Ranker chooses an item from candidates A, B, C, … by using contextual bandits (a selection sketch follows below).
> Each expected score is computed by a prediction model corresponding to the item.
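The slides do not show the exact selection rule, so the following is only a minimal sketch of contextual-bandit selection via Thompson sampling: each candidate's model provides a posterior mean and variance for its expected score, one score is sampled per candidate, and the item with the highest sample is chosen. The `CandidateModel` class and its fields are illustrative assumptions.

```python
import random
from dataclasses import dataclass

@dataclass
class CandidateModel:
    """Hypothetical per-item prediction model: posterior mean/variance of the reward."""
    item_id: str
    mean: float
    variance: float

def choose_item(candidates: list[CandidateModel]) -> str:
    """Thompson sampling: draw one score per candidate from its posterior
    and pick the item with the largest sampled score."""
    sampled = {
        c.item_id: random.gauss(c.mean, c.variance ** 0.5) for c in candidates
    }
    return max(sampled, key=sampled.get)

# Example using the expected scores shown on the slide.
candidates = [
    CandidateModel("A", 0.7, 0.05),
    CandidateModel("B", 0.4, 0.05),
    CandidateModel("C", 0.6, 0.05),
    CandidateModel("D", 0.1, 0.05),
]
print(choose_item(candidates))
```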
Prediction Model
Bayesian Factorization Machine (FM) as an Arm of the Contextual Bandits
> Rewards per event type: Imp: 0.5, Click: 1.0, Mute: 0.0
> Balances the exploration-exploitation tradeoff
> Posterior obtained via a Laplace approximation
> Inputs: User ID and Item ID (through embeddings), user features (gender, age, …), and other features (timestamp, …); the Bayesian FM maps them to the output score (a sketch of the FM score follows below).
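The slide names a Bayesian FM with a Laplace approximation but does not give the model equations. As a reference point, the sketch below implements only the standard second-order FM score, y(x) = w0 + sum_i w_i x_i + 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ]; treating the weights as Gaussian posteriors and fitting them with a Laplace approximation (the Bayesian part) is not shown here.

```python
import numpy as np

def fm_score(x: np.ndarray, w0: float, w: np.ndarray, V: np.ndarray) -> float:
    """Second-order factorization machine score.
    x: feature vector (n,), w: linear weights (n,), V: embedding matrix (n, k)."""
    linear = w0 + w @ x
    # Pairwise interactions via the O(n*k) identity:
    # sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2]
    xv = x @ V                     # shape (k,)
    x2v2 = (x ** 2) @ (V ** 2)     # shape (k,)
    pairwise = 0.5 * float(np.sum(xv ** 2 - x2v2))
    return float(linear + pairwise)

# Toy call: 5 features (e.g. one-hot user/item IDs plus dense features), embedding dim 3.
rng = np.random.default_rng(0)
x = np.array([1.0, 0.0, 1.0, 0.3, 0.0])
print(fm_score(x, w0=0.1, w=rng.normal(size=5), V=rng.normal(size=(5, 3))))
```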
Parameter Server for Distributed ML
> Events from the LINE App flow into the Trainer, where multiple workers each hold a copy of the model and push gradient updates (Δw) to the Parameter Server.
> The Parameter Server holds the shared weights W; the Ranker's executors pull W to serve content requests (a push/pull sketch follows below).
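The actual server implementation is not shown on the slide; below is a minimal single-process sketch assuming a simple push/pull interface, where workers push Δw, the server folds it into W, and ranker executors pull the latest W. Class and method names are illustrative.

```python
import numpy as np

class ParameterServer:
    """Toy in-memory parameter server holding the shared weights W."""

    def __init__(self, dim: int, lr: float = 0.1):
        self.w = np.zeros(dim)
        self.lr = lr

    def push(self, delta_w: np.ndarray) -> None:
        """A trainer worker pushes a gradient; the server applies it to W."""
        self.w -= self.lr * delta_w

    def pull(self) -> np.ndarray:
        """A ranker executor pulls the current weights to score requests."""
        return self.w.copy()

# A worker computes a gradient on its mini-batch of events and pushes it;
# a ranker executor pulls W before serving contents.
server = ParameterServer(dim=4)
grad_from_worker = np.array([0.2, -0.1, 0.0, 0.4])
server.push(grad_from_worker)
w_for_ranker = server.pull()
```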
Asynchronous Distributed Online Learning
> The parameter server and the trainers communicate asynchronously, so gradients can arrive stale or out of order.
> In this situation, learning does not work well if the parameter server simply accumulates the incoming gradients.
> Our asynchronous distributed learning algorithm addresses this with two mechanisms: Deceleration and Backtrack (see the sketch below).
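The slides name the two mechanisms but not their definitions, so the sketch below is only one plausible reading, assuming Deceleration means down-weighting a gradient by its staleness and Backtrack means restoring an earlier checkpoint when an update moves the weights too far. Both interpretations are assumptions for illustration, not the algorithm presented in the talk.

```python
import numpy as np

class StalenessAwareServer:
    """Illustrative asynchronous server (assumed semantics, not LINE's algorithm)."""

    def __init__(self, dim: int, lr: float = 0.1, max_step: float = 1.0):
        self.w = np.zeros(dim)
        self.version = 0              # incremented on every applied update
        self.lr = lr
        self.max_step = max_step
        self._checkpoint = self.w.copy()

    def push(self, delta_w: np.ndarray, worker_version: int) -> None:
        staleness = self.version - worker_version
        # "Deceleration" (assumed): stale gradients contribute less.
        scale = 1.0 / (1.0 + staleness)
        candidate = self.w - self.lr * scale * delta_w
        # "Backtrack" (assumed): reject updates that jump too far and
        # fall back to the last checkpoint instead.
        if np.linalg.norm(candidate - self._checkpoint) > self.max_step:
            self.w = self._checkpoint.copy()
        else:
            self.w = candidate
            self._checkpoint = self.w.copy()
        self.version += 1

    def pull(self) -> tuple[np.ndarray, int]:
        """Workers pull both the weights and their version to report staleness."""
        return self.w.copy(), self.version
```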
Primary Performance Metric
Why is the score used as the main indicator?
> Consistent with the user-satisfaction trends obtained from questionnaire research.
> Easy to calculate.
> Stable under temporary fluctuations caused by users' unfamiliarity, such as when we release new types of contents or expand the target users.
Offline Test
Off-policy Evaluation
> We use the More Robust Doubly Robust (MRDR) estimator to estimate the performance of a new logic from data generated by other logics (a sketch of the underlying DR estimator follows below).
Framework of the Offline Test to Evaluate a New Logic
> The offline test environment's parameter server and trainers are clones of the production system, driving an offline Ranker.
> We use the event logs stored in the DataLake, processed with PySpark.
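MRDR itself is not reproduced here; the sketch below only shows the standard doubly robust (DR) off-policy estimator that MRDR builds on, V_DR = (1/n) * sum [ sum_a pi(a|x) q_hat(x, a) + rho * (r - q_hat(x, a_logged)) ] with importance weight rho = pi(a_logged|x) / mu(a_logged|x). MRDR differs mainly in how the reward model q_hat is trained (to minimize the variance of this estimator). Variable and function names are illustrative.

```python
import numpy as np

def doubly_robust_value(logs, target_policy, q_hat) -> float:
    """Standard DR estimate of a target policy's value from logged bandit data.
    logs: iterable of (x, a, r, mu_a), where mu_a is the logging policy's
          probability of the logged action a given context x.
    target_policy(x): dict mapping each action to its probability under the new logic.
    q_hat(x, a): estimated reward model (e.g. the prediction model above)."""
    values = []
    for x, a, r, mu_a in logs:
        pi = target_policy(x)
        # Direct-method term: expected reward of the new policy under q_hat.
        dm = sum(p * q_hat(x, act) for act, p in pi.items())
        # Importance-weighted correction using the logged reward.
        rho = pi.get(a, 0.0) / mu_a
        values.append(dm + rho * (r - q_hat(x, a)))
    return float(np.mean(values))
```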