Mercan_internship_finalpresentation_nabe-ryo

1 Confidential - Do Not Share
Reward Modeling for Layout Optimizer
Ryo Watanabe
Marketplace / Recommendation Team

2 Ryo Watanabe (@nabe-ryo)
• 1st year of Master’s degree
• Internship from 8/1 to 9/30
• Recommendation team, ML engineer
• Mentor: @shido san
• Manager: @umechan san

3 Executive Summary
Tackled reward modeling for Layout Optimizer.
• Layout Optimizer
◦ Optimizes the layout of the home screen in the Mercari app.
◦ “Reward modeling” is needed to predict the performance of each layout so that the best one can be selected.
A new predictive model using an MLP (Multi-Layer Perceptron, a simple deep neural network model) trained on a new dataset achieved higher performance than the previous model.

4 Background: Home Screen
The home screen of the Mercari app can serve various layouts, each consisting of multiple components. We want to serve the optimal layout, the one that earns the highest engagement.
→ Layout Optimizer
[Figure: home screen, with the "Retargeting by Like" component and the "Recommended Keywords" component labeled]

5 Background: Layout Optimizer
Select the optimal layout of the home screen in the Mercari app using a bandit algorithm.
Bandit problem: choose, from several candidates, the one that maximizes a reward, while still exploring alternatives whose reward distributions are unknown.
However, the reward, which is based on the purchase of any item, arrives with a delay after the action of selecting a layout, so we cannot tell immediately which layout is the best.
→ It is necessary to predict the reward within a short time.
[Figure: the agent selects a layout from the candidates and receives a reward]
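The explore/exploit loop described above can be sketched as follows. This is a minimal illustration, not the algorithm used by Layout Optimizer: the slides do not say which bandit algorithm is used, so epsilon-greedy is an assumption, and the layout names, reward rates, and the `simulate` helper are all hypothetical. In the real setting the reward passed to `update` would be the value predicted by the reward model rather than a simulated purchase.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit over a fixed set of candidate layouts (assumed algorithm)."""

    def __init__(self, layouts, epsilon=0.1, seed=0):
        self.layouts = list(layouts)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {name: 0 for name in self.layouts}
        self.mean_reward = {name: 0.0 for name in self.layouts}

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.layouts)
        return max(self.layouts, key=lambda name: self.mean_reward[name])

    def update(self, layout, reward):
        # Incremental mean of the rewards observed for this layout.
        self.counts[layout] += 1
        n = self.counts[layout]
        self.mean_reward[layout] += (reward - self.mean_reward[layout]) / n


def simulate(true_rates, steps=5000, seed=0):
    """Run the bandit against hypothetical per-layout reward probabilities."""
    bandit = EpsilonGreedyBandit(true_rates.keys(), epsilon=0.1, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(steps):
        layout = bandit.select()
        reward = 1.0 if env.random() < true_rates[layout] else 0.0
        bandit.update(layout, reward)
    return bandit


# Hypothetical layouts and reward rates, for illustration only.
bandit = simulate({"layout_a": 0.05, "layout_b": 0.10, "layout_c": 0.50})
print(bandit.counts)
```

Because the estimate for a layout only improves once its reward has been observed, a delayed reward slows down every step of this loop, which is the motivation for predicting the reward early.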

6 Reward Modeling
Predict the reward, because it arrives late after Layout Optimizer (LO) selects a layout.
Input: Action logs of users
Output: Conversion rate (probability from 0 to 1)
• Problem settings
◦ Using the action logs from one hour, predict the probability that any item viewed via the home screen is purchased.

7 Model and data of conventional method
Data & Model
• Model
◦ Logistic regression
▪ Linear model, binary classification.
• Data
◦ Features
▪ 35 kinds of events from client-side logs.
▪ Binary features that represent whether each event has occurred or not.
◦ Labels
▪ Binary label that indicates whether any item viewed via home is purchased.
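As a rough sketch of the conventional setup, the block below trains a logistic regression on binary event features with plain batch gradient descent. Everything concrete in it is an assumption for illustration: the three feature columns stand in for the 35 events, the toy rows and labels are made up, and the training routine and its `lr` and `epochs` values are not taken from the slides.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    # Predicted probability of the positive label for one feature vector.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic_regression(X, y, lr=0.5, epochs=200):
    """Batch gradient descent on the mean log loss."""
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n_samples for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n_samples
    return w, b


# Hypothetical binary features: [item_view, like, search] occurred (1) or not (0).
X = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [1, 1, 1], [0, 0, 0], [1, 0, 1]]
# Hypothetical labels: 1 if an item viewed via home was purchased.
y = [0, 1, 0, 1, 0, 0]

w, b = train_logistic_regression(X, y)
print(predict_proba(w, b, [1, 1, 0]), predict_proba(w, b, [1, 0, 0]))
```

With binary inputs the model can only learn one weight per event, so it cannot distinguish a user who triggered an event once from one who triggered it many times.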

8 Model Improvement
Models used in this experiment.
• MLP
◦ Multi-Layer Perceptron
◦ A fundamental neural network architecture that consists of linear layers and activation functions.
• XGBoost
◦ Decision-tree-based algorithm that ensembles decision trees using a boosting algorithm.
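To make "linear layers and activation functions" concrete, here is a minimal forward pass of an MLP in plain Python. The hidden sizes 100 and 10 follow the appendix; the input width of 20, the ReLU activation, the sigmoid output, and the random initialization are assumptions, since the slides do not specify them. Training is omitted.

```python
import math
import random


def linear(x, weights, bias):
    # One linear layer: each output unit is a weighted sum of the inputs plus a bias.
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v):
    # Element-wise ReLU activation (assumed; the slides do not name the activation).
    return [max(0.0, x) for x in v]


def init_layer(n_in, n_out, rng):
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def init_mlp(sizes, seed=0):
    # sizes = [n_inputs, hidden_1, ..., n_outputs]
    rng = random.Random(seed)
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(params, x):
    h = x
    for weights, bias in params[:-1]:
        h = relu(linear(h, weights, bias))
    weights, bias = params[-1]
    (logit,) = linear(h, weights, bias)
    # Sigmoid maps the single output unit to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-logit))


# 20 input features is a hypothetical width; 100 and 10 are the hidden sizes from the appendix.
params = init_mlp([20, 100, 10, 1])
print(forward(params, [1.0] * 20))
```

Unlike logistic regression, the hidden layers let the model represent interactions between events, for example a search followed by several item views.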

9 Data Improvement
Feature modification
Counted the number of times each event occurred.
Removed some events that occurred less frequently.
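The difference between the old binary features and the new count features can be sketched as below. The event names, the toy sessions, and the `min_total` threshold used to drop rare events are hypothetical; the slides do not state how "less frequently occurred" was decided.

```python
from collections import Counter


def binary_features(events, vocabulary):
    # Baseline style: 1 if the event occurred at least once, else 0.
    seen = set(events)
    return [1 if name in seen else 0 for name in vocabulary]


def count_features(events, vocabulary):
    # New style: how many times each event occurred.
    counts = Counter(events)
    return [counts[name] for name in vocabulary]


def prune_vocabulary(sessions, min_total=2):
    # Keep only events that occur at least min_total times across all sessions
    # (the threshold rule is an assumption).
    totals = Counter(event for events in sessions for event in events)
    return sorted(name for name, total in totals.items() if total >= min_total)


# Hypothetical per-user event logs.
sessions = [
    ["item_view", "item_view", "like"],
    ["search", "item_view"],
    ["like", "rare_event"],
]

vocabulary = prune_vocabulary(sessions)
print(vocabulary)
print(binary_features(sessions[0], vocabulary))
print(count_features(sessions[0], vocabulary))
```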

10 Experiments
Experimental settings
Trained the models under the conditions below and evaluated their performance.
• Dataset
◦ Baseline
◦ New Data
◦ Period: 2022/7/18 - 2022/7/25
• Criteria
◦ AUC of ROC curve
◦ AUC of PR curve
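For reference, the two criteria can be computed as in the sketch below. ROC AUC is calculated as the fraction of positive/negative pairs ranked correctly, which is equivalent to the area under the ROC curve. For the PR curve, average precision is used as the summary; this is one common way to compute PR AUC and is an assumption here, since the slides do not say which estimator was used. The toy labels and scores are made up.

```python
def roc_auc(labels, scores):
    # Fraction of (positive, negative) pairs where the positive is scored higher;
    # ties count as half.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    # Mean of the precision values at the rank of each positive example.
    # Tied scores are not treated specially in this sketch.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical labels and predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores), average_precision(labels, scores))
```

On this toy input the functions give 0.75 and about 0.83. A random ranking scores about 0.5 on ROC AUC, while its PR AUC is roughly the fraction of positive examples, so PR AUC values tend to be low when purchases are rare.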

11 Results
The improved models trained with the new dataset scored higher than the baseline. MLP achieved the highest results on both ROC AUC and PR AUC.
Model                      ROC AUC ↑   PR AUC ↑
Baseline (Logistic Reg)    0.59        0.05
MLP + New Dataset          0.86        0.22
XGBoost + New Dataset      0.85        0.21

12 Conclusion
Contributions and future work.
I worked on improving the model and data used in Layout Optimizer, which optimizes the layout of the home screen in the Mercari app.
The new models and data achieved higher performance than the baseline.
A possible future work is to verify by A/B testing whether LO using the new reward model can improve metrics such as BCR and GMV.

13 Thank you for listening

14 Appendix

15 Details of Models
MLP parameters
Num of layers     2
Hidden units      100, 10
Optimizer         Adam
Epochs            20
Batch size        200
Learning rate     0.001
XGBoost parameters
Max depth         3
Num of trees      100
Learning rate     0.3
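One possible way to write these settings down as code is shown below. The slides do not say which libraries were used, so mapping the values onto scikit-learn's `MLPClassifier` and xgboost's `XGBClassifier` is an assumption, as is reading "Hidden units 100, 10" as two hidden layers. The `build_models` helper is hypothetical and needs both libraries installed.

```python
# Hyperparameters from the appendix, expressed as keyword arguments.
# The target libraries (scikit-learn, xgboost) are an assumption.
MLP_PARAMS = {
    "hidden_layer_sizes": (100, 10),  # two hidden layers: 100 and 10 units
    "solver": "adam",                 # optimizer
    "max_iter": 20,                   # number of epochs for the adam solver
    "batch_size": 200,
    "learning_rate_init": 0.001,
}

XGB_PARAMS = {
    "max_depth": 3,
    "n_estimators": 100,  # number of trees
    "learning_rate": 0.3,
}


def build_models():
    # Imported here so the configuration can be inspected without the libraries installed.
    from sklearn.neural_network import MLPClassifier
    from xgboost import XGBClassifier

    return MLPClassifier(**MLP_PARAMS), XGBClassifier(**XGB_PARAMS)
```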
