Layout Optimizer

Executive Summary
• Layout Optimizer
  ◦ Optimizes the layout of the home screen in the Mercari app.
  ◦ “Reward modeling” is needed to predict the performance of each layout so that the best one can be selected.
• A new predictive model using an MLP (Multi-Layer Perceptron, a simple deep neural network model), trained on a new dataset, achieved higher performance than the previous one.

Background: Home Screen
The app can serve various layouts that consist of multiple components. We want to serve the optimal layout, the one that earns the highest engagement.
→ Layout Optimizer
[Figure: home screen with the "Retargeting by Like" component and the "Recommended Keywords" component labeled]

Select the optimal layout of the home screen in the Mercari app using a bandit algorithm.
The bandit problem: choose the candidate that maximizes the reward among several candidates, while exploring alternatives whose reward distributions are unknown.
However, the reward is based on the purchase of any item, so it arrives with a delay after the action of selecting a layout. We therefore cannot tell immediately which layout is best.
→ It is necessary to predict the reward in a short time.
[Figure: bandit loop with the labels Agent, Select, Candidates, Reward]
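
The explore/exploit trade-off described above can be sketched with an epsilon-greedy bandit. This is only an illustration: the slides do not say which bandit algorithm LO uses, and the layout names, reward rates, `epsilon`, and round count below are hypothetical. In LO the reward fed to the update would be the short-term predicted reward rather than the simulated draw used here.

```python
import random

def select_layout(estimates, epsilon, rng):
    """Explore a random layout with probability epsilon, otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        return rng.choice(list(estimates))
    return max(estimates, key=estimates.get)

def update_estimate(estimates, counts, layout, reward):
    """Update the running mean reward of the selected layout."""
    counts[layout] += 1
    estimates[layout] += (reward - estimates[layout]) / counts[layout]

rng = random.Random(0)
# Hypothetical true conversion rates, unknown to the agent.
true_rates = {"layout_a": 0.02, "layout_b": 0.08, "layout_c": 0.04}
estimates = {name: 0.0 for name in true_rates}
counts = {name: 0 for name in true_rates}

for _ in range(30000):
    layout = select_layout(estimates, epsilon=0.2, rng=rng)
    # Simulated reward; a reward model would supply a predicted value instead.
    reward = 1.0 if rng.random() < true_rates[layout] else 0.0
    update_estimate(estimates, counts, layout, reward)

best_layout = max(estimates, key=estimates.get)
print(best_layout, estimates)
```

A larger `epsilon` spends more traffic on exploration, while a smaller one commits earlier to the layout that currently looks best.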

Predict the reward, because it arrives late after LO selects a layout.
Input: Action logs of users
Output: Conversion rate (a probability from 0 to 1)
• Problem setting
  ◦ Using one hour of action logs, predict the probability that any item viewed via the home screen is purchased.

Conventional method: Data & Model
• Model
  ◦ Logistic regression
    ▪ Linear model, binary classification.
• Data
  ◦ Features
    ▪ 35 kinds of events from client-side logs.
    ▪ Binary features that indicate whether each event has occurred.
  ◦ Labels
    ▪ Binary label that indicates whether any item viewed via the home screen was purchased.
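
As a rough sketch of the baseline above, the snippet below encodes a user's events as binary features and scores them with a logistic regression. The event names, weights, and bias are hypothetical placeholders (the real feature set has 35 event types and learned weights that are not given in the slides).

```python
import math

# Hypothetical event types; the baseline uses 35 kinds of client-side events.
EVENT_TYPES = ["home_view", "item_view", "like", "search"]

def encode_events(observed_events):
    """Return one 0/1 feature per event type: 1 if the event occurred at least once."""
    seen = set(observed_events)
    return [1 if event in seen else 0 for event in EVENT_TYPES]

def predict_purchase_probability(features, weights, bias):
    """Logistic regression: sigmoid of a weighted sum of the binary features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters; in practice these are fitted to labeled logs.
weights = [0.1, 0.8, 1.2, 0.4]
bias = -4.0

features = encode_events(["home_view", "like", "like"])
probability = predict_purchase_probability(features, weights, bias)
print(features, probability)
```

Because the features only record whether an event occurred, repeated events (the two `like` events here) do not change the input.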

Models used in this experiment.
• MLP
  ◦ Multi-Layer Perceptron
  ◦ A fundamental neural network architecture that consists of linear layers and activation functions.
• XGBoost
  ◦ A decision-tree-based algorithm that builds an ensemble of decision trees by boosting.
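
To make "linear layers and activation functions" concrete, here is a minimal forward pass of an MLP with hidden layers of 100 and 10 units, matching the hidden sizes listed in the parameter table. The input size of 35, the ReLU activation, the sigmoid output, and the random untrained weights are all assumptions for illustration; the slides do not specify them.

```python
import math
import random

def linear(inputs, weights, biases):
    """Linear layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """ReLU activation (assumed; the slides do not name the activation)."""
    return [max(0.0, v) for v in values]

def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def mlp_forward(features, layers):
    """Linear + ReLU for each hidden layer, then linear + sigmoid for the output."""
    hidden = features
    for weights, biases in layers[:-1]:
        hidden = relu(linear(hidden, weights, biases))
    out_weights, out_biases = layers[-1]
    return sigmoid(linear(hidden, out_weights, out_biases)[0])

rng = random.Random(0)
n_features = 35  # assumption: the new dataset's feature count is not stated
sizes = [n_features, 100, 10, 1]
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

features = [float(rng.random() < 0.2) for _ in range(n_features)]
probability = mlp_forward(features, layers)
print(probability)
```

In practice a framework would handle training (for example with the Adam optimizer); this sketch only shows how an input flows through the layers.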

The models were evaluated under the conditions below.
• Dataset
  ◦ Baseline
  ◦ New Data
  ◦ Period: 2022/7/18 - 2022/7/25
• Criteria
  ◦ AUC of the ROC curve
  ◦ AUC of the PR curve
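
The two criteria above can be computed as in the following sketch. ROC AUC is computed as the fraction of positive/negative pairs ranked correctly, and the PR-curve area is approximated by average precision, which is one common estimator; the slides do not say which estimator was used. The labels and scores are made-up toy data, and tied scores are not handled specially in the average-precision function.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits

# Toy data: 2 purchases among 8 sessions.
labels = [0, 0, 1, 0, 1, 0, 0, 0]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05, 0.30, 0.15]

roc = roc_auc(labels, scores)
pr = average_precision(labels, scores)
print(roc, pr)
```

When positives are rare, a random ranking yields an average precision close to the positive rate, so PR AUC values tend to be much lower than ROC AUC values on the same data.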

Results
Models using the new dataset scored higher than the baseline. The MLP achieved the highest ROC AUC and PR AUC.

Model                      ROC AUC ↑   PR AUC ↑
Baseline (Logistic Reg)    0.59        0.05
MLP + New Dataset          0.86        0.22
XGBoost + New Dataset      0.85        0.21

I worked on improving the model and data used in the Layout Optimizer (LO), which optimizes the layout of the home screen in the Mercari app. The new models and data achieved higher performance than the baseline. A possible next step is to run an A/B test to see whether LO with the new reward model improves metrics such as BCR and GMV.

MLP parameters
Num of layers     2
Hidden units      100, 10
Optimizer         Adam
Epochs            20
Batch size        200
Learning rate     0.001

XGBoost parameters
Max depth         3
Num of trees      100
Learning rate     0.3