Slide 1

Slide 1 text

No content

Slide 2

Slide 2 text

Agenda › Introduction › About Me › What is LINE NEWS › Case Study › AB Testing in LINE NEWS › Post AB Test Analysis

Slide 3

Slide 3 text

About Me Yoshitaka Suzuki - Data Analyst - LINE NEWS & LINE Search

Slide 4

Slide 4 text

What is LINE NEWS News Tab LINE NEWS digest

Slide 5

Slide 5 text

What is LINE NEWS - Monthly Users: 75 million - Monthly Pageviews: 175 billion - Daily Articles: 8,000+

Slide 6

Slide 6 text

Personalized Contents FOR YOU "FOR YOU" delivers a personalized selection of articles to users, based on their interests and browsing histories.

Slide 7

Slide 7 text

Personalized Contents [Chart: Monthly Clicks on FOR YOU, monthly from 2019-01 to 2020-07]

Slide 8

Slide 8 text

Case Study: AB Testing in LINE NEWS › Pre-design › Verification › Overview

Slide 9

Slide 9 text

Sometimes, AB testing does not go well

Slide 10

Slide 10 text

- What is the purpose of the AB test? - How should the result be interpreted?

Slide 11

Slide 11 text

Overview of AB Test - Planning - Developing - AB Testing - Analysing and Post-Action Targets of this session

Slide 12

Slide 12 text

Overview of AB Test AB test for comparing Recommendation Engines - Control Group: Recommendation Engine A (existing) - Treatment Group: Recommendation Engine B (new)

Slide 13

Slide 13 text

Designing the AB Test - What is the purpose? - How to measure the effect?

Slide 14

Slide 14 text

What is the Purpose of the AB Test?
FOR YOU’s Goal
- We want users to read more articles
- Improve and maximize FOR YOU clicks, impressions, and CTR
LINE NEWS’s Goal
- We want to increase LINE NEWS’ pageviews and ad sales
- Potential side effects:
  - The increase in impressions and clicks on FOR YOU does not exceed the overall decrease in impressions and clicks for non-FOR YOU components
  - FOR YOU clicks increase, but overall pageviews and/or ad sales decrease
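The side effects listed on this slide can be checked mechanically once each KPI has a relative change (treatment over control, minus 1). The following is a minimal sketch, not the method used in the talk: the function name `check_side_effects`, the KPI names, and all numbers are hypothetical.

```python
def check_side_effects(component_lift, overall_lifts):
    """Return the overall KPIs that fell while the component KPI rose.

    component_lift: relative change of the component KPI (e.g. FOR YOU clicks).
    overall_lifts:  mapping of overall KPI name -> relative change.
    """
    if component_lift <= 0:
        # The component did not improve, so there is no partial
        # optimization to warn about.
        return []
    return [name for name, value in overall_lifts.items() if value < 0]


# Hypothetical values, for illustration only.
flagged = check_side_effects(
    component_lift=0.10,
    overall_lifts={"pageviews": -0.02, "ad_sales": 0.01},
)
print(flagged)
```

A non-empty result means the component improved at the expense of at least one service-wide KPI, which is the situation the slide warns about.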

Slide 15

Slide 15 text

How to Measure the Effect of the AB Test? Defining KPI
1. FOR YOU Clicks, Impressions and CTR (FOR YOU’s KPI)
2. Overall Pageviews per Session (LINE NEWS KPI)
3. Overall News Article Impressions (LINE NEWS KPI)
4. Overall Ad Sales (LINE NEWS KPI)
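Two of these KPIs are ratios, so it helps to fix their definitions before the test starts. The sketch below assumes CTR means clicks divided by impressions and that pageviews per session means total pageviews divided by total sessions; the `GroupTotals` type, its field names, and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GroupTotals:
    """Aggregated counts for one test group (hypothetical schema)."""
    for_you_impressions: int
    for_you_clicks: int
    pageviews: int
    sessions: int


def for_you_ctr(totals: GroupTotals) -> float:
    # Click-through rate: clicks per impression on the FOR YOU component.
    return totals.for_you_clicks / totals.for_you_impressions


def pageviews_per_session(totals: GroupTotals) -> float:
    # Overall pageviews divided by the number of sessions.
    return totals.pageviews / totals.sessions


# Hypothetical counts, for illustration only.
control = GroupTotals(
    for_you_impressions=1000,
    for_you_clicks=50,
    pageviews=3000,
    sessions=1000,
)
print(for_you_ctr(control), pageviews_per_session(control))
```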

Slide 16

Slide 16 text

Monitoring the AB test - Visualize data with Tableau - Plot both actual values and lift rates

Slide 17

Slide 17 text

AB Test Results
Lift = (treatment group’s value / control group’s value) - 1
1. FOR YOU Clicks, Impressions and CTR (FOR YOU’s KPI): 13% Lift
2. Overall Pageviews per Session (LINE NEWS KPI): 3% Lift
3. Overall News Article Impressions (LINE NEWS KPI): 0.7% Lift
4. Overall Ad Sales (LINE NEWS KPI): 2% Lift
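The lift formula on this slide translates directly into code. This is a minimal sketch: the function name `lift`, the KPI names, and the input values are hypothetical and are not the figures from the test.

```python
def lift(treatment_value: float, control_value: float) -> float:
    """Lift = (treatment group's value / control group's value) - 1."""
    if control_value == 0:
        raise ValueError("control value must be non-zero")
    return treatment_value / control_value - 1


# Hypothetical (treatment, control) pairs, for illustration only.
kpis = {
    "for_you_clicks": (120.0, 100.0),
    "pageviews_per_session": (3.3, 3.0),
}

for name, (treatment, control) in kpis.items():
    print(f"{name}: {lift(treatment, control):+.1%}")
```

A positive lift means the treatment group outperformed the control group on that KPI; a negative lift means it underperformed.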

Slide 18

Slide 18 text

But there was a problem

Slide 19

Slide 19 text

Qualitative Reaction to the AB Test - "Sensational articles are recommended through the new recommendation engine…" - "Compared to before, there are more articles that are not suited to my interests"

Slide 20

Slide 20 text

AB Test Summary - Quantitative: Good - Qualitative: Bad

Slide 21

Slide 21 text

Is this really a problem?

Slide 22

Slide 22 text

Additional Analysis Hypothesis: Some features are too strong in the new recommendation engine. There may be an unintended bias in the recommended articles. Additional analysis: Do the two groups differ in the article categories they are shown?

Slide 23

Slide 23 text

Additional Analysis Result - Impressions by article category and gender - Impressions of articles in the entertainment category increased significantly for male users [Table: excerpts from article category data]
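A breakdown like this can be sketched by computing, for each gender, the share of impressions per category in each group and then taking the difference. The code below is an illustration under assumed inputs, not the analysis from the talk: the row format `(gender, category)`, the function names, and the sample rows are all hypothetical.

```python
from collections import Counter


def category_share(rows):
    """Map (gender, category) -> share of that gender's impressions."""
    rows = list(rows)
    pair_counts = Counter(rows)
    gender_totals = Counter(gender for gender, _ in rows)
    return {
        (gender, category): count / gender_totals[gender]
        for (gender, category), count in pair_counts.items()
    }


def share_diff(control_rows, treatment_rows):
    """Treatment share minus control share for every (gender, category)."""
    control = category_share(control_rows)
    treatment = category_share(treatment_rows)
    keys = set(control) | set(treatment)
    return {k: treatment.get(k, 0.0) - control.get(k, 0.0) for k in keys}


# Hypothetical impression logs: one (gender, category) row per impression.
control_rows = [("male", "entertainment")] * 2 + [("male", "sports")] * 8
treatment_rows = [("male", "entertainment")] * 5 + [("male", "sports")] * 5

diff = share_diff(control_rows, treatment_rows)
for key, delta in sorted(diff.items()):
    print(key, f"{delta:+.0%}")
```

Comparing shares rather than raw counts keeps the comparison fair when the two groups differ in size.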

Slide 24

Slide 24 text

Additional Analysis Result Hypothesis: Some features are too strong in the new recommendation engine. There may be an unintended bias in the recommended articles. Conclusion: There is most likely an unintended bias, especially towards men. The exposure frequency of entertainment category articles is clearly higher in the Treatment group.

Slide 25

Slide 25 text

Next Action Improve the recommendation engine and run the AB test again. Examples of further AB tests conducted based on this result: - Adding features to surface more categories - Adding features to reflect long-term interests in recommendation results

Slide 26

Slide 26 text

Session Summary
Designing the AB Test
- Verbalize and visualize the purpose
- Aim for overall optimization, not partial optimization
After the AB Test
- Often does not go as designed
- Can be caused by qualitative and ambiguous things
- Quantify qualitative feedback
- Verbalize with quantitative indicators
- Prepare many dimensions for analysis

Slide 27

Slide 27 text

Thank you