PDCA to improve the LINE NEWS recommendation engine

LINE DevDay 2020

November 27, 2020

Transcript

  1. None
  2. Agenda › Introduction › About Me › What is LINE NEWS › Case Study › AB Testing in LINE NEWS › Post AB Test Analysis
  3. About Me: Yoshitaka Suzuki - Data Analyst - LINE NEWS & LINE Search
  4. What is LINE NEWS News Tab LINE NEWS digest

  5. What is LINE NEWS
      - Monthly Users: 75 million
      - Monthly Pageviews: 175 billion
      - Daily Articles: 8,000+
  6. Personalized Contents: FOR YOU. "FOR YOU" delivers a personalized selection of articles to users, based on their interests and browsing histories.
  7. Personalized Contents [Chart: Monthly Clicks on FOR YOU, by month from 201901 to 202007]
  8. Case Study: AB Testing in LINE NEWS › Pre-design › Verification › Overview
  9. Sometimes, AB testing does not go well

  10. Questions to ask:
      - What is the purpose of the AB test?
      - How should the result be interpreted?
  11. Overview of AB Test. Diagram labels: AB Testing; Analysing and Post-Action Planning; Developing; Targets of this session
  12. Overview of AB Test: AB test for comparing Recommendation Engines
      - Control Group: Recommendation Engine A (existing)
      - Treatment Group: Recommendation Engine B (new)
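
The slides do not say how users were split between the two groups. A common approach, shown here only as a minimal sketch under that assumption, is to hash the user ID so that each user always sees the same engine. The function name, the experiment label, and the 50/50 split are all hypothetical.

```python
import hashlib


def assign_group(user_id: str,
                 experiment: str = "for_you_engine_test",  # hypothetical label
                 treatment_share: float = 0.5) -> str:      # assumed 50/50 split
    """Deterministically map a user to 'control' (Engine A) or 'treatment' (Engine B)."""
    # Salting with the experiment name keeps assignments independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Interpret the first 32 bits of the hash as a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"
```

Because the assignment depends only on the user ID and the experiment label, the same user lands in the same group on every request, which keeps the comparison between Engine A and Engine B clean.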
  13. Designing the AB Test
      - What is the purpose?
      - How to measure the effect?
  14. What is the Purpose of the AB Test?
      FOR YOU’s Goal
      - We want users to read more articles
      - Improve and maximize FOR YOU clicks, impressions, and CTR
      LINE NEWS’s Goal
      - We want to increase LINE NEWS' pageviews & ad sales
      - Potential side effects:
        - The increase in impressions and clicks on FOR YOU does not exceed the overall decrease in impressions and clicks for non-FOR YOU components
        - FOR YOU clicks increase, but overall pageviews and/or ad sales decrease
  15. How to Measure the Effect of the AB test? Defining KPI
      1. FOR YOU Clicks, Impressions and CTR (FOR YOU’s KPI)
      2. Overall Pageviews per Session (LINE NEWS KPI)
      3. Overall News Article Impressions (LINE NEWS KPI)
      4. Overall Ad Sales (LINE NEWS KPI)
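
The four KPIs above can be computed per group from aggregate counts. The following is a minimal sketch, not the team's actual pipeline: the `GroupCounts` fields and the `compute_kpis` function are hypothetical names, and CTR is assumed to be clicks divided by impressions.

```python
from dataclasses import dataclass


@dataclass
class GroupCounts:
    """Aggregate counts for one AB test group (field names are illustrative)."""
    for_you_clicks: int
    for_you_impressions: int
    pageviews: int
    sessions: int
    article_impressions: int
    ad_sales: float


def compute_kpis(c: GroupCounts) -> dict:
    """Return the FOR YOU KPI and the three overall LINE NEWS KPIs for one group."""
    return {
        # KPI 1: FOR YOU clicks, impressions, and CTR (assumed clicks / impressions)
        "for_you_clicks": c.for_you_clicks,
        "for_you_impressions": c.for_you_impressions,
        "for_you_ctr": c.for_you_clicks / c.for_you_impressions if c.for_you_impressions else 0.0,
        # KPI 2: overall pageviews per session
        "pageviews_per_session": c.pageviews / c.sessions if c.sessions else 0.0,
        # KPI 3: overall news article impressions
        "article_impressions": c.article_impressions,
        # KPI 4: overall ad sales
        "ad_sales": c.ad_sales,
    }
```

Keeping the FOR YOU metric and the overall metrics in one structure makes it harder to report a FOR YOU gain without also looking at the overall numbers.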
  16. Monitoring the AB test
      - Visualize data with Tableau
      - Plot both actual values and lift rates
  17. AB Test Results
      Lift: 13%, 3%, 0.7%, 2%
      Lift = (treatment group’s values / control group’s values) - 1
      1. FOR YOU Clicks, Impressions and CTR (FOR YOU’s KPI)
      2. Overall Pageviews per Session (LINE NEWS KPI)
      3. Overall News Article Impressions (LINE NEWS KPI)
      4. Overall Ad Sales (LINE NEWS KPI)
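
The lift definition on this slide translates directly into code. This is a small sketch; the function names are illustrative, and any numbers passed in are the caller's own data, not figures from the talk.

```python
def lift(treatment_value: float, control_value: float) -> float:
    """Lift = (treatment group's value / control group's value) - 1."""
    if control_value == 0:
        raise ValueError("control value must be non-zero to compute a lift")
    return treatment_value / control_value - 1


def lift_table(treatment: dict, control: dict) -> dict:
    """Compute the lift for every KPI name present in the control group."""
    return {name: lift(treatment[name], control[name]) for name in control}
```

For instance, `lift(113, 100)` is approximately 0.13, which would be reported as a 13% lift. A negative result means the treatment group underperformed the control group on that KPI.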
  18. But there was a problem

  19. Qualitative Reaction to the AB test
      - "Sensational articles are recommended through the new recommendation engine…"
      - "Compared to before, there are more articles that are not suited to my interests"
  20. AB Test Summary: Quantitative: Good. Qualitative: Bad.

  21. Is this really a problem?

  22. Additional Analysis
      - Hypothesis: Some features are too strong in the new recommendation engine. There may be an unintended bias in the recommended articles.
      - Additional analysis: Is there a difference in the types of article categories exposed in the two groups?
  23. Additional Analysis Result
      - Impressions by article category and gender
      - Impressions of articles in the entertainment category have increased significantly for male users
      [Table: Excerpts from article category data]
  24. Additional Analysis Result
      - Hypothesis: Some features are too strong in the new recommendation engine. There may be an unintended bias in the recommended articles.
      - Conclusion: There is most likely an unintended bias, especially towards men. The exposure frequency of entertainment category articles is clearly higher in the Treatment group.
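
One way to run the category comparison described above is to compute each category's share of impressions within a group and gender, then subtract the control share from the treatment share. The sketch below assumes impression logs have already been aggregated into `(group, gender, category, impressions)` rows; that row layout and the function names are hypothetical.

```python
from collections import defaultdict


def category_shares(rows):
    """Share of impressions per category within each (group, gender) segment.

    rows: iterable of (group, gender, category, impressions) tuples.
    """
    totals = defaultdict(float)
    by_category = defaultdict(float)
    for group, gender, category, impressions in rows:
        totals[(group, gender)] += impressions
        by_category[(group, gender, category)] += impressions
    # Skip segments with no impressions to avoid dividing by zero.
    return {
        key: value / totals[key[:2]]
        for key, value in by_category.items()
        if totals[key[:2]] > 0
    }


def share_shift(rows, gender):
    """Treatment share minus control share, per category, for one gender."""
    shares = category_shares(rows)
    categories = {cat for (_, g, cat) in shares if g == gender}
    return {
        cat: shares.get(("treatment", gender, cat), 0.0)
        - shares.get(("control", gender, cat), 0.0)
        for cat in categories
    }
```

A large positive shift for one category in one segment, with matching negative shifts elsewhere, is the kind of pattern that would support the bias hypothesis. Comparing shares rather than raw impressions keeps the result from being dominated by differences in group size.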
  25. Next Action: Improve the recommendation engine and run the AB test again
      Examples of further AB tests conducted based on this result include:
      - Adding features to bring out more categories
      - Adding features to reflect long-term interests in recommendation results
  26. Session Summary
      Designing the AB Test
      - Verbalize and visualize the purpose
      - Aim for overall optimization, not partial optimization
      After the AB Test
      - Often does not go as designed
      - Can be caused by qualitative and ambiguous things
      - Quantify qualitative feedback
      - Verbalize with quantitative indicators
      - Prepare many dimensions for analysis
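
As one illustration of turning a qualitative complaint such as "the recommendations feel one-sided" into a quantitative indicator, the category mix of recommended articles can be summarized with Shannon entropy. The talk does not state which indicator was used, so this is purely an assumed example with a hypothetical function name.

```python
import math
from collections import Counter


def category_entropy(categories):
    """Shannon entropy (in bits) of the categories of recommended articles.

    0.0 means every article is from one category; higher values mean a more even mix.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Computing this per user and per group would allow the control and treatment groups to be compared on diversity alongside the click and pageview KPIs.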
  27. Thank you