Libra Report: Scaling A/B Tests for Real Services

Kwangyeol Ryu
LINE Data Science Team1 Data Engineer
https://linedevday.linecorp.com/jp/2019/sessions/S1-05

LINE DevDay 2019

November 20, 2019

Transcript

  1. 2019 DevDay Libra Report: Scaling A/B Tests for Real Services

    > Kwangyeol Ryu > LINE Data Science Team1 Data Engineer
  2. Who Am I

    > Kwangyeol Ryu | Data Engineer
    > Building a fast data analytics platform to increase data scientists’ productivity
    > Data Science Team1 | Data Labs, LINE Corporation
  3. Agenda

    > How we do AB Tests at LINE
    > Libra Report
    > Summary
  4. AB Tests in One Slide

    > KDD 2019 Tutorial, Challenges, Best Practices and Pitfalls in Evaluating Results of Online Controlled Experiments
    > Diagram: users come to the experiment and are randomly assigned to two variants, Control and Treatment (New Feature X); data is collected from each; metrics are compared to determine the winner
    > AB tests are the best scientific way to prove causality!
  5. AB Tests at LINE for Real Services

    > Diagram: users coming into the experiment are randomly assigned into two variants, Control and Treatment (New Feature X); data is collected from each; metrics are compared to determine the winner
    > Annotations: hundreds of millions of daily active users; countries, OSs, app versions; lots of new features; large-scale log data; complex metrics
  6. AB Tests at LINE for Real Services

    > Diagram: users coming into the experiment are randomly assigned into two variants, Control and Treatment (New Feature X); data is collected from each; metrics are compared to determine the winner
    > Annotations: hundreds of millions of daily active users; countries, OSs, app versions; lots of new features; large-scale log data; complex metrics
    > Added annotations: human errors; ad-hoc data processing; ad-hoc test dashboard; ad-hoc metrics
  7. Issues From Data Scientists

    > Test conditions were delivered via wikis, emails, Slack channels, etc.
    > The code is hard to reuse and share.
    > Data scientists put a lot of effort into controlling the overall test.
  8. AB Tests Systems at LINE

    > Diagram: users coming into the experiment are randomly assigned into two variants, Control and Treatment (New Feature X); data is collected from each; metrics are compared to determine the winner
    > Annotations: hundreds of millions of daily active users; countries, OSs, app versions; lots of new features; large-scale log data; complex metrics
  9. AB Tests Systems at LINE

    Central Dogma
    > Update user app’s configurations
    Libra
    > Manage the result of AB test design
    Libra Report
    > Manage Test Metrics
    > Generate Test Dashboards

    > https://line.github.io/centraldogma/
    > DevDay 2018, LINE AB Test Standardization with Our Own Toolset
  10. How Libra Report Works

    Architecture diagram; labels in the order extracted from the slide:
    > Click Stream; Service Logs
    > Conditions; Logics
    > Dashboard DB
    > Create Dynamic DAGs
    > API; Dashboard; User
    > DB: AB Test Configurations; Key Metric Definitions (Hive SQL, Rscript)
    > Layer labels: Data; Metadata; Orchestration
  11. Libra Report: Metadata

  12. Libra Report: Orchestration

    Apache Airflow
    > Airflow Dynamic DAG
      • Generated from metadata
    > Manage complex data dependencies
    > Ensure privacy data is filtered out
      • Check user’s privacy policy agreement
    > Calculate various metrics
    > Calculate basic statistics by default
      • p-value, lift, etc.
    > Easy to backfill when logic changes
  13. Libra Report: Dashboard

  14. Summary

    > Using AB Tests Systems, we can do more AB tests!
    > Through AB tests, we can ensure LINE services create user value!
    > Data scientists are expensive to hire. They can do more with proper tools!
  15. Thank You
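
Slide 12 says Libra Report calculates basic statistics such as p-value and lift by default, but the slides do not say which statistical test is used. As a rough sketch only, assuming a conversion-style metric and a two-sided two-proportion z-test (both are assumptions, not taken from the talk), the two numbers could be computed as below. The function names `lift` and `two_proportion_z_test` and the sample counts are hypothetical.

```python
import math


def lift(control_rate: float, treatment_rate: float) -> float:
    """Relative lift of treatment over control, e.g. 0.2 means +20%."""
    if control_rate == 0:
        raise ValueError("control_rate must be non-zero to compute relative lift")
    return (treatment_rate - control_rate) / control_rate


def two_proportion_z_test(conv_c: int, n_c: int, conv_t: int, n_t: int):
    """Two-sided z-test for a difference in conversion rates.

    conv_c / n_c: conversions and users in the control variant.
    conv_t / n_t: conversions and users in the treatment variant.
    Returns (z, p_value). Assumes large samples (normal approximation).
    """
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    if se == 0:
        raise ValueError("standard error is zero; rates are degenerate")
    z = (p_t - p_c) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    z, p = two_proportion_z_test(conv_c=100, n_c=1000, conv_t=130, n_t=1000)
    print(f"lift={lift(100 / 1000, 130 / 1000):.3f} z={z:.3f} p_value={p:.4f}")
```

In a metadata-driven setup like the one the slides describe, a function of this kind would run once per metric and variant pair after the metric values have been aggregated; the choice of test would normally depend on the metric type.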