Best Practices and Pitfalls in Evaluating Results of Online Controlled Experiments

[Diagram: users come to the experiment and are randomly assigned to two variants, Control and Treatment (New Feature X); data is collected from each variant, then metrics are compared to determine the winner.]

AB tests are the best scientific way to prove causality!
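The flow on this slide (random assignment into two variants, then a metric comparison) can be sketched in a few lines. This is a minimal illustration, not the method used at LINE: the hash-based 50/50 split, the two-proportion z-test, and the names `assign_variant` and `two_proportion_z_test` are all assumptions chosen for the example.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically map a user to 'control' or 'treatment' (about 50/50).

    Hashing the experiment name together with the user ID is an assumed
    scheme; it keeps a user's variant stable across visits.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return "treatment" if int(digest, 16) % 2 else "control"


def two_proportion_z_test(conv_c: int, n_c: int, conv_t: int, n_t: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# Made-up counts, for illustration only.
z, p = two_proportion_z_test(conv_c=1000, n_c=10000, conv_t=1100, n_t=10000)
print(assign_variant("user-42", "new_feature_x"), round(z, 2), round(p, 4))
```

A conversion-rate metric is used here only because it keeps the arithmetic short; other metric types would need a different test.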
> Hundreds of millions of daily active users
> Countries, OSs, app versions
> Lots of new features
> Large-scale log data
> Complex metrics

> Human errors
> Ad-hoc data processing
> Ad-hoc test dashboards
> Ad-hoc metrics
app’s configurations
Libra
> Manage the result of AB test design
Libra Report
> Manage Test Metrics
> Generate Test Dashboards
> https://line.github.io/centraldogma/
> DevDay 2018, LINE
AB Test Standardization with Our Own Toolset
[Diagram: system architecture with Data, Metadata, and Orchestration parts. Components: Dashboard DB, API, Dashboard, User, DB.]
> Logics
> Create Dynamic DAGs
> AB Test Configurations
> Key Metric Definitions (Hive SQL, Rscript)
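One way to read "Create Dynamic DAGs" is that the task graph is generated from the stored AB test configurations and key metric definitions rather than written by hand. The slides do not say which orchestrator is used or how its jobs are laid out, so the sketch below uses only the Python standard library; the `EXPERIMENTS` structure, the task names, and the helper functions are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical metadata: one entry per AB test, listing its key metrics.
EXPERIMENTS = [
    {"name": "new_feature_x", "metrics": ["daily_active_users", "messages_sent"]},
]


def build_dag(experiments):
    """Map each task name to the set of tasks it depends on."""
    dag = {}
    for exp in experiments:
        extract = f"{exp['name']}.extract_logs"
        report = f"{exp['name']}.build_dashboard"
        dag[extract] = set()  # no upstream dependencies
        metric_tasks = set()
        for metric in exp["metrics"]:
            task = f"{exp['name']}.compute.{metric}"
            dag[task] = {extract}  # each metric job waits for the log extract
            metric_tasks.add(task)
        dag[report] = metric_tasks  # dashboard waits for every metric job
    return dag


def execution_order(dag):
    """Return one valid run order in which dependencies come first."""
    return list(TopologicalSorter(dag).static_order())


for task in execution_order(build_dag(EXPERIMENTS)):
    print(task)
```

Adding a test or a metric to the metadata changes the generated graph without any change to the pipeline code, which is the point of deriving the DAG from configuration.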
AB tests!
> Through AB tests, we can ensure that LINE services create value for users!
> Data scientists are expensive to hire. They can do more with proper tools!