Being a Data Scientist is not only about learning how to use Hadoop, SciPy, NumPy, etc, but also understanding what conclusions we can drive from the data. One of the most important job of a data scientist is to be able to answer if an idea works or not, and for this , understanding the design and use of randomize control experiments is key
To innovate we need to know what ideas work and what ideas do not work, and In order to do this , we need to establish causality. The Gold standard to prove casualty are randomize control experiment, but this is not enough... we also need to have a culture of experimentation, and in order to have this we first need to understand that when an idea fails is not an error.
Controlled experiments, also called randomized experiments and A/B tests, have had a profound influence on multiple fields, including medicine, agriculture, manufacturing, and advertising. Through randomization and proper design, experiments allow establishing causality scientifically, which is why they are the gold standard in drug tests. In software development, multiple techniques are used to define product requirements; controlled experiments provide a valuable way to assess the impact of new features on customer behavior.(Online Experimentation at Microsoft, Kohavi et al)
Web and the Internet allows a perfect place to run control experiments. We are going to review not only the need of randomize control experiments or A/B Testing , but the design, lessons learned , and the cultural challenges of organizations to become data driven.