Three Key Checklists and Remedies for Trustworthy Analysis of Online Controlled Experiments at Scale

41st International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP '19) | May 2019

Online Controlled Experiments (OCEs) are transforming the decision-making process of data-driven companies into an experimental laboratory. Despite their great power in identifying what customers actually value, OCEs are very sensitive to data loss, skipped checks, wrong designs, and many other ‘hiccups’ in the analysis process. For this reason, experiment analysis has traditionally been done by experienced data analysts and scientists who closely monitor experiments throughout their lifecycle. Depending solely on scarce experts, however, is neither scalable nor bulletproof.

To democratize experimentation, analysis should be streamlined so that it can be performed meticulously by engineers, managers, or others responsible for the development of a product. In this paper, based on the synthesized experience of companies that run thousands of OCEs per year, we examine how experts inspect online experiments. We reveal that most of the experiment analysis happens before OCEs are even started, and we summarize the key analysis steps in three checklists.

The value of the checklists is threefold. First, they can increase the accuracy of the experiment set-up and decision-making processes. Second, they can enable novice data scientists and software engineers to become more autonomous in setting up and analyzing experiments. Finally, they can serve as a base for developing trustworthy platforms and tools for OCE set-up and analysis.