Post Launch Evaluation of Policies in a High-Dimensional Setting

Nassiri, Shima; Bayati, Mohsen; Cooprider, Joe

Abstract:A/B tests, also known as randomized controlled experiments (RCTs), are the gold standard for evaluating the impact of new policies, products, or decisions. However, these tests can be costly in terms of time and resources, potentially exposing users, customers, or other test subjects (units) to inferior options. This paper explores practical considerations in applying methodologies inspired by "synthetic control" as an alternative to traditional A/B testing in settings with very large numbers of units, involving up to hundreds of millions of units, which is common in modern applications such as e-commerce and ride-sharing platforms. This method is particularly valuable in settings where the treatment affects only a subset of units, leaving many units unaffected. In these scenarios, synthetic control methods leverage data from unaffected units to estimate counterfactual outcomes for treated units. After the treatment is implemented, these estimates can be compared to actual outcomes to measure the treatment effect. A key challenge in creating accurate counterfactual outcomes is interpolation bias, a well-documented phenomenon that occurs when control units differ significantly from treated units. To address this, we propose a two-phase approach: first using nearest neighbor matching based on unit covariates to select similar control units, then applying supervised learning methods suitable for high-dimensional data to estimate counterfactual outcomes. Testing using six large-scale experiments demonstrates that this approach successfully improves estimate accuracy. However, our analysis reveals that machine learning bias -- which arises from methods that trade off bias for variance reduction -- can impact results and affect conclusions about treatment effects. We document this bias in large-scale experimental settings and propose effective de-biasing techniques to address this challenge.

Comments:	15 pages, 2 figures, 6 tables
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Applications (stat.AP); Methodology (stat.ME)
Cite as:	arXiv:2501.00119 [stat.ML]
	(or arXiv:2501.00119v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2501.00119

Statistics > Machine Learning

Title:Post Launch Evaluation of Policies in a High-Dimensional Setting

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators