TY - RPRT
AU - Kemper, Jan
AU - Rostam-Afschar, Davud
TI - Earning While Learning: How to Run Batched Bandit Experiments
PY - 2026/Feb/
PB - Institute of Labor Economics (IZA)
CY - Bonn
T2 - IZA Discussion Paper
IS - 18429
UR - https://www.iza.org/publications/dp18429
AB - Researchers typically collect experimental data sequentially, allowing early outcome observations and adaptive treatment assignment to reduce exposure to inferior treatments. This article reviews multi-armed-bandit adaptive experimental designs that balance exploration and exploitation. Because experimental data collected adaptively through bandit algorithms violate standard asymptotics, inference is challenging. We implement an estimator that yields valid heteroskedasticity-robust confidence intervals in batched bandit designs and compare coverage in Monte Carlo simulations. We introduce bbandits for Stata, a tool for designing experiments via simulation, running interactive bandit experiments, and analyzing adaptively collected data. bbandits includes three common assignment algorithms (ε-first, ε-greedy, and Thompson sampling) and supports estimation, inference, and visualization.
KW - randomized controlled trial
KW - causal inference
KW - multi-armed bandits
KW - experimental design
KW - machine learning
ER -