It is standard practice in applied work to rely on linear least squares regression to estimate the effect of a binary variable ("treatment") on some outcome of interest. In this paper I study the interpretation of the regression estimand when treatment effects are in fact heterogeneous.
I show that the coefficient on treatment is identical to the outcome of the following three-step procedure: first, calculate the linear projection of treatment on the vector of other covariates ("propensity score"); second, calculate average partial effects for both groups of interest ("treated" and "controls") from a regression of outcome on treatment, the propensity score, and their interaction; third, calculate a weighted average of these two effects, with weights being inversely related to the unconditional probability that a unit belongs to a given group. Each of these steps is potentially problematic, but this last property – the reliance on implicit weights which are inversely related to the proportion of each group – can have particularly severe consequences for applied work.
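The three-step procedure described above can be checked numerically. The following is a minimal sketch on simulated data, not the paper's own code. The function name `decompose_ols`, the data-generating process, and all variable names are illustrative assumptions. The passage says only that the weights are inversely related to group proportions; the explicit weight formula used here (each group's weight is proportional to the other group's share times the other group's variance of the propensity score) is likewise an assumption, chosen because it makes the identity hold exactly in this sample.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def decompose_ols(y, d, X):
    """Compare the OLS coefficient on d with the three-step weighted average.

    y: outcome (n,), d: binary treatment coded 0/1 (n,), X: covariates (n, k).
    Hypothetical helper written for illustration only.
    """
    n = len(y)
    ones = np.ones(n)

    # Benchmark: coefficient on treatment from regressing y on (1, d, X).
    tau_ols = ols(np.column_stack([ones, d, X]), y)[1]

    # Step 1: linear projection of treatment on the covariates
    # (the "propensity score" in the sense used in the text).
    Z = np.column_stack([ones, X])
    p = Z @ ols(Z, d)

    # Step 2: regress y on treatment, the propensity score, and their
    # interaction, then average the implied effect within each group.
    b = ols(np.column_stack([ones, d, p, d * p]), y)
    ape_treated = b[1] + b[3] * p[d == 1].mean()
    ape_controls = b[1] + b[3] * p[d == 0].mean()

    # Step 3: weighted average of the two effects. ASSUMED weight formula:
    # the weight on each group's effect shrinks as that group's share grows.
    rho = d.mean()          # sample share of treated units
    v1 = p[d == 1].var()    # variance of p among treated (ddof=0)
    v0 = p[d == 0].var()    # variance of p among controls (ddof=0)
    denom = rho * v1 + (1 - rho) * v0
    w_treated = (1 - rho) * v0 / denom
    w_controls = rho * v1 / denom
    tau_weighted = w_treated * ape_treated + w_controls * ape_controls

    return {
        "tau_ols": tau_ols,
        "tau_weighted": tau_weighted,
        "ape_treated": ape_treated,
        "ape_controls": ape_controls,
        "w_treated": w_treated,
        "w_controls": w_controls,
        "share_treated": rho,
    }


# Simulated example with heterogeneous effects (illustrative values only).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
d = (X[:, 0] + rng.normal(size=n) > 1.0).astype(float)  # treated are a minority
effect = 1.0 + X[:, 0]                                   # effect varies with X
y = 1.0 + X @ np.array([0.5, -0.3]) + d * effect + rng.normal(size=n)

result = decompose_ols(y, d, X)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

In this sketch `tau_ols` and `tau_weighted` agree up to floating-point error. Comparing `w_treated` with `share_treated` shows how far the implicit weights depart from the group proportions.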
To illustrate the importance of this result, I perform Monte Carlo simulations and replicate two applied papers: Berger, Easterly, Nunn and Satyanath (2013) on the effects of successful CIA interventions during the Cold War on imports from the US; and Martinez-Bravo (2014) on the effects of appointed officials on village-level electoral results in Indonesia. In both cases some of the conclusions change dramatically after allowing for heterogeneity in effects.