TY  - RPRT
AU  - Colella, Fabrizio
AU  - Lalive, Rafael
AU  - Sakalli, Seyhun Orcan
AU  - Thoenig, Mathias
TI  - Inference with Arbitrary Clustering
PY  - 2019/Aug/
PB  - Institute of Labor Economics (IZA)
CY  - Bonn
T2  - IZA Discussion Paper
IS  - 12584
UR  - https://www.iza.org/publications/dp12584
AB  - Analyses of spatial or network data are now very common. Nevertheless, statistical inference is challenging since unobserved heterogeneity can be correlated across neighboring observational units. We develop an estimator for the variance-covariance matrix (VCV) of OLS and 2SLS that allows for arbitrary dependence of the errors across observations in space or network structure and across time periods. As a proof of concept, we conduct Monte Carlo simulations in a geospatial setting based on U.S. metropolitan areas. Tests based on our estimator of the VCV asymptotically correctly reject the null hypothesis, whereas conventional inference methods, e.g., those without clusters or with clusters based on administrative units, reject the null hypothesis too often. We also provide simulations in a network setting based on the IDEAS structure of coauthorship and real-life data on scientific performance. The Monte Carlo results again show that our estimator yields inference at the correct significance level even in moderately sized samples and that it dominates other commonly used approaches to inference in networks. We provide guidance to the applied researcher with respect to (i) whether or not to include potentially correlated regressors and (ii) the choice of cluster bandwidth. Finally, we provide a companion statistical package (acreg) enabling users to adjust the OLS and 2SLS coefficient's standard errors to account for arbitrary dependence.
KW  - spatial correlation
KW  - cluster
KW  - network data
KW  - clustering
KW  - arbitrary
KW  - geospatial data
KW  - instrumental variables
ER  - 