We study linear regressions in a context where the outcome of interest and some of the covariates are observed in two different datasets that cannot be matched. Traditional approaches obtain point identification by relying, often implicitly, on exclusion restrictions. We show that without such restrictions, coefficients of interest can still be partially identified, with the sharp bounds taking a simple form. We obtain tighter bounds when variables that are observed in both datasets but not included in the regression of interest are available, even if these variables are not subject to specific restrictions.
We develop computationally simple and asymptotically normal estimators of the bounds. Finally, we apply our methodology to estimate racial disparities in patent approval rates and to evaluate the effect of patience and risk-taking on educational performance.