May 2024

IZA DP No. 17036: Using Cross-Survey Imputation to Estimate Poverty for Venezuelan Refugees in Colombia

Household consumption or income surveys do not typically cover refugee populations. In the rare cases where refugees are included, inconsistencies between different data sources can interfere with producing comparable poverty estimates. We test the performance of a recently developed cross-survey imputation method to estimate poverty for a sample of refugees in Colombia, combining household income surveys collected by the Government of Colombia and administrative data collected by the United Nations High Commissioner for Refugees. We find that certain variable transformation methods can help resolve these inconsistencies. Estimation results with our preferred variable standardization method are robust to different imputation methods, including the normal linear regression method, the empirical distribution of the errors method, and the probit and logit methods. We also employ several common machine learning techniques, such as Random Forest, Lasso, Ridge, and elastic net regressions, for robustness checks, but these techniques generally perform worse than the imputation methods that we use. We also find that, for most of the poverty lines, we can reasonably impute poverty rates using an older household income survey and a more recent ProGres dataset. These results provide relevant inputs into designing better surveys and administrative datasets on refugees in various country settings.
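
As an illustration of the normal linear regression imputation idea named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a base survey that records log income together with covariates, and a target dataset that records only the same covariates. The function name, the argument names, and the synthetic data are all hypothetical; the variable standardization step discussed in the abstract is not shown.

```python
import numpy as np


def impute_poverty_rate(X_base, log_y_base, X_target, log_poverty_line,
                        n_sims=500, seed=0):
    """Impute a poverty headcount rate for a target sample without income data.

    Hypothetical helper: fits OLS of log income on covariates in the base
    survey, then simulates log income in the target sample by adding
    normally distributed errors to the fitted values.
    """
    rng = np.random.default_rng(seed)

    # Fit the income model on the base survey (intercept plus covariates).
    A = np.column_stack([np.ones(len(X_base)), X_base])
    beta, *_ = np.linalg.lstsq(A, log_y_base, rcond=None)

    # Residual standard deviation with a degrees-of-freedom correction.
    resid = log_y_base - A @ beta
    sigma = np.sqrt(resid @ resid / (A.shape[0] - A.shape[1]))

    # Predict in the target sample and add simulated normal errors.
    B = np.column_stack([np.ones(len(X_target)), X_target])
    mu = B @ beta
    draws = mu[None, :] + rng.normal(0.0, sigma, size=(n_sims, len(mu)))

    # Poverty rate: share of simulated incomes below the line,
    # averaged over households and simulations.
    return float((draws < log_poverty_line).mean())


# Synthetic data only, for illustration; these are not figures from the paper.
rng = np.random.default_rng(42)
coef = np.array([0.5, -0.3])
X_base = rng.normal(size=(2000, 2))
log_y_base = 1.0 + X_base @ coef + rng.normal(0.0, 0.4, size=2000)
X_target = rng.normal(size=(2000, 2))
log_y_target = 1.0 + X_target @ coef + rng.normal(0.0, 0.4, size=2000)

log_line = 0.8
imputed_rate = impute_poverty_rate(X_base, log_y_base, X_target, log_line)
true_rate = float((log_y_target < log_line).mean())
print(f"imputed: {imputed_rate:.3f}  observed: {true_rate:.3f}")
```

In this sketch the target sample's true income is known only because the data are simulated, which is what allows the imputed rate to be compared with the observed one. In an application of this kind, such a check would require a dataset in which both the covariates and income are recorded.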