June 2024

IZA DP No. 17080: Breastfeeding and Child Development Outcomes across Early Childhood and Adolescence: Doubly Robust Estimation with Machine Learning

Md Mohsan Khudri, Andrew Hussey

Using data from the Panel Study of Income Dynamics, we estimate the impact of breastfeeding initiation and duration on multiple cognitive, health, and behavioral outcomes spanning early childhood through adolescence. To mitigate potential bias from model misspecification, we employ a doubly robust (DR) estimation method, which remains consistent if either the treatment model or the outcome model is correctly specified, while adjusting for selection effects. Our novel approach is to use and evaluate a battery of supervised machine learning (ML) algorithms to improve propensity score (PS) estimates. We demonstrate that the gradient boosting machine (GBM) algorithm removes bias more effectively and produces lower prediction errors than logit and probit models as well as alternative ML algorithms. Across all outcomes, our DR-GBM estimation generally yields lower estimates than OLS, DR, and PS matching using standard and alternative ML algorithms, and even lower than sibling fixed effects estimates. We find that having been breastfed is significantly linked to multiple improved early cognitive outcomes, though the impact diminishes somewhat with age. In contrast, we find mixed evidence regarding the impact of breastfeeding on non-cognitive (health and behavioral) outcomes, with effects being most pronounced in adolescence. Our results also suggest relatively higher cognitive benefits for children of minority mothers and children of mothers with at least some post-high school education, and minimal marginal benefits of breastfeeding duration beyond 12 months for cognitive outcomes and 6 months for non-cognitive outcomes.
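
To illustrate the general idea of doubly robust estimation, the sketch below implements one common DR form, the augmented inverse probability weighting (AIPW) estimator of an average treatment effect. Several points are assumptions made for illustration only: the abstract does not state which DR estimator the paper uses; the data are simulated with a single binary confounder and a true effect of 1.0, not drawn from the Panel Study of Income Dynamics; and the propensity score is estimated with simple stratum frequencies rather than a GBM, to keep the example free of external dependencies. The function names `cell_means`, `aipw_ate`, and `simulate` are hypothetical.

```python
import random
from collections import defaultdict


def cell_means(keys, values):
    """Return the mean of `values` within each distinct key."""
    total, count = defaultdict(float), defaultdict(int)
    for k, v in zip(keys, values):
        total[k] += v
        count[k] += 1
    return {k: total[k] / count[k] for k in total}


def aipw_ate(x, t, y):
    """AIPW estimate of the average treatment effect of t on y.

    Assumes a discrete covariate x, so both nuisance models can be
    fitted as stratum means (a stand-in for logit, probit, or GBM).
    """
    # Treatment model: propensity score e(x) = P(T = 1 | X = x).
    e = cell_means(x, t)
    # Outcome model: m_t(x) = E[Y | T = t, X = x].
    m = cell_means(list(zip(t, x)), y)

    scores = []
    for xi, ti, yi in zip(x, t, y):
        m1, m0, ei = m[(1, xi)], m[(0, xi)], e[xi]
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        scores.append(
            m1 - m0
            + ti * (yi - m1) / ei
            - (1 - ti) * (yi - m0) / (1 - ei)
        )
    return sum(scores) / len(scores)


def simulate(n=5000, seed=0):
    """Simulated data (illustrative only): x raises both t and y."""
    rng = random.Random(seed)
    x, t, y = [], [], []
    for _ in range(n):
        xi = 1 if rng.random() < 0.5 else 0
        ti = 1 if rng.random() < (0.7 if xi else 0.3) else 0
        yi = 1.0 * ti + 2.0 * xi + rng.gauss(0.0, 1.0)  # true effect = 1.0
        x.append(xi)
        t.append(ti)
        y.append(yi)
    return x, t, y


x, t, y = simulate()
treated = [yi for ti, yi in zip(t, y) if ti == 1]
control = [yi for ti, yi in zip(t, y) if ti == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
dr = aipw_ate(x, t, y)
print(f"naive difference in means: {naive:.3f}")
print(f"AIPW (doubly robust) estimate: {dr:.3f}")
```

In this simulated setting the naive difference in means is inflated by selection on `x`, while the AIPW estimate should land near the true effect. Replacing the stratum-frequency propensity score with a flexible classifier's predicted probabilities is where an ML method such as GBM would enter.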