Sani Jamilu, Halane Salad, Ahmed Abdiwali Mohamed, Ahmed Mohamed Mustaf
Department of Demography and Social Statistics, Federal University Birnin Kebbi, Birnin Kebbi, Kebbi State, Nigeria.
Department of Public Health, Ministry of Health, Galmudug, Somalia.
Sci Rep. 2025 Jul 20;15(1):26301. doi: 10.1038/s41598-025-04704-y.
Fertility preferences significantly influence population dynamics and reproductive health outcomes, particularly in low-resource settings such as Somalia, where high fertility rates and limited healthcare infrastructure pose significant challenges. Understanding the determinants of fertility preferences is critical for designing targeted interventions. This study leverages machine learning (ML) algorithms and SHapley Additive exPlanations (SHAP) to identify key predictors of fertility preferences among reproductive-aged women in Somalia. This cross-sectional study used data from the 2020 Somalia Demographic and Health Survey (SDHS), encompassing 8,951 women aged 15-49 years. The outcome variable, fertility preference, was dichotomized as either desire for more children or preference to cease childbearing. Predictor variables included sociodemographic factors such as age, education, parity, wealth, residence, and distance to health facilities. Seven ML algorithms were evaluated for predictive performance, with Random Forest emerging as the optimal model based on accuracy, precision, recall, F1-score, and the Area Under the Receiver Operating Characteristic Curve (AUROC). SHAP was used to interpret the model by quantifying each feature's contribution to its predictions. The SHAP analysis identified the most influential predictors of fertility preferences as age group, region, number of births in the last five years, number of children born, marital status, wealth index, education level, residence, and distance to health facilities. Age group was the most influential feature, followed by region and number of births in the last five years. Women aged 45-49 years and those with higher parity were significantly more likely to prefer no additional children. Distance to health facilities emerged as a critical barrier, with better access associated with a greater likelihood of desiring more children.
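To make the idea of "quantifying feature contributions" concrete, the following is a minimal sketch of the Shapley value definition that SHAP builds on. It is not the study's code: the payoff function `toy_value`, its numbers, and the placeholder feature `dummy` are invented for illustration and are not estimates from the SDHS data. The sketch enumerates every coalition of features, which is only feasible for a handful of features; practical SHAP implementations for tree ensembles rely on faster algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values for a payoff function defined on feature subsets."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        # Average the marginal contribution of f over all coalitions of the others.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_value(coalition):
    """Hypothetical payoff: model output when only these features are 'known'.

    All numbers are made up for illustration.
    """
    score = 0.0
    if "age_group" in coalition:
        score += 0.30
    if "parity" in coalition:
        score += 0.10
    if "age_group" in coalition and "parity" in coalition:
        score += 0.10  # interaction term, shared between the two features
    return score  # "dummy" never changes the payoff


features = ["age_group", "parity", "dummy"]
phi = shapley_values(toy_value, features)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

In this toy setup the interaction term is split equally between the two interacting features, the placeholder feature receives zero, and the contributions sum to the difference between the full-coalition payoff and the empty-coalition payoff. That additivity is what allows SHAP values to be read as a decomposition of a single prediction.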
The Random Forest model outperformed the other six algorithms, achieving an accuracy of 81%, precision of 78%, recall of 85%, F1-score of 82%, and AUROC of 0.89. SHAP analysis provided interpretable insights, highlighting the interplay of sociodemographic factors. This study underscores the potential of ML algorithms and SHAP to advance the understanding of fertility preferences in low-resource settings. By identifying critical sociodemographic determinants, such as age group, region, number of births in the last five years, number of children born, marital status, wealth index, education level, residence, distance to health facilities, and employment status, these findings offer actionable insights to inform evidence-based reproductive health interventions in Somalia. Future research should extend the application of ML to longitudinal data and incorporate additional cultural and psychosocial predictors to improve the robustness and applicability of the model.
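The threshold-based metrics reported above are all functions of a binary confusion matrix. The sketch below shows how they relate. The confusion-matrix counts are hypothetical, chosen only to land near the reported figures, and are not taken from the study. The helper names `classification_metrics` and `f1_from` are likewise illustrative. AUROC is omitted because it depends on the ranking of predicted scores rather than on a single confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (NOT from the study), for illustration only.
m = classification_metrics(tp=85, fp=24, fn=15, tn=81)
for name, value in m.items():
    print(f"{name}: {value:.3f}")

# F1 implied by the rounded precision and recall quoted in the abstract.
f1_implied = f1_from(0.78, 0.85)
print(f"F1 from rounded precision/recall: {f1_implied:.4f}")
```

Plugging the rounded precision (0.78) and recall (0.85) into the harmonic-mean formula gives roughly 0.81, slightly below the reported 82%. A gap of that size is what rounding of the underlying unrounded values could produce, although the abstract does not state how the metrics were averaged.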