Silva Gabriel Ferreira Dos Santos, Wichmann Roberta Moreira, da Silva Junior Francisco Costa, Chiavegatto Filho Alexandre Dias Porto
School of Public Health, University of São Paulo, Av. Dr. Arnaldo, 715 - Cerqueira César, São Paulo, 01246-904, SP, Brazil.
Economics Graduate Program, IDP - Brazilian Institute of Education, Development and Research, Brasilia, DF, Brazil.
Sci Rep. 2025 Jul 7;15(1):24278. doi: 10.1038/s41598-025-04066-5.
Neonatal mortality remains a critical global health challenge, particularly in low- and middle-income countries. Machine learning (ML) algorithms could improve neonatal care by enabling accurate prediction of mortality risk and supporting prevention. This study used the Maternal and Neonatal Health Registry (MNHR) dataset from the National Institutes of Health (NIH), which contains multicenter neonatal data from several countries, to evaluate how well ML predicts neonatal mortality risk. We compared three training approaches: a generalized model applicable across all countries, country-specific models tailored to local healthcare characteristics, and a model derived from the largest single-country dataset. Models were trained on data from 2010 to 2016 and validated on data from 2017 to 2019. The analysis included 575,664 pregnancies and assessed five ML algorithms using key neonatal health indicators recommended by the World Health Organization. The generalized model achieved the highest predictive performance, with an area under the receiver operating characteristic curve (AUC-ROC) of 0.816, highlighting the benefit of drawing on a diverse dataset. Our findings support integrating generalized ML models into healthcare strategies to improve neonatal health outcomes and underscore the importance of data diversity in efforts to reduce neonatal mortality.
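The evaluation design described in the abstract (a temporal split, then scoring by AUC-ROC) can be illustrated with a minimal sketch. This is not the authors' code: the record fields (`year`, `country`, `died`, `risk`), the function names, and the toy values are all hypothetical, and the `risk` scores stand in for the output of any fitted model. The AUC-ROC is computed with the pairwise definition (the probability that a randomly chosen death is scored higher than a randomly chosen survivor), which is quadratic in the sample size and only suitable for small examples.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, object]


def temporal_split(records: List[Record], last_train_year: int = 2016) -> Tuple[List[Record], List[Record]]:
    """Split records by year: up to last_train_year for training, later years for validation."""
    train = [r for r in records if r["year"] <= last_train_year]
    test = [r for r in records if r["year"] > last_train_year]
    return train, test


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC-ROC: share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy records (hypothetical values, not taken from the MNHR dataset).
records = [
    {"year": 2012, "country": "A", "died": 0, "risk": 0.1},
    {"year": 2015, "country": "B", "died": 1, "risk": 0.8},
    {"year": 2017, "country": "A", "died": 1, "risk": 0.9},
    {"year": 2018, "country": "A", "died": 0, "risk": 0.2},
    {"year": 2018, "country": "B", "died": 1, "risk": 0.4},
    {"year": 2019, "country": "B", "died": 0, "risk": 0.6},
]

train, test = temporal_split(records)
test_auc = auc_roc([r["died"] for r in test], [r["risk"] for r in test])
print(f"train={len(train)} test={len(test)} AUC-ROC={test_auc:.3f}")
```

In this sketch a pooled model would be scored on the full validation set, while a country-specific comparison would filter `test` by `country` before calling `auc_roc`.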