Ferro de Mello Maria Eduarda, da Silva Rocha Élisson, Endo Patricia Takako
Programa de Pós-Graduação em Engenharia da Computação, Universidade de Pernambuco, Recife, Pernambuco, Brazil.
BMC Pregnancy Childbirth. 2025 Sep 1;25(1):906. doi: 10.1186/s12884-025-08028-7.
This study evaluates the performance of machine learning models that predict fetal death during pregnancy, comparing different data imputation techniques under different class-balancing scenarios. The models use sociodemographic attributes and maternal health history from a population in the state of Pernambuco, Brazil.
We used a dataset from a social program in Pernambuco, Brazil, covering the period from 2008 to 2022, that includes sociodemographic, prenatal, maternal, and family health history data. We defined two training scenarios, each with its own balancing technique: Random Undersampling (RU scenario) and Hybrid Undersampling 2x (H2X scenario). We explored four tree-based machine learning models and evaluated each on its performance and feature importance.
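As an illustration of the balancing step, the following is a minimal sketch of random undersampling, assuming the majority class is randomly reduced to the size of the minority class. The function name `random_undersample`, the 1:1 target, and the toy data are assumptions made for illustration and are not taken from the study; the abstract does not define how the H2X scenario is built, so it is not reproduced here.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class (assumed 1:1 target)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    n_min = min(counts.values())  # size of the smallest class
    keep = []
    for cls in counts:
        idx = [i for i, y in enumerate(labels) if y == cls]
        if len(idx) > n_min:
            # Sample without replacement from the larger class.
            idx = rng.sample(idx, n_min)
        keep.extend(idx)
    keep.sort()  # preserve the original row order
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data (hypothetical): 8 negative records and 2 positive records.
rows = [[i] for i in range(10)]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0]

rows_bal, labels_bal = random_undersample(rows, labels)
class_counts = Counter(labels_bal)
print(class_counts)
```

Balancing only the training split, and leaving the evaluation split at its natural class ratio, is the usual way to keep reported metrics representative; whether the study did so is not stated in the abstract.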
The models were evaluated using several metrics. In different scenarios, the XGBoost model stood out with 81.06% specificity and the Random Forest model stood out with 67.73% sensitivity. The attributes that most influenced the learning process were first prenatal care, age, education, and interpregnancy interval.
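For readers less familiar with the two metrics reported above, this short sketch computes sensitivity (true positive rate) and specificity (true negative rate) from binary labels. The helper name `sensitivity_specificity` and the example labels are hypothetical and unrelated to the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): share of actual positives detected.
    specificity = TN / (TN + FP): share of actual negatives detected.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical example: 3 positive cases and 5 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.4f}, specificity={spec:.4f}")
```

In a screening setting such as this one, sensitivity reflects how many fetal death cases a model flags, while specificity reflects how many unaffected pregnancies it correctly leaves unflagged, which is why a model can lead on one metric and not the other.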
This application is particularly valuable in the context of social projects, such as those in Brazil, where innovative solutions can contribute to achieving the Sustainable Development Goals (SDGs), offering a unique perspective on the intersection of technology, healthcare, and social impact.