Evaluating how different balancing data techniques impact on prediction of premature birth using machine learning models.

Authors

Silva Anna Beatriz, Rocha Elisson da Silva, Lorenzato João Fausto, Endo Patricia Takako

Affiliation

Universidade de Pernambuco, Pernambuco, Brazil.

Publication

PLoS One. 2025 Apr 2;20(3):e0316574. doi: 10.1371/journal.pone.0316574. eCollection 2025.

DOI:10.1371/journal.pone.0316574
PMID:40173408
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11964454/
Abstract

Premature birth, defined as birth before 37 weeks of gestation, is a significant global health issue and the main cause of neonatal deaths. In this work, we evaluate machine learning models for predicting premature birth using Brazilian sociodemographic and obstetric data, focusing on the challenge of data imbalance, a common problem that can lead to biased predictions. We evaluate five data balancing techniques: Undersampling, Oversampling, and three Hybridsampling configurations in which the minority class was increased by factors of 2, 3, and 4. The machine learning models, including Decision Tree, Random Forest, and AdaBoost, are trained and evaluated on a dataset of over 483,000 cases. With the Hybridsampling approach, the Decision Tree model reached an accuracy of 70%, a recall of 64%, and a precision of 74%. Results show that Hybridsampling techniques significantly improve model performance compared to Undersampling and Oversampling, highlighting the importance of proper data balancing in predictive models for preterm birth. Our work is particularly relevant to the Brazilian Unified Health System (SUS). By improving the accuracy of premature birth predictions, our models could assist healthcare providers in identifying at-risk pregnancies earlier, allowing for timely interventions. Integrating these models into the SUS could enhance maternal and neonatal care, reduce the incidence of preterm births, and potentially decrease neonatal mortality, especially in underserved regions.
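
The three families of balancing techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain random resampling from the Python standard library, the function names are hypothetical, and the abstract states only that Hybridsampling increases the minority class by a factor of 2, 3, or 4. The step that then reduces the majority class to the same size is an assumption made here for illustration.

```python
import random


def split_by_class(rows, labels, minority_label):
    """Separate samples into a minority list and a majority list."""
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    majority = [r for r, y in zip(rows, labels) if y != minority_label]
    return minority, majority


def undersample(minority, majority, rng):
    """Randomly drop majority samples until both classes have equal size."""
    return list(minority), rng.sample(majority, len(minority))


def oversample(minority, majority, rng):
    """Randomly duplicate minority samples until both classes have equal size."""
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return minority + extra, list(majority)


def hybrid_sample(minority, majority, factor, rng):
    """Grow the minority class by `factor`, then shrink the majority to match.

    Assumption: only the growth factor comes from the abstract; matching the
    majority class to the grown minority class is a guess for illustration.
    """
    # Never ask for more majority samples than exist.
    target = min(factor * len(minority), len(majority))
    extra = rng.choices(minority, k=target - len(minority))
    return minority + extra, rng.sample(majority, target)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: 1,000 sample ids, of which the first 100 are the minority class.
    rows = list(range(1000))
    labels = [1 if i < 100 else 0 for i in rows]
    minority, majority = split_by_class(rows, labels, minority_label=1)

    for name, (mi, ma) in {
        "undersampling": undersample(minority, majority, rng),
        "oversampling": oversample(minority, majority, rng),
        "hybrid x2": hybrid_sample(minority, majority, 2, rng),
        "hybrid x3": hybrid_sample(minority, majority, 3, rng),
        "hybrid x4": hybrid_sample(minority, majority, 4, rng),
    }.items():
        print(f"{name}: minority={len(mi)}, majority={len(ma)}")
```

The sketch assumes the majority class is at least as large as the minority class. Resampling of this kind is normally applied to the training split only, so that the held-out test set keeps the original class ratio.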
