Machine learning-based prediction of preterm birth risk using methylation changes in neonatal cord blood CpG sites.

Author information

Feng Yuxin, Ni Ying, Wang Wenkai, Guo Fen, Wang Liyu, Zhu Fan, Zhang Luyao, Feng Ying

Affiliations

Department of Oncology, the Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou, Jiangsu Province, 215000, China.

Gusu School, Nanjing Medical University, Suzhou, Jiangsu Province, 215000, China.

Publication information

BMC Pregnancy Childbirth. 2025 Jul 22;25(1):784. doi: 10.1186/s12884-025-07884-7.

Abstract

BACKGROUND

Preterm birth, defined as delivery before 37 weeks of gestation, is a major cause of neonatal morbidity and mortality. DNA methylation changes at CpG sites have been associated with the risk of preterm birth.

OBJECTIVE

This study aimed to identify differential CpG sites in cord blood and develop predictive machine learning models based on these methylation changes to assess preterm birth risk.

METHODS

Methylome data from 110 neonatal cord blood samples in the GSE110828 dataset (88 for training and 22 for testing) were analyzed to identify CpG sites that differ between preterm and full-term births. Key CpG sites were selected using Lasso, Elastic Net, and Random Forest. Forty-five predictive models were then constructed and evaluated on accuracy, precision, recall, and F1 score.
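The four evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' code: the function name `classification_metrics`, the choice of 1 (preterm) as the positive class, and the example labels are all assumptions made for illustration, and the numbers are not results from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for a binary classifier.

    `positive` marks the class treated as the event of interest
    (assumed here to be 1 = preterm; the abstract does not specify the coding).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for eight samples (1 = preterm, 0 = full-term).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Reporting precision, recall, and F1 alongside accuracy matters when the two classes are unevenly represented, because accuracy alone can look high for a model that rarely predicts the minority class.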

RESULTS

Sixty-six CpG sites showed significant differences between preterm and full-term groups. Four models, including Random Forest with Lasso and Gradient Boosting with Random Forest, achieved optimal predictive performance, each with a validation accuracy of 93.75%.

CONCLUSION

DNA methylation changes at CpG sites in cord blood are associated with preterm birth risk. CpG-based methylation models demonstrate high predictive accuracy and hold promise for early clinical risk assessment.
