Development of machine learning-based models to predict congenital heart disease: A matched case-control study.

Authors

Zhang Shutong, Kang Chenxi, Cui Jing, Xue Haodan, Zhao Shanshan, Chen Yukui, Lu Haixia, Ye Lu, Wang Duolao, Chen Fangyao, Zhao Yaling, Pei Leilei, Qu Pengfei

Affiliations

Department of Epidemiology and Health Statistics, School of Public Health, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, China.

Shaanxi Eye Hospital, Xi'an People's Hospital (Xi'an Fourth Hospital), Xi'an, China.

Publication information

Int J Med Inform. 2025 Mar;195:105741. doi: 10.1016/j.ijmedinf.2024.105741. Epub 2024 Dec 2.

Abstract

BACKGROUND

Current congenital heart disease (CHD) prediction tools lack adequate interpretability and convenience, which hinders the development of personalized CHD management strategies. We developed a machine learning-based risk stratification model for CHD prediction.

METHODS

This study used data from 1,759 participants in a case-control study of CHD conducted at six birth defects surveillance hospitals in Xi'an, Shaanxi Province, Northwest China, from January 2014 to December 2016. The data were partitioned into training and testing datasets at a ratio of 7:3. Predictors were selected from 47 input variables using the Least Absolute Shrinkage and Selection Operator (LASSO). Five machine learning algorithms were used to build the CHD risk prediction models. Model performance was assessed with several metrics, including the area under the receiver operating characteristic curve (AUROC), F1 score, and Brier score. Permutation feature importance was used to interpret the prediction model. The best-performing model was used to construct the risk scores.
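The three metrics named above can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the 0.5 classification threshold for the F1 score, and the toy labels and probabilities are all assumptions made for illustration.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUROC as the probability that a random case scores higher than a
    random control, counting ties as one half."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def f1_score(y_true: Sequence[int], y_prob: Sequence[float],
             threshold: float = 0.5) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) at an assumed probability threshold."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, pred) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(y_true, pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, pred) if y == 1 and q == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome
    (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy data (hypothetical): 1 = CHD case, 0 = control.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_probs = [0.9, 0.6, 0.4, 0.2, 0.3, 0.7]

print(auroc(toy_labels, toy_probs))
print(f1_score(toy_labels, toy_probs))
print(brier_score(toy_labels, toy_probs))
```

AUROC measures discrimination and is threshold-free, the F1 score depends on the chosen threshold, and the Brier score also reflects calibration, which is one reason to report them together.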

RESULTS

The eXtreme Gradient Boosting (XGB) model performed best among the CHD prediction models, achieving an AUROC of 0.772 (95% CI 0.728, 0.817) in the testing dataset and 0.738 (0.699, 0.775) in the external validation dataset. The top three predictors identified by the model were living in a rural area, a low wealth index, and folic acid supplementation for fewer than 90 days. The resulting risk score showed good calibration. Using the risk scores, participants were stratified into low-, moderate-, and high-risk categories, which differed substantially in CHD risk.
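A minimal sketch of the stratification step, assuming two fixed cutoffs on a predicted risk score. The abstract does not state how the category boundaries were defined, so the cutoffs (0.3 and 0.7), the function names, and the toy data below are hypothetical.

```python
from typing import Dict, List, Sequence


def stratify(scores: Sequence[float], low_cut: float,
             high_cut: float) -> List[str]:
    """Assign each score to 'low', 'moderate', or 'high' using two cutoffs."""
    if low_cut >= high_cut:
        raise ValueError("low_cut must be smaller than high_cut")
    groups = []
    for s in scores:
        if s < low_cut:
            groups.append("low")
        elif s < high_cut:
            groups.append("moderate")
        else:
            groups.append("high")
    return groups


def observed_rate(y_true: Sequence[int],
                  groups: Sequence[str]) -> Dict[str, float]:
    """Observed proportion of cases within each risk category."""
    totals: Dict[str, int] = {}
    cases: Dict[str, int] = {}
    for y, g in zip(y_true, groups):
        totals[g] = totals.get(g, 0) + 1
        cases[g] = cases.get(g, 0) + y
    return {g: cases[g] / totals[g] for g in totals}


# Toy data (hypothetical): risk scores and outcomes (1 = CHD case).
toy_scores = [0.1, 0.2, 0.4, 0.5, 0.8, 0.9]
toy_outcomes = [0, 0, 0, 1, 1, 1]

toy_groups = stratify(toy_scores, low_cut=0.3, high_cut=0.7)
print(toy_groups)   # ['low', 'low', 'moderate', 'moderate', 'high', 'high']
print(observed_rate(toy_outcomes, toy_groups))
# {'low': 0.0, 'moderate': 0.5, 'high': 1.0}
```

Comparing the observed case proportion across categories is one simple way to check that the categories separate risk. In a matched case-control sample these proportions reflect the sampling design rather than population prevalence.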

CONCLUSION

This study demonstrates the feasibility and effectiveness of a machine learning-based approach to CHD prediction. The risk scores showed potential for identifying pregnant women at high risk of fetal CHD, offering valuable insights for guiding primary prevention and CHD management.
