SGO enhanced random forest and extreme gradient boosting framework for heart disease prediction.

Author information

Naik Anima, Tejani Ghanshyam G, Mousavirad Seyed Jalaleddin

Affiliations

Department of CSE, Raghu Engineering College, Visakhapatnam, Andhra Pradesh, 530003, India.

Department of Research Analytics, Saveetha Dental College and Hospitals, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, 600077, India.

Publication information

Sci Rep. 2025 May 25;15(1):18145. doi: 10.1038/s41598-025-02525-7.

Abstract

Cardiovascular disease (CVD) remains a leading global health concern, accounting for approximately 31.5% of deaths worldwide. According to the World Health Organization (WHO), over 20.5 million people succumb to CVD each year, a figure projected to rise to 24.2 million by 2030. Early diagnosis is critical and can be facilitated by monitoring key risk factors such as cholesterol levels, blood pressure, diabetes, and obesity. This study proposes a heart disease prediction (HDP) model employing Random Forest (RF) and eXtreme Gradient Boosting (XGB) classifiers. Both models are further optimized through hyperparameter tuning using the Social Group Optimization (SGO) algorithm. The model was developed and validated using the Cleveland and Statlog datasets from the UCI repository. Pre-optimization results for RF yielded an accuracy (Acc.) of 84% and a ROC-AUC score of 92.03% on the Cleveland dataset, and 88.09% Acc. with a ROC-AUC of 97.50% on Statlog. The XGB classifier achieved 81.97% Acc. and a ROC-AUC of 90.73% on Cleveland, and 92.86% Acc. with a ROC-AUC of 96.14% on Statlog. After SGO-based optimization, RF improved to 95.08% Acc. and 95.26% ROC-AUC on Cleveland, and 95.24% Acc. with 98.18% ROC-AUC on Statlog. Similarly, the optimized XGB classifier reached 93.44% Acc. and 95.24% ROC-AUC on Cleveland, and 97.62% Acc. with 97.50% ROC-AUC on Statlog. These results highlight the effectiveness of SGO in enhancing machine learning (ML) performance for medical prediction problems. However, the study has certain limitations. The evaluation was conducted solely on two benchmark datasets, which may not fully reflect the diversity and complexity of real-world clinical populations. Furthermore, external validation using independent or real-time clinical data was not performed, which may limit the generalizability of the results. The computational cost associated with SGO optimization was also not assessed. Future research should focus on validating the model across broader datasets, assessing real-world applicability, and analyzing computational efficiency to ensure scalability and clinical adoption.
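
The abstract does not spell out how SGO searches the hyperparameter space, so the following is only a minimal sketch of the algorithm as it is commonly described: an improving phase that pulls each candidate toward the current best, followed by an acquiring phase in which each candidate learns from a randomly chosen peer. It is not the authors' implementation. The function name `sgo_minimize`, the self-introspection value `c=0.2`, the population size, the iteration count, and the toy objective are all assumptions made for illustration. In a setting like the one described, the objective would presumably decode the vector into RF or XGB hyperparameters and return a cross-validated error; here a simple quadratic stands in so the sketch runs with the standard library alone.

```python
import random


def sgo_minimize(objective, bounds, pop_size=20, iterations=50, c=0.2, seed=0):
    """Minimize `objective` over box `bounds` with a basic SGO loop (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        # Keep every coordinate inside its [low, high] interval.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    def best_index(fit):
        return min(range(len(fit)), key=fit.__getitem__)

    # Random initial population and its fitness values.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]

    for _ in range(iterations):
        # Improving phase: each person moves toward the best person so far.
        best = pop[best_index(fit)][:]
        for i in range(pop_size):
            cand = clip([c * pop[i][j] + rng.random() * (best[j] - pop[i][j])
                         for j in range(dim)])
            f = objective(cand)
            if f < fit[i]:  # greedy acceptance: keep only improvements
                pop[i], fit[i] = cand, f

        # Acquiring phase: each person interacts with a random peer and the best.
        best = pop[best_index(fit)][:]
        for i in range(pop_size):
            r = rng.randrange(pop_size - 1)
            if r >= i:
                r += 1  # guarantees r != i
            # Move away from a worse peer, toward a better one.
            sign = 1.0 if fit[i] < fit[r] else -1.0
            cand = clip([pop[i][j]
                         + sign * rng.random() * (pop[i][j] - pop[r][j])
                         + rng.random() * (best[j] - pop[i][j])
                         for j in range(dim)])
            f = objective(cand)
            if f < fit[i]:
                pop[i], fit[i] = cand, f

    k = best_index(fit)
    return pop[k], fit[k]


# Toy stand-in for a validation-error surface (hypothetical, not from the study).
def toy_objective(x):
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


BOUNDS = [(-10.0, 10.0), (-10.0, 10.0)]
best_x, best_f = sgo_minimize(toy_objective, BOUNDS)
print(best_x, best_f)
```

Because candidates are accepted only when they improve, the best fitness never gets worse from one iteration to the next. Integer-valued hyperparameters such as a tree count would need rounding inside the objective, which this sketch leaves out.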

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c180/12104338/7cddaa18e0dc/41598_2025_2525_Figa_HTML.jpg
