A hybridization of XGBoost machine learning model by Optuna hyperparameter tuning suite for cardiovascular disease classification with significant effect of outliers and heterogeneous training datasets.

Authors

Dhanka Sanjay, Maini Surita

Affiliation

Department of Electrical and Instrumentation Engineering, Sant Longowal Institute of Engineering and Technology, Longowal, Sangrur, Punjab, India.

Publication information

Int J Cardiol. 2025 Feb 1;420:132757. doi: 10.1016/j.ijcard.2024.132757. Epub 2024 Nov 28.

Abstract

BACKGROUND

Over the last few decades, heart disease (HD) has emerged as one of the deadliest diseases in the world, accounting for roughly 31 % of all deaths each year. Diagnosing HD at an early stage is a cognitively challenging task because medical datasets are large and complex. Many tests, such as electrocardiography (ECG), are available for diagnosing HD, but accurate diagnosis remains a major challenge.

METHODS

Motivated by these challenges and the significance of HD, the authors developed a novel hybrid XGBoost classifier framework for HD prediction that combines outlier removal with hyperparameter optimization. Outliers were handled using the z-score and interquartile range (IQR) methods, and hyperparameters were tuned with the Optuna framework. The impact of different train-test ratios (70:30, 80:20, and 90:10) on model performance was also evaluated on the Cleveland HD dataset, both with and without outliers.
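The two outlier rules named above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' code: the abstract does not state the cutoffs, so the z-score threshold of 3.0 and the IQR multiplier of 1.5 are conventional defaults assumed here, and the function names and the "chol" sample values are hypothetical.

```python
import statistics


def zscore_mask(values, threshold=3.0):
    """Return True for values whose |z-score| is within the threshold.

    threshold=3.0 is an assumed convention, not taken from the abstract.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # All values identical: nothing can be an outlier.
        return [True] * len(values)
    return [abs((v - mean) / sd) <= threshold for v in values]


def iqr_mask(values, k=1.5):
    """Return True for values inside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the usual Tukey fence, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [low <= v <= high for v in values]


def drop_outlier_rows(rows, column, mask_fn):
    """Keep only the rows whose value in `column` passes `mask_fn`."""
    mask = mask_fn([row[column] for row in rows])
    return [row for row, keep in zip(rows, mask) if keep]


# Illustrative values only; these are not records from the Cleveland dataset.
rows = [{"chol": v} for v in [120, 130, 125, 140, 135, 128, 132, 400]]
cleaned = drop_outlier_rows(rows, "chol", iqr_mask)
```

On very small samples the z-score rule can fail to flag even an extreme value, because the outlier itself inflates the standard deviation; that is one reason to compare it against the IQR rule, as the study does.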

RESULTS

The proposed hybrid model performed best with outliers removed and a 90:10 train-test ratio, achieving an accuracy of 95.45 %, sensitivity of 92.86 %, precision of 100 %, specificity of 100 %, and F1-score of 96.3 %, with a training time of 0.8 × 10 s and a testing time of 0.1 × 10 s. The result was validated with stratified K-fold cross-validation.
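As a consistency check, the reported percentages can be reproduced from a single confusion matrix. The counts below (13 true positives, 1 false negative, 8 true negatives, 0 false positives) are inferred from the percentages and are an assumption; the abstract does not report the confusion matrix or the test-set size, and larger multiples of these counts would fit equally well.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall on the positive class
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Inferred counts (assumption): 22 test samples, one missed positive case.
metrics = classification_metrics(tp=13, fp=0, tn=8, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f} %")
```

With these counts, accuracy is 21/22, sensitivity is 13/14, and the F1-score is 26/27, which round to the figures quoted above.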

CONCLUSIONS

This study highlights the importance of data preprocessing, appropriate train-test ratios, and hyperparameter optimization in HD prediction. The proposed framework provides a promising solution for accurate and efficient HD diagnosis, offering potential benefits for cardiac patient healthcare and decision-making.
