Dhanka Sanjay, Maini Surita
Department of Electrical and Instrumentation Engineering, Sant Longowal Institute of Engineering and Technology, Longowal, Sangrur, Punjab, India.
Int J Cardiol. 2025 Feb 1;420:132757. doi: 10.1016/j.ijcard.2024.132757. Epub 2024 Nov 28.
Over the last few decades, heart disease (HD) has emerged as one of the deadliest diseases in the world, accounting for roughly 31 % of deaths each year. Diagnosing HD at an early stage is a cognitively challenging task because the available medical datasets are vast and complex. Many diagnostic tests are available, such as the ECG, but accurate diagnosis of the disease remains a major challenge.
Motivated by these challenges and the significance of HD, the authors developed a novel hybrid XGBoost classifier framework for HD prediction that incorporates outlier removal and hyperparameter optimization. In this approach, outliers were handled using the z-score and interquartile range (IQR) methods, and hyperparameters were optimized using the "Optuna" framework. Additionally, the impact of different train-test ratios (70:30, 80:20, and 90:10) on model performance was evaluated on the Cleveland HD dataset, both with and without outliers.
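The two outlier rules named above can be illustrated with a minimal sketch. It is not the authors' implementation: the thresholds (|z| ≤ 3, 1.5 × IQR), the inclusive quartile method, and the helper names `zscore_mask`, `iqr_mask`, and `drop_outliers` are assumptions chosen for illustration, since the abstract does not state them.

```python
import statistics


def zscore_mask(values, threshold=3.0):
    """Return True for each value whose |z-score| is within the threshold.

    The threshold of 3.0 is a common convention, assumed here.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return [True] * len(values)  # constant column: nothing to flag
    return [abs((v - mean) / std) <= threshold for v in values]


def iqr_mask(values, k=1.5):
    """Return True for each value inside [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the usual Tukey fence, assumed here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [low <= v <= high for v in values]


def drop_outliers(rows, column, mask_fn):
    """Keep only the rows whose value in `column` passes `mask_fn`."""
    mask = mask_fn([row[column] for row in rows])
    return [row for row, keep in zip(rows, mask) if keep]


# Hypothetical cholesterol readings; the last one is an obvious outlier.
chol = [200, 210, 220, 230, 240, 250, 564]
rows = [{"chol": c} for c in chol]
cleaned = drop_outliers(rows, "chol", iqr_mask)
```

On a sample this small, the extreme value inflates the standard deviation enough that a z-score threshold of 3 would not flag it, while the IQR fence does. Behaviour like this may be one reason to apply both rules, though the abstract does not say how the two were combined.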
The proposed hybrid model achieved its best performance without outliers at a 90:10 train-test ratio, with an accuracy of 95.45 %, sensitivity of 92.86 %, precision of 100 %, specificity of 100 %, F1-score of 96.3 %, training time of 0.8 × 10 s and testing time of 0.1 × 10 s. The results were validated by stratified k-fold cross-validation.
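As a worked check of how these metrics relate, the sketch below computes them from a confusion matrix. The counts used (13 true positives, 0 false positives, 8 true negatives, 1 false negative, 22 test samples in total) are an inference that happens to reproduce the reported percentages. The abstract reports neither the confusion matrix nor the test-set size, so treat them as illustrative only.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    Assumes every denominator is non-zero (no guard for empty classes).
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, chosen because they match the reported figures.
metrics = classification_metrics(tp=13, fp=0, tn=8, fn=1)
percentages = {name: round(value * 100, 2) for name, value in metrics.items()}
```

Under these assumed counts a single misclassified case separates the model from a perfect score, which suggests how sensitive metrics on a 10 % hold-out of a small dataset can be. That sensitivity is a plausible motivation for the stratified k-fold validation mentioned above.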
This study highlights the importance of data preprocessing, appropriate train-test ratios, and hyperparameter optimization in HD prediction. The proposed framework provides a promising solution for accurate and efficient HD diagnosis, offering potential benefits for cardiac patient healthcare and decision-making.