Alsabhan Waleed, Alfadhly Abdullah
Department of Software Engineering, College of Engineering, Alfaisal University, Riyadh, Saudi Arabia.
King Abdulaziz City for Science and Technology (KACST), Riyadh, Saudi Arabia.
Sci Rep. 2025 Jul 8;15(1):24568. doi: 10.1038/s41598-025-09423-y.
Accurate diagnosis of heart disease remains a significant challenge in medicine and calls for advanced diagnostic tools and methods. This article examines how well different machine learning (ML) and deep learning (DL) models predict heart disease from a tabular dataset, framed as a binary classification task. A range of preprocessing techniques is examined to improve the quality and performance of the predictive models. The study employs a wide range of ML algorithms, including Logistic Regression (LR), Naive Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), K-Nearest Neighbors (KNN), AdaBoost (AB), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LGBM), CatBoost (CB), Linear Discriminant Analysis (LDA), and Artificial Neural Network (ANN), and assesses their predictive performance in heart disease detection. Through extensive experimentation, it evaluates how three feature scaling techniques (standardization, min-max scaling, and normalization) affect the performance of these models. Performance is measured by accuracy (Acc), precision (Pre), recall (Rec), F1 score (F1), Area Under the Curve (AUC), Cohen's Kappa (CK), and log loss. The results identify the best-performing scaling methods and ML models for predicting heart disease and offer perspective on the practical implications of deploying these models in a healthcare setting. The research aims to contribute to cardiology by using predictive analytics to improve early detection and diagnosis of heart disease, which is critical for coordinating treatment and ensuring timely intervention.
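To make the scaling techniques and evaluation metrics named above concrete, the following is a minimal pure-Python sketch, not the authors' implementation. All function names are hypothetical. It assumes that "normalization" means row-wise L2 normalization, that standardization uses the population standard deviation, and that predicted probabilities are thresholded at 0.5; the abstract does not specify any of these choices.

```python
import math

def standardize(column):
    """Z-score scaling of one feature column (population std assumed)."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / std for x in column]  # no guard for constant columns

def minmax_scale(column):
    """Rescale one feature column linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]  # no guard for hi == lo

def l2_normalize(row):
    """Scale one sample (row) to unit Euclidean length (assumed meaning)."""
    norm = math.sqrt(sum(x * x for x in row))
    return [x / norm for x in row]  # no guard for all-zero rows

def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute the abstract's seven metrics for binary labels in {0, 1}."""
    n = len(y_true)
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    acc = (tp + tn) / n
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * pre * rec / (pre + rec)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (acc - p_chance) / (1 - p_chance)

    # Log loss, with probabilities clipped away from 0 and 1.
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    logloss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / n

    # AUC as the share of (positive, negative) pairs ranked correctly;
    # ties count as half.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))

    return {"Acc": acc, "Pre": pre, "Rec": rec, "F1": f1,
            "AUC": auc, "CK": kappa, "Logloss": logloss}
```

In practice the two column-wise scalers would be fitted on the training split only and then applied to the test split, so that test statistics do not leak into preprocessing. The sketch omits guards against division by zero, which a real pipeline would need.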