Precision healthcare: A deep dive into machine learning algorithms and feature selection strategies for accurate heart disease prediction.

Author information

Islam Md Ariful, Majumder Md Ziaul Hasan, Miah Md Sohel, Jannaty Sumaia

Affiliations

Department of Robotics and Mechatronics Engineering, University of Dhaka, Dhaka, 1000, Bangladesh.

Institute of Electronics, Bangladesh Atomic Energy Commission, Dhaka, Bangladesh.

Publication information

Comput Biol Med. 2024 Jun;176:108432. doi: 10.1016/j.compbiomed.2024.108432. Epub 2024 May 10.

Abstract

This paper surveys machine learning algorithms (MLAs) and feature selection techniques for accurate heart disease prediction (HDP). Working with several datasets that pose different challenges, the study identifies strategies suited to early detection. The MLAs studied include Decision Trees (DT), Random Forests (RF), Support Vector Machines (SVM), and Gaussian Naive Bayes (NB), among others, with precision and recall emphasized as evaluation metrics. Real-world data issues are handled through data cleaning and one-hot encoding. Three techniques, Recursive Feature Elimination (RFE), Principal Component Analysis (PCA), and univariate feature selection, are used to identify relevant features and reduce data dimensionality, and the findings show their effect on prediction accuracy. Models were optimized for each dataset through grid search hyperparameter tuning, and the resulting configurations are reported. Notably, 99.12% accuracy was achieved on the first Kaggle dataset. The study also examines model robustness across datasets, cautions against overfitting, stresses the need for validation on unseen data, and encourages further research on generalizability. Intended as a practical guide, the work supports researchers and practitioners developing HDP models that may inform clinical decisions and healthcare resource allocation, and so contribute to efforts to reduce heart disease-related morbidity and mortality.
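The workflow the abstract describes (one-hot encoding, RFE, grid search, precision and recall on held-out data) can be sketched with scikit-learn. This is a minimal illustration and not the authors' code: the column names, the synthetic data and its labeling rule, the choice of a decision tree as the RFE ranking estimator, and the parameter grid are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (hypothetical columns); NOT a dataset from the paper.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "age": rng.integers(30, 80, n),
    "chol": rng.normal(240, 40, n),
    "max_hr": rng.normal(150, 20, n),
    "chest_pain": rng.choice(["typical", "atypical", "none"], n),
})
# Hypothetical noisy labeling rule, used only so the example has signal.
risk = (0.04 * (X["age"] - 55) + 0.01 * (X["chol"] - 240)
        - 0.02 * (X["max_hr"] - 150) + 1.0 * (X["chest_pain"] == "typical"))
y = (risk + rng.normal(0, 0.5, n) > 0).astype(int)

# One-hot encode the categorical column; pass numeric columns through.
encode = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["chest_pain"])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)

pipe = Pipeline([
    ("encode", encode),
    # RFE repeatedly drops the least important feature of the ranking model.
    ("select", RFE(DecisionTreeClassifier(random_state=0))),
    ("clf", RandomForestClassifier(n_estimators=50, random_state=0)),
])

# Small illustrative grid; the paper's actual grids are not reproduced here.
param_grid = {
    "select__n_features_to_select": [3, 5],
    "clf__max_depth": [3, None],
}

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Tune on the training split only, so the test split stays unseen.
search = GridSearchCV(pipe, param_grid, scoring="recall", cv=5)
search.fit(X_train, y_train)

y_pred = search.predict(X_test)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
print("best params:", search.best_params_)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Keeping the encoder and the selector inside the `Pipeline` means both are refit within each cross-validation fold, which avoids leaking information from validation folds into feature selection, one common source of the overfitting the abstract warns about.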
