Department of Computer Science and Engineering, Indian Institute of Technology (Indian School of Mines), Dhanbad, Jharkhand, 826004, India.
Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Hyderabad, 500075, India.
Sci Rep. 2024 Apr 3;14(1):7833. doi: 10.1038/s41598-024-56931-4.
Heart disease is a leading cause of death worldwide and a major public health problem. Early identification of heart disease can save many lives, yet recognizing cardiovascular illnesses such as heart attacks and coronary artery disease from routine clinical data remains a major challenge. Machine learning (ML) can support accurate prediction and decision-making. The vast amounts of data generated by the health sector (big data) can assist diagnostic models by revealing hidden information and intricate patterns. This paper describes a big data analysis and visualization approach for heart disease detection based on a hybrid deep learning algorithm. The proposed approach is intended for use with big data systems such as Apache Hadoop. A large medical dataset is first subjected to an improved k-means clustering (IKC) method to remove outliers, and the class distribution of the remaining data is then balanced using the synthetic minority over-sampling technique (SMOTE). Next, recursive feature elimination (RFE) determines which features are most important, and the disease is predicted using a bio-inspired hybrid mutation-based swarm intelligence (HMSI) method with an attention-based gated recurrent unit network (AttGRU) model. In our implementation, we compare the proposed model with four machine learning algorithms: SAE + ANN (sparse autoencoder + artificial neural network), LR (logistic regression), KNN (k-nearest neighbours), and naïve Bayes. The experimental results indicate that the proposed hybrid model attains an accuracy of 95.42% for heart disease prediction, which outperforms the related work and addresses the research gap identified there.
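To make the class-balancing step concrete, the following is a minimal sketch of the SMOTE idea in plain Python: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. The function names `smote` and `balance`, the parameters `k` and `seed`, and the toy data are illustrative assumptions and not the authors' implementation. The sketch also assumes binary labels and numeric feature tuples.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def balance(features, labels, k=3, seed=0):
    """Oversample the minority class until both classes have equal size.
    Assumes exactly two classes (hypothetical helper for this sketch)."""
    (_, n_major), (minor, n_minor) = Counter(labels).most_common(2)
    minority = [f for f, l in zip(features, labels) if l == minor]
    synthetic = smote(minority, n_major - n_minor, k=k, seed=seed)
    return features + synthetic, labels + [minor] * len(synthetic)


if __name__ == "__main__":
    # Toy data: 6 "healthy" records (label 0) and 3 "disease" records (label 1).
    X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4), (0.4, 0.1),
         (5.0, 5.0), (5.5, 5.2), (5.1, 5.6)]
    y = [0, 0, 0, 0, 0, 0, 1, 1, 1]
    X_bal, y_bal = balance(X, y, k=2)
    print(Counter(y_bal))
```

Because every synthetic point lies between two existing minority samples, the new samples stay inside the region already occupied by the minority class. In practice a maintained library implementation would normally be preferred over a hand-written one like this.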