A Hybrid Generic Framework for Heart Problem Diagnosis Based on a Machine Learning Paradigm.
Affiliations
Information Systems Department, College of Computer and Information Science, King Saud University, Riyadh 11543, Saudi Arabia.
Department of Informatics, Modeling, Electronics, and Systems, University of Calabria, 87036 Rende, Italy.
Publication information
Sensors (Basel). 2023 Jan 26;23(3):1392. doi: 10.3390/s23031392.
Early, accurate prediction of heart problems can reduce threats to life and save lives, whereas a missed prediction or a false diagnosis can be fatal. Building a machine learning model for identifying heart problems from a single dataset is not practical, because each country and hospital has its own data schema, structure, and quality. On this basis, a generic framework has been built for heart problem diagnosis. It is a hybrid framework that employs multiple machine learning and deep learning techniques and selects the best outcome through a novel voting technique intended to remove bias from the model. The framework contains two consecutive layers. The first layer runs several machine learning models in parallel over a given dataset. The second layer consolidates the outputs of the first layer and acts as a second classification layer based on the novel voting technique. Prior to classification, the framework selects the top features using a proposed feature selection framework: it filters the columns with multiple feature selection methods and keeps the top features that those methods select in common. The proposed framework reaches 95.6% accuracy, outperforming a single machine learning model, the classical stacking technique, and the traditional voting technique. The main contribution of this work is to demonstrate how the prediction probabilities of multiple models can be exploited to create another layer that produces the final output, a step intended to neutralize the bias of any individual model. Another experimental contribution is showing that the complete pipeline can be retrained and used on other datasets collected using different measurements and with different distributions.
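The two-layer idea described in the abstract can be illustrated with a minimal Python sketch using scikit-learn. It is not the authors' implementation: the abstract does not specify the novel voting technique, the base models, or the feature selection methods, so everything concrete here is an assumption. The synthetic data, the three scoring methods, the four first-layer models, the logistic-regression second layer (a stand-in for the voting layer, which makes this closer to plain stacking on probabilities), and the helper names `common_top_features`, `fit_two_layer`, and `predict_two_layer` are all hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier


def common_top_features(X, y, k):
    """Return column indices ranked in the top k by every scoring method."""
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    scores = [
        f_classif(X, y)[0],                         # ANOVA F-statistic
        mutual_info_classif(X, y, random_state=0),  # mutual information
        forest.feature_importances_,                # tree-based importance
    ]
    top_sets = [set(np.argsort(s)[-k:].tolist()) for s in scores]
    common = set.intersection(*top_sets)
    # Fall back to the first method's picks if the methods share nothing.
    return sorted(common or top_sets[0])


def fit_two_layer(X, y):
    """Fit the first-layer models and a second layer on their probabilities."""
    base_models = [  # assumed model choice, not taken from the paper
        LogisticRegression(max_iter=1000),
        RandomForestClassifier(n_estimators=100, random_state=0),
        GaussianNB(),
        KNeighborsClassifier(n_neighbors=5),
    ]
    # Out-of-fold probabilities: the second layer never trains on a
    # prediction made for a row that the first-layer model was fitted on.
    meta_X = np.column_stack([
        cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
        for m in base_models
    ])
    meta_model = LogisticRegression().fit(meta_X, y)  # stand-in second layer
    for m in base_models:
        m.fit(X, y)  # refit each first-layer model on all training rows
    return base_models, meta_model


def predict_two_layer(base_models, meta_model, X):
    """Feed first-layer probabilities into the second layer."""
    meta_X = np.column_stack([m.predict_proba(X)[:, 1] for m in base_models])
    return meta_model.predict(meta_X)


# Synthetic binary data standing in for a heart-disease table (assumption).
X, y = make_classification(n_samples=400, n_features=10, n_informative=4,
                           n_redundant=0, n_clusters_per_class=1,
                           class_sep=1.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

selected = common_top_features(X_train, y_train, k=5)
base_models, meta_model = fit_two_layer(X_train[:, selected], y_train)
y_pred = predict_two_layer(base_models, meta_model, X_test[:, selected])
accuracy = float(np.mean(y_pred == y_test))
print(f"selected features: {selected}, accuracy: {accuracy:.3f}")
```

Retraining on a different dataset, as the abstract describes, would amount to calling `common_top_features` and `fit_two_layer` again on the new table. The accuracy printed here comes from synthetic data and says nothing about the 95.6% figure reported in the paper.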