Imbalanced ECG signal-based heart disease classification using ensemble machine learning technique.
Author information
Rath Adyasha, Mishra Debahuti, Panda Ganapati
Affiliations
Department of Computer Science and Engineering, Siksha O Anusandhan (Deemed to be) University, Bhubaneswar, Odisha, India.
Department of Electronics and Tele Communication, C. V. Raman Global University, Bhubaneswar, Odisha, India.
Publication information
Front Big Data. 2022 Oct 10;5:1021518. doi: 10.3389/fdata.2022.1021518. eCollection 2022.
Machine learning (ML)-based classification models are widely used for the automated detection of heart diseases (HDs) from physiological signals such as electrocardiogram (ECG), magnetocardiography (MCG), heart sound (HS), and impedance cardiography (ICG) signals. Of these, ECG-based HD identification is the approach most commonly used by clinicians. In the current investigation, the ECG records of subjects are sampled and used as inputs to the classification model to distinguish between normal and abnormal patients. The study trains the classification models on an imbalanced number of ECG samples. A few ML methods that have rarely been used for HD detection were selected: support vector machine (SVM), logistic regression (LR), and adaptive boosting (AdaBoost). The performance of the developed models was evaluated in terms of accuracy, F1-score, and area under the curve (AUC) using the ECG signals of subjects in two publicly available datasets (PTB-ECG and MIT-BIH). The models were ranked on these performance metrics, and the AdaBoost and LR classifiers placed first and second, respectively. These two models were then combined into an ensemble based on the majority voting principle, and the performance of the ensemble model was also measured. Overall, the proposed ensemble model shows the best HD detection performance: 0.946, 0.949, and 0.951 for the PTB-ECG dataset and 0.921, 0.926, and 0.950 for the MIT-BIH dataset in terms of accuracy, F1-score, and AUC, respectively. The proposed methodology can also be employed for the classification of HD using ICG, MCG, and HS signals as inputs, and it can be applied to the detection of other diseases.
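The evaluation flow described in the abstract (train AdaBoost and LR on imbalanced data, combine them, and report accuracy, F1-score, and AUC) can be sketched with scikit-learn as below. This is only an illustrative sketch, not the authors' implementation. Several things here are assumptions: a synthetic imbalanced dataset from `make_classification` stands in for sampled ECG records, the 80/20 class ratio and all hyperparameters are arbitrary, and the helper names `evaluate` and `run_demo` are hypothetical. The abstract specifies majority voting; this sketch swaps in soft voting (averaged class probabilities) because a hard vote between only two models can tie and yields no continuous score from which to compute AUC.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate(model, X_train, y_train, X_test, y_test):
    """Fit a model and return the three metrics named in the abstract."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Probability of the positive (abnormal) class, needed for AUC.
    y_score = model.predict_proba(X_test)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_score),
    }


def run_demo(seed=0):
    # Assumption: synthetic features with an 80/20 class imbalance
    # stand in for sampled ECG records (normal vs. abnormal).
    X, y = make_classification(
        n_samples=1000,
        n_features=20,
        n_informative=8,
        weights=[0.8, 0.2],
        random_state=seed,
    )
    # Stratify so the train and test splits keep the same class ratio.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    ada = AdaBoostClassifier(n_estimators=100, random_state=seed)
    lr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    # Soft voting is a substitution for the paper's majority voting (see lead-in).
    ensemble = VotingClassifier(
        estimators=[("ada", ada), ("lr", lr)], voting="soft"
    )

    models = {"AdaBoost": ada, "LR": lr, "Ensemble": ensemble}
    return {
        name: evaluate(model, X_train, y_train, X_test, y_test)
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, metrics in run_demo().items():
        print(name, {k: round(v, 3) for k, v in metrics.items()})
```

The scores printed for synthetic data are unrelated to the values reported for PTB-ECG and MIT-BIH. With an imbalanced test set, accuracy alone can look strong even when the minority class is poorly detected, which is one reason to report F1-score and AUC alongside it.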