Khozeimeh Fahime, Alizadehsani Roohallah, Shirani Milad, Tartibi Mehrzad, Shoeibi Afshin, Alinejad-Rokny Hamid, Harlapur Chandrashekhar, Sultanzadeh Sayed Javed, Khosravi Abbas, Nahavandi Saeid, Tan Ru-San, Acharya U Rajendra
Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Geelong, Australia.
Comput Biol Med. 2023 May;158:106841. doi: 10.1016/j.compbiomed.2023.106841. Epub 2023 Mar 31.
Invasive angiography is the reference standard for coronary artery disease (CAD) diagnosis, but it is expensive and carries certain risks. Machine learning (ML) on clinical and noninvasive imaging parameters can be used for CAD diagnosis to avoid the side effects and cost of angiography. However, ML methods require labeled samples for efficient training. Active learning can mitigate the scarcity of labeled data and the high cost of labeling by selectively querying challenging samples for labeling. To the best of our knowledge, active learning has not yet been used for CAD diagnosis. An Active Learning with Ensemble of Classifiers (ALEC) method consisting of four classifiers is proposed for CAD diagnosis. Three of these classifiers determine whether each of a patient's three main coronary arteries is stenotic. The fourth classifier predicts whether the patient has CAD. ALEC is first trained using labeled samples. For each unlabeled sample, if the outputs of the classifiers are consistent, the sample is added to the pool of labeled samples along with its predicted label. Inconsistent samples are manually labeled by medical experts before being added to the pool. Training is then performed again using the samples labeled so far. The interleaved phases of labeling and training are repeated until all samples are labeled. Compared with 19 other active learning algorithms, ALEC combined with a support vector machine classifier attained superior performance, with 97.01% accuracy. The method is also justified mathematically. We also comprehensively analyze the CAD dataset used in this paper. As part of the dataset analysis, pairwise feature correlations are computed. The top 15 features contributing to CAD and to stenosis of the three main coronary arteries are determined. The relationship between stenosis of the main arteries is presented using conditional probabilities. The effect of considering the number of stenotic arteries on sample discrimination is investigated. The discrimination power over dataset samples is visualized by treating each of the three main coronary arteries in turn as the sample label and the two remaining arteries as sample features.
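The interleaved labeling and training loop described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not define when classifier outputs count as "consistent", so the sketch assumes the CAD prediction must be positive exactly when at least one artery is predicted stenotic. Processing the unlabeled samples in batches is also an assumption. `CentroidClassifier` is a dependency-free stand-in for the support vector machine mentioned above, and all names (`alec_sketch`, `is_consistent`, `oracle`, `batch_size`) are hypothetical.

```python
import random


class CentroidClassifier:
    """Stand-in binary classifier (nearest class centroid); NOT the paper's SVM."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            if rows:  # a class may be absent from a small labeled pool
                self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=sq_dist)


def is_consistent(artery_preds, cad_pred):
    # ASSUMED rule: CAD is predicted iff at least one artery is predicted stenotic.
    return cad_pred == int(any(artery_preds))


def alec_sketch(X_lab, Y_lab, X_unlab, oracle, make_clf=CentroidClassifier, batch_size=10):
    """Each label row is (artery0, artery1, artery2, cad) with 0/1 entries."""
    X_pool, Y_pool = list(X_lab), [tuple(y) for y in Y_lab]
    remaining = list(X_unlab)
    n_queries = 0
    while remaining:
        # Training phase: one binary classifier per label column.
        clfs = [make_clf().fit(X_pool, [y[k] for y in Y_pool]) for k in range(4)]
        # Labeling phase on the next batch (batching is an assumption).
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        for x in batch:
            preds = tuple(clf.predict_one(x) for clf in clfs)
            if is_consistent(preds[:3], preds[3]):
                label = preds              # accept the predicted label
            else:
                label = tuple(oracle(x))   # query the (simulated) expert
                n_queries += 1
            X_pool.append(x)
            Y_pool.append(label)
    return X_pool, Y_pool, n_queries


# Synthetic demo: artery k counts as "stenotic" when feature k exceeds 0.5.
rng = random.Random(0)


def true_label(x):
    arteries = tuple(int(v > 0.5) for v in x)
    return arteries + (int(any(arteries)),)


points = [[rng.random() for _ in range(3)] for _ in range(60)]
X_lab, X_unlab = points[:20], points[20:]
Y_lab = [true_label(x) for x in X_lab]
X_all, Y_all, n_queries = alec_sketch(X_lab, Y_lab, X_unlab, oracle=true_label)
print(f"expert queries: {n_queries} of {len(X_unlab)} unlabeled samples")
```

In this sketch the expert is consulted only for samples on which the four classifiers disagree, which is where the labeling savings would come from. The `make_clf` argument is there so that a different classifier exposing the same `fit` and `predict_one` methods can be plugged in.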