Pasolli Edoardo, Melgani Farid
Department of Information Engineering and Computer Science, University of Trento, 38123 Trento, Italy.
IEEE Trans Inf Technol Biomed. 2010 Nov;14(6):1405-16. doi: 10.1109/TITB.2010.2048922.
In this paper, we present three active learning strategies for the classification of electrocardiographic (ECG) signals. Starting from a small and suboptimal training set, these strategies select additional beat samples from a large set of unlabeled data. The selected samples are labeled manually and then added to the training set. The procedure is iterated until the resulting training set is representative of the considered classification problem. The proposed methods are all based on support vector machine (SVM) classification and rely, respectively, on the following principles: 1) margin sampling; 2) posterior probability; and 3) query by committee. To illustrate their performance, we conducted an experimental study on both simulated data and real ECG signals from the MIT-BIH arrhythmia database. In general, the results show that the proposed strategies exhibit a promising capability to select samples that are significant for the classification process, i.e., to boost classification accuracy while minimizing the number of labeled samples required.
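The iterative procedure described in the abstract can be illustrated with a minimal sketch of the first principle, margin sampling, under several assumptions. The function names (`margin_sampling`, `active_learning_loop`, `fit_midpoint`) are hypothetical, the "classifier" is a toy one-dimensional midpoint rule standing in for the SVM, and the oracle is a function that plays the role of the human expert who labels beats. None of this is taken from the paper itself; it only shows the select, label, retrain cycle.

```python
from typing import Callable, List, Sequence, Tuple

Sample = float
Labeled = Tuple[Sample, int]  # (feature, label in {-1, +1})


def margin_sampling(scores: Sequence[float], batch_size: int) -> List[int]:
    """Return indices of the samples whose decision value is closest to 0,
    i.e. the samples nearest to the current decision boundary."""
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    return ranked[:batch_size]


def fit_midpoint(labeled: Sequence[Labeled]) -> Callable[[Sample], float]:
    """Toy stand-in for SVM training (assumption, not the paper's classifier):
    the boundary is the midpoint between the two class means, and the returned
    function gives the signed distance to it. Needs both classes present."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == -1]
    midpoint = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: x - midpoint


def active_learning_loop(
    labeled: Sequence[Labeled],
    pool: Sequence[Sample],
    oracle: Callable[[Sample], int],
    batch_size: int,
    n_iterations: int,
) -> List[Labeled]:
    """Grow the training set by repeatedly querying the most uncertain samples."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(n_iterations):
        if not pool:
            break
        decision = fit_midpoint(labeled)          # retrain on current set
        scores = [decision(x) for x in pool]      # score the unlabeled pool
        picked = margin_sampling(scores, batch_size)
        new_samples = [pool[i] for i in picked]
        for i in sorted(picked, reverse=True):    # remove picked from the pool
            pool.pop(i)
        # "Manual" labeling step, simulated here by the oracle function.
        labeled.extend((x, oracle(x)) for x in new_samples)
    return labeled


if __name__ == "__main__":
    initial = [(0.0, -1), (1.0, 1)]               # small, suboptimal start
    unlabeled = [0.1, 0.4, 0.55, 0.9]
    expert = lambda x: 1 if x > 0.5 else -1       # hypothetical ground truth
    print(active_learning_loop(initial, unlabeled, expert, 1, 2))
```

In this toy run the samples near the boundary are queried first, while the samples far from it stay in the pool. The other two principles would change only the selection step: a posterior-probability criterion would rank samples by how uncertain the estimated class probabilities are, and query by committee would rank them by disagreement among several classifiers.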