Institute of Neural Information Processing, University of Ulm, D-89069 Ulm, Germany.
Neural Netw. 2010 May;23(4):497-509. doi: 10.1016/j.neunet.2009.09.001. Epub 2009 Sep 17.
Supervised learning requires a large amount of labeled data, but labeling is expensive and time-consuming because it relies on human experts. Co-Training is a semi-supervised learning method that reduces the amount of labeled data required by exploiting available unlabeled data to improve classification accuracy. It assumes that each pattern is represented by two or more redundantly sufficient feature sets (views) and that these views are conditionally independent given the class. At the same time, most real-world pattern recognition tasks involve a large number of categories, which makes them difficult. The tree-structured approach is an output space decomposition method in which a complex multi-class problem is decomposed into a set of binary sub-problems. In this paper, we propose two learning architectures that combine the merits of the tree-structured approach and Co-Training. We show that our architectures are especially useful for classification tasks with a large number of classes and a small amount of labeled data. In such tasks the single-view tree-structured approach does not perform well on its own, but when combined with Co-Training it can effectively exploit the independent views and the unlabeled data to improve recognition accuracy.
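The basic Co-Training loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's proposed architectures: it omits the tree-structured decomposition and shows only a flat, two-view Co-Training loop. The nearest-centroid base classifier, the margin-based confidence score, the synthetic data, and all names (`centroid_fit`, `centroid_predict`, `co_train`, `rounds`, `per_round`) are assumptions made for illustration.

```python
import random

def centroid_fit(X, y):
    """Nearest-centroid classifier (an assumed base learner): {label: mean vector}."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def centroid_predict(model, x):
    """Return (label, confidence); confidence is the squared-distance margin
    between the nearest and second-nearest centroid (an assumed heuristic)."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(x, m)), c)
                   for c, m in model.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin

def co_train(view1, view2, labels, rounds=5, per_round=2):
    """labels[i] is a class label, or None for an unlabeled pattern.
    Each round, one classifier per view labels its most confident unlabeled
    patterns, and those labels join the shared training set."""
    labels = list(labels)
    views = (view1, view2)
    for _ in range(rounds):
        labeled = [i for i, l in enumerate(labels) if l is not None]
        unlabeled = [i for i, l in enumerate(labels) if l is None]
        if not unlabeled:
            break
        y = [labels[i] for i in labeled]
        models = [centroid_fit([v[i] for i in labeled], y) for v in views]
        new = {}
        for v, m in zip(views, models):
            scored = sorted(((centroid_predict(m, v[i]), i) for i in unlabeled),
                            key=lambda t: -t[0][1])
            for (label, _), i in scored[:per_round]:
                new.setdefault(i, label)  # first view to claim a pattern wins
        for i, label in new.items():
            labels[i] = label
    labeled = [i for i, l in enumerate(labels) if l is not None]
    y = [labels[i] for i in labeled]
    final = [centroid_fit([v[i] for i in labeled], y) for v in views]
    return final[0], final[1], labels

# Synthetic two-class data with two views whose noise is drawn independently
# given the class (hypothetical data, for illustration only).
rng = random.Random(0)
truth = [i % 2 for i in range(20)]

def make_view(shift):
    return [[5.0 * c + shift + rng.gauss(0, 0.3),
             5.0 * c - shift + rng.gauss(0, 0.3)] for c in truth]

view_a, view_b = make_view(0.0), make_view(1.0)
initial = [truth[i] if i < 2 else None for i in range(20)]  # one label per class

model_a, model_b, final_labels = co_train(view_a, view_b, initial,
                                          rounds=10, per_round=2)
accuracy = sum(f == t for f, t in zip(final_labels, truth)) / len(truth)
print("labeled:", sum(l is not None for l in final_labels), "accuracy:", accuracy)
```

Starting from one labeled pattern per class, each round adds at least `per_round` pseudo-labels, so ten rounds are enough to label all 18 remaining patterns in this toy setup. In a multi-class setting of the kind the abstract targets, the flat base classifier here would presumably be replaced by a tree of binary classifiers per view.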