IEEE Trans Cybern. 2014 Apr;44(4):500-15. doi: 10.1109/TCYB.2013.2257749. Epub 2013 May 16.
Multiple-instance learning (MIL) is a generalization of supervised learning that attempts to learn useful information from bags of instances. In MIL, the true labels of instances in positive bags are not available for training. This leads to a critical challenge: handling instances whose labels are ambiguous (ambiguous instances). To deal with these ambiguous instances, we propose a novel MIL approach called similarity-based multiple-instance learning (SMILE). Rather than excluding some of the ambiguous instances in positive bags from classifier training, as some previous MIL methods do, SMILE handles the ambiguous instances explicitly by considering their similarity to the positive class and the negative class. Specifically, a subset of instances is selected from the positive bags as positive candidates, and each remaining ambiguous instance is associated with two similarity weights, representing its similarity to the positive class and to the negative class, respectively. The ambiguous instances, together with their similarity weights, are then incorporated into the learning phase to build an extended SVM-based predictive classifier. A heuristic framework is employed to update the positive candidates and the similarity weights in order to refine the classification boundary. Experiments on real-world datasets show that SMILE achieves highly competitive classification accuracy and is less sensitive to labeling noise than existing MIL methods.
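The loop described in the abstract (select positive candidates, weight the remaining ambiguous instances, train a weighted classifier, update) can be illustrated with a minimal Python sketch. The abstract gives neither the similarity formula nor the extended SVM formulation, so everything concrete here is an assumption made for illustration: the Gaussian similarity, the normalization of the two weights to sum to 1, the trick of adding each ambiguous instance once per label with its weight, the subgradient-trained linear hinge-loss classifier standing in for the extended SVM, and the candidate re-selection rule. All function names are hypothetical.

```python
import math

def rbf(a, b, gamma=1.0):
    """Gaussian similarity between two feature vectors (assumed measure)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def similarity_weights(x, positives, negatives, gamma=1.0):
    """Return (s_pos, s_neg): mean similarity of x to each class, normalized to sum to 1."""
    s_pos = sum(rbf(x, p, gamma) for p in positives) / len(positives)
    s_neg = sum(rbf(x, n, gamma) for n in negatives) / len(negatives)
    total = s_pos + s_neg
    if total == 0.0:
        return 0.5, 0.5
    return s_pos / total, s_neg / total

def score(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train_weighted_linear(samples, dim, epochs=200, lr=0.1, reg=0.01):
    """Subgradient descent on a weighted hinge loss; samples are (x, y, weight).
    A stand-in for the paper's extended SVM, not its actual formulation."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y, c in samples:
            margin = y * score(w, b, x)
            w = [wi * (1.0 - lr * reg) for wi in w]  # L2 shrinkage
            if margin < 1.0:
                w = [wi + lr * c * y * xi for wi, xi in zip(w, x)]
                b += lr * c * y
    return w, b

def smile_sketch(pos_bags, neg_bags, rounds=3, gamma=1.0):
    """Alternate between training and re-selecting one candidate per positive bag."""
    dim = len(neg_bags[0][0])
    negatives = [x for bag in neg_bags for x in bag]
    # Assumed initialization: per positive bag, the instance least similar to the negatives.
    cand = [min(range(len(bag)),
                key=lambda i, bag=bag: sum(rbf(bag[i], n, gamma) for n in negatives))
            for bag in pos_bags]
    w, b = [0.0] * dim, 0.0
    for _ in range(rounds):
        positives = [bag[i] for bag, i in zip(pos_bags, cand)]
        samples = [(x, 1, 1.0) for x in positives] + [(x, -1, 1.0) for x in negatives]
        for bag, i in zip(pos_bags, cand):
            for j, x in enumerate(bag):
                if j == i:
                    continue
                s_pos, s_neg = similarity_weights(x, positives, negatives, gamma)
                samples.append((x, 1, s_pos))   # ambiguous instance as positive
                samples.append((x, -1, s_neg))  # same instance as negative
        w, b = train_weighted_linear(samples, dim)
        # Assumed update rule: the highest-scoring instance becomes the new candidate.
        cand = [max(range(len(bag)), key=lambda i, bag=bag: score(w, b, bag[i]))
                for bag in pos_bags]
    return w, b, cand
```

On toy data such as `pos_bags = [[(2.0, 2.0), (-2.0, -2.0)], [(2.5, 1.5), (-1.5, -2.5)]]` and `neg_bags = [[(-2.0, -1.5), (-1.0, -2.0)], [(-2.5, -2.0)]]`, the ambiguous instances near the negative cluster receive a negative-class weight close to 1, so they contribute to training almost entirely as negatives instead of being discarded.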