Hossein Soleimani, David J. Miller
School of Electrical Engineering and Computer Science, Pennsylvania State University, University Park, PA 16802, U.S.A.
Neural Comput. 2017 Apr;29(4):1053-1102. doi: 10.1162/NECO_a_00939. Epub 2017 Jan 17.
Many classification tasks require both labeling objects and determining label associations for parts of each object. Example applications include labeling segments of images or determining relevant parts of a text document when the training labels are available only at the image or document level. This task is usually referred to as multi-instance (MI) learning, where the learner typically receives a collection of labeled (or sometimes unlabeled) bags, each containing several segments (instances). We propose a semisupervised MI learning method for multilabel classification. Most MI learning methods treat instances in each bag as independent and identically distributed samples. However, in many practical applications, instances are related to each other and should not be considered independent. Our model discovers a latent low-dimensional space that captures structure within each bag. Further, unlike many other MI learning methods, which are primarily developed for binary classification, we model multiple classes jointly, thus also capturing possible dependencies between different classes. We develop our model within a semisupervised framework, which leverages both labeled and, typically, a larger set of unlabeled bags for training. We develop several efficient inference methods for our model. We first introduce a Markov chain Monte Carlo method for inference, which can handle arbitrary relations between bag labels and instance labels, including the standard hard-max MI assumption. We also develop an extension of our model that uses stochastic variational Bayes methods for inference, and thus scales better to massive data sets. Experiments show that our approach outperforms several MI learning and standard classification methods on both bag-level and instance-level label prediction. All code for replicating our experiments is available from https://github.com/hsoleimani/MLTM.
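To make the standard hard-max MI assumption mentioned in the abstract concrete, the following minimal sketch derives multilabel bag labels from binary instance labels: a bag carries a class exactly when at least one of its instances does. This illustrates the assumption only; it is not the authors' model or inference code, and the function name `bag_labels_hard_max` and the toy data are hypothetical.

```python
from typing import List


def bag_labels_hard_max(instance_labels: List[List[int]]) -> List[int]:
    """Derive multilabel bag labels from binary instance labels.

    instance_labels[i][c] is 1 if instance i carries class c, else 0.
    Under the hard-max assumption, the bag label for class c is the
    maximum of the instance labels for class c.
    (Hypothetical helper, for illustration only.)
    """
    if not instance_labels:
        raise ValueError("a bag must contain at least one instance")
    num_classes = len(instance_labels[0])
    return [max(row[c] for row in instance_labels) for c in range(num_classes)]


# Toy bag with three instances (e.g. image segments) and three classes.
bag = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
]
print(bag_labels_hard_max(bag))  # prints [1, 1, 0]
```

In MI training only the bag-level vector is observed, so the instance-level rows play the role of latent variables that the learner must infer.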