Semisupervised, Multilabel, Multi-Instance Learning for Structured Data.

Author information

Soleimani Hossein, Miller David J

Affiliation

School of Electrical Engineering and Computer Science, Pennsylvania State University, University Park, PA 16802, U.S.A.

Publication information

Neural Comput. 2017 Apr;29(4):1053-1102. doi: 10.1162/NECO_a_00939. Epub 2017 Jan 17.

DOI:10.1162/NECO_a_00939

PMID:28095193

Abstract

Many classification tasks require both labeling objects and determining label associations for parts of each object. Example applications include labeling segments of images or determining relevant parts of a text document when the training labels are available only at the image or document level. This task is usually referred to as multi-instance (MI) learning, where the learner typically receives a collection of labeled (or sometimes unlabeled) bags, each containing several segments (instances). We propose a semisupervised MI learning method for multilabel classification. Most MI learning methods treat instances in each bag as independent and identically distributed samples. However, in many practical applications, instances are related to each other and should not be considered independent. Our model discovers a latent low-dimensional space that captures structure within each bag. Further, unlike many other MI learning methods, which are primarily developed for binary classification, we model multiple classes jointly, thus also capturing possible dependencies between different classes. We develop our model within a semisupervised framework, which leverages both labeled and, typically, a larger set of unlabeled bags for training. We develop several efficient inference methods for our model. We first introduce a Markov chain Monte Carlo method for inference, which can handle arbitrary relations between bag labels and instance labels, including the standard hard-max MI assumption. We also develop an extension of our model that uses stochastic variational Bayes methods for inference, and thus scales better to massive data sets. Experiments show that our approach outperforms several MI learning and standard classification methods on both bag-level and instance-level label prediction. All code for replicating our experiments is available from https://github.com/hsoleimani/MLTM .
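To make the standard hard-max MI assumption mentioned in the abstract concrete: a bag carries a class label if and only if at least one of its instances carries that label. The minimal Python sketch below illustrates this for the multilabel case. The function name `bag_labels_hard_max` and the toy data are hypothetical, chosen only for illustration; they are not taken from the authors' MLTM code, and the sketch does not implement the paper's model or its inference methods.

```python
from typing import List, Sequence


def bag_labels_hard_max(instance_labels: Sequence[Sequence[int]],
                        n_classes: int) -> List[int]:
    """Derive bag-level labels from instance-level labels (hard-max rule).

    instance_labels: one binary row of length n_classes per instance.
    Returns a binary list of length n_classes, where class c is 1 if at
    least one instance in the bag has class c, and 0 otherwise.
    """
    bag = [0] * n_classes  # an empty bag has no positive class
    for row in instance_labels:
        if len(row) != n_classes:
            raise ValueError("each instance needs one label per class")
        for c, y in enumerate(row):
            # Elementwise max over instances implements the hard-max rule.
            bag[c] = max(bag[c], int(y))
    return bag


# Hypothetical toy bag: 3 instances (e.g. sentences of a document), 4 classes.
toy_bag = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
toy_bag_labels = bag_labels_hard_max(toy_bag, n_classes=4)
```

In the training setting the abstract describes, only bag-level labels such as `toy_bag_labels` would be observed (and only for the labeled bags), while the instance-level rows would be latent quantities to infer.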

Similar articles

Dynamic Programming for Instance Annotation in Multi-Instance Multi-Label Learning.

IEEE Trans Pattern Anal Mach Intell. 2017 Dec;39(12):2381-2394. doi: 10.1109/TPAMI.2017.2647944. Epub 2017 Jan 5.

Learning From Weakly Labeled Data Based on Manifold Regularized Sparse Model.

IEEE Trans Cybern. 2022 May;52(5):3841-3854. doi: 10.1109/TCYB.2020.3015269. Epub 2022 May 19.

Improving Web image search by bag-based reranking.

IEEE Trans Image Process. 2011 Nov;20(11):3280-90. doi: 10.1109/TIP.2011.2159227. Epub 2011 Jun 9.

Convex formulation of multiple instance learning from positive and unlabeled bags.

Neural Netw. 2018 Sep;105:132-141. doi: 10.1016/j.neunet.2018.05.001. Epub 2018 May 24.

Structured max-margin learning for inter-related classifier training and multilabel image annotation.

IEEE Trans Image Process. 2011 Mar;20(3):837-54. doi: 10.1109/TIP.2010.2073476. Epub 2010 Sep 7.

Multiview Multi-Instance Multilabel Active Learning.

IEEE Trans Neural Netw Learn Syst. 2022 Sep;33(9):4311-4321. doi: 10.1109/TNNLS.2021.3056436. Epub 2022 Aug 31.

A Transfer Learning-Based Multi-Instance Learning Method With Weak Labels.

IEEE Trans Cybern. 2022 Jan;52(1):287-300. doi: 10.1109/TCYB.2020.2973450. Epub 2022 Jan 11.

MILES: multiple-instance learning via embedded instance selection.

IEEE Trans Pattern Anal Mach Intell. 2006 Dec;28(12):1931-47. doi: 10.1109/TPAMI.2006.248.

Learning Semisupervised Multilabel Fully Convolutional Network for Hierarchical Object Parsing.

IEEE Trans Neural Netw Learn Syst. 2020 Jul;31(7):2500-2509. doi: 10.1109/TNNLS.2019.2931183. Epub 2019 Dec 23.
