Bag-Level Aggregation for Multiple-Instance Active Learning in Instance Classification Problems.

Authors

Carbonneau Marc-Andre, Granger Eric, Gagnon Ghyslain

Publication information

IEEE Trans Neural Netw Learn Syst. 2019 May;30(5):1441-1451. doi: 10.1109/TNNLS.2018.2869164. Epub 2018 Oct 1.

Abstract

A growing number of applications, e.g., video surveillance and medical image analysis, require training recognition systems from large amounts of weakly annotated data, while some targeted interactions with a domain expert are allowed to improve the training process. In such cases, active learning (AL) can reduce labeling costs for training a classifier by querying the expert to provide the labels of most informative instances. This paper focuses on AL methods for instance classification problems in multiple instance learning (MIL), where data are arranged into sets, called bags, which are weakly labeled. Most AL methods focus on single-instance learning problems. These methods are not suitable for MIL problems because they cannot account for the bag structure of data. In this paper, new methods for bag-level aggregation of instance informativeness are proposed for multiple instance AL (MIAL). The aggregated informativeness method identifies the most informative instances based on classifier uncertainty and queries bags incorporating the most information. The other proposed method, called cluster-based aggregative sampling, clusters data hierarchically in the instance space. The informativeness of instances is assessed by considering bag labels, inferred instance labels, and the proportion of labels that remain to be discovered in clusters. Both proposed methods significantly outperform reference methods in extensive experiments using benchmark data from several application domains. Results indicate that using an appropriate strategy to address MIAL problems yields a significant reduction in the number of queries needed to achieve the same level of performance as single-instance AL methods.
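
The abstract does not give the exact scoring rule, so the following is only a minimal sketch of the general idea behind bag-level aggregation of instance uncertainty, not the authors' implementation. It assumes a binary classifier that outputs a positive-class probability per instance, measures uncertainty as closeness to 0.5, and scores each bag by summing its `top_k` most uncertain instances. The function names, the uncertainty measure, and the `top_k` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def instance_uncertainty(pos_prob):
    """Map positive-class probabilities to [0, 1]: 1 at p = 0.5, 0 at p = 0 or 1.

    Assumption: this margin-style measure stands in for whatever
    uncertainty measure the paper actually uses.
    """
    pos_prob = np.asarray(pos_prob, dtype=float)
    return 1.0 - np.abs(2.0 * pos_prob - 1.0)


def select_bag(pos_prob, bag_ids, top_k=2):
    """Return (id of the bag to query, dict of aggregated scores per bag).

    Assumption: a bag's score is the sum of its top_k instance
    uncertainties; the paper's aggregation rule may differ.
    """
    unc = instance_uncertainty(pos_prob)
    bag_ids = np.asarray(bag_ids)
    scores = {}
    for bag in np.unique(bag_ids):
        # Sort this bag's instance uncertainties in descending order.
        bag_unc = np.sort(unc[bag_ids == bag])[::-1]
        scores[bag.item()] = float(bag_unc[:top_k].sum())
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: eight instances in three bags (hypothetical classifier outputs).
pos_prob = [0.95, 0.90, 0.55, 0.45, 0.50, 0.05, 0.99, 0.60]
bag_ids = [0, 0, 1, 1, 1, 2, 2, 2]
best, scores = select_bag(pos_prob, bag_ids)
print(best, scores)
```

In this toy example bag 1 is selected because its instances all lie near the decision boundary. In an MIL setting the candidate bags would typically be restricted to positive bags, since instances in negative bags are already known to be negative.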
