Learning Multi-Instance Deep Discriminative Patterns for Image Classification.

Publication information

IEEE Trans Image Process. 2017 Jul;26(7):3385-3396. doi: 10.1109/TIP.2016.2642781. Epub 2016 Dec 21.

Abstract

Finding an effective and efficient representation is very important for image classification. The most common approach is to extract a set of local descriptors and then aggregate them into a high-dimensional, more semantic feature vector, as in unsupervised bag-of-features and weakly supervised part-based models. The latter are usually more discriminative than the former because they use information from image labels. In this paper, we propose a weakly supervised strategy that uses multi-instance learning (MIL) to learn discriminative patterns for image representation. Specifically, we extend traditional multi-instance methods to explicitly learn more than one pattern in the positive class and to find the "most positive" instance for each pattern. Furthermore, because the positiveness of an instance is treated as a continuous variable, we can use stochastic gradient descent to maximize the margin between different patterns while respecting the MIL constraints. To make the learned patterns more discriminative, local descriptors extracted by deep convolutional neural networks are used instead of hand-crafted descriptors. Experimental results on several widely used benchmarks (Action 40, Caltech 101, Scene 15, MIT-indoor, SUN 397) show that our method achieves strong performance.
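The abstract does not give the objective function, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a binary setting, a linear scorer per pattern, and a hinge loss. An image is treated as a bag of local descriptors, each pattern picks its "most positive" instance, and one subgradient step enforces the usual MIL constraint that a positive bag triggers at least one pattern while a negative bag triggers none. All names (`pattern_responses`, `bag_loss`, `sgd_step`) and hyperparameters are hypothetical.

```python
import numpy as np

def pattern_responses(W, b, bag):
    """Max response of each pattern over a bag, plus the arg-max instance."""
    scores = bag @ W.T + b                 # shape (n_instances, n_patterns)
    top = scores.argmax(axis=0)            # "most positive" instance per pattern
    return scores[top, np.arange(W.shape[0])], top

def bag_loss(W, b, bag, label):
    """Hinge loss under the assumed MIL constraints (label is +1 or -1)."""
    resp, _ = pattern_responses(W, b, bag)
    if label > 0:
        # Positive bag: at least one pattern should respond with margin 1.
        return max(0.0, 1.0 - float(resp.max()))
    # Negative bag: every pattern should stay below -1 on every instance.
    return float(np.maximum(0.0, 1.0 + resp).sum())

def sgd_step(W, b, bag, label, lr=0.05, reg=1e-4):
    """One in-place subgradient step on a single bag (assumed update rule)."""
    resp, top = pattern_responses(W, b, bag)
    W *= 1.0 - lr * reg                    # L2 shrinkage on the pattern weights
    if label > 0:
        k = int(resp.argmax())             # best-matching pattern for this bag
        if resp[k] < 1.0:
            W[k] += lr * bag[top[k]]
            b[k] += lr
    else:
        for k in np.flatnonzero(resp > -1.0):
            W[k] -= lr * bag[top[k]]
            b[k] -= lr

# Toy usage: 3 patterns over 5-dimensional descriptors (synthetic data).
rng = np.random.default_rng(0)
W = 0.01 * rng.standard_normal((3, 5))
b = np.zeros(3)
image_bag = 0.1 * rng.standard_normal((6, 5))   # stand-in for CNN descriptors
image_bag[2, 0] += 3.0                          # one clearly distinctive instance
sgd_step(W, b, image_bag, +1)
# The per-pattern max responses can serve as the image representation.
representation, _ = pattern_responses(W, b, image_bag)
```

In this sketch the image feature has one dimension per learned pattern. In practice the descriptors would come from a convolutional network rather than random numbers, and the loop would run over many labeled images.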
