Multiview matrix completion for multilabel image classification.

Publication information

IEEE Trans Image Process. 2015 Aug;24(8):2355-68. doi: 10.1109/TIP.2015.2421309. Epub 2015 Apr 9.

Abstract

There is growing interest in multilabel image classification because of its critical role in web-based image analytics applications, such as large-scale image retrieval and browsing. Matrix completion (MC) has recently been introduced as a method for transductive (semisupervised) multilabel classification and has several distinct advantages, including robustness to missing data and background noise in both feature and label space. However, it is limited to data represented by a single-view feature, which cannot precisely characterize images containing several semantic concepts. With standard MC, utilizing multiple features taken from different views requires concatenating them into one long vector. This concatenation is prone to overfitting and often leads to very high time complexity in MC-based image classification. Therefore, we propose combining the MC outputs of different views with learned weights, and present the multiview MC (MVMC) framework for transductive multilabel image classification. To learn the view combination weights effectively, we apply a cross-validation strategy on the labeled set. In particular, MVMC splits the labeled set into two parts and predicts the labels of one part using the known labels of the other part. The predicted labels are then used to learn the view combination coefficients. In the learning process, we adopt the average precision (AP) loss, which is particularly suitable for multilabel image classification, since ranking-based criteria are critical for evaluating a multilabel classification system. A least squares loss formulation is also presented for the sake of efficiency, and the robustness of the algorithm based on the AP loss compared with the other losses is investigated. Experimental evaluation on two real-world data sets (PASCAL VOC'07 and MIR Flickr) demonstrates the effectiveness of MVMC for transductive (semisupervised) multilabel image classification and shows that MVMC can exploit complementary properties of different features and output consistent labels for improved multilabel image classification.
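
The weighted combination of per-view outputs described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the per-view MC predictions on the held-out part of the labeled set are already available as score matrices, fits one weight per view with an unconstrained least squares problem (the paper may constrain the weights, which the abstract does not state), and evaluates a ranking with a plain average precision function. The names `learn_view_weights`, `combine_views`, and `average_precision` are hypothetical.

```python
import numpy as np


def learn_view_weights(view_scores, labels):
    """Fit one combination weight per view by ordinary least squares.

    view_scores: list of (n_samples, n_labels) arrays, one per view, holding
                 predicted scores for the held-out part of the labeled set.
    labels:      (n_samples, n_labels) array of known labels for that part.
    Assumption: the fit is unconstrained; no simplex or sign constraint.
    """
    # Each flattened view becomes one column of the design matrix.
    design = np.stack([s.ravel() for s in view_scores], axis=1)
    target = np.asarray(labels, dtype=float).ravel()
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return weights


def combine_views(view_scores, weights):
    """Weighted sum of the per-view score matrices."""
    return sum(w * s for w, s in zip(weights, view_scores))


def average_precision(scores, relevant):
    """Average precision of the ranking induced by `scores` for one label."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    hits = np.asarray(relevant)[order].astype(bool)
    if not hits.any():
        return 0.0
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for two views: one clean, one noisy (illustration only).
    truth = (rng.random((50, 4)) > 0.7).astype(float)
    clean_view = truth + 0.1 * rng.standard_normal(truth.shape)
    noisy_view = truth + 1.0 * rng.standard_normal(truth.shape)

    w = learn_view_weights([clean_view, noisy_view], truth)
    fused = combine_views([clean_view, noisy_view], w)
    print("view weights:", w)
    print("AP for label 0:", average_precision(fused[:, 0], truth[:, 0]))
```

In this sketch the least squares fit plays the role of the efficient loss mentioned in the abstract, while `average_precision` only scores a ranking; optimizing the AP loss directly, as the paper proposes, would need a different solver.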
