SLED: Semantic Label Embedding Dictionary Representation for Multilabel Image Annotation.

Publication Information

IEEE Trans Image Process. 2015 Sep;24(9):2746-59. doi: 10.1109/TIP.2015.2428055. Epub 2015 Apr 29.

Abstract

Most existing methods for weakly supervised image annotation rely on a joint, unsupervised feature representation whose components are not directly correlated with specific labels. In practice, however, there is often a large gap between the training and testing data; for example, the label combinations in the test data are not always consistent with those in the training data. To bridge this gap, this paper presents a semantic label embedding dictionary representation that not only achieves a discriminative feature representation for each label in an image, but also mines the semantic relevance between co-occurring labels for context information. More specifically, to enhance the discriminative representation of labels, the training data are first divided into a set of overlapping groups by graph shift on an exclusive label graph. Then, given a group of exclusive labels, we learn multiple label-specific dictionaries to explicitly decorrelate the feature representation of each label; a joint optimization approach based on the Fisher discrimination criterion is proposed to solve this problem. Next, to discover the context information hidden in co-occurring labels, we explore the semantic relationship between the visual words in the dictionaries and the labels in a multitask learning manner, with respect to the reconstruction coefficients of the training data. In the annotation stage, the reconstruction coefficients of a test image are obtained using the discriminative dictionaries, the exclusive label groups, and a group sparsity constraint. Finally, a label propagation scheme computes a score for each label of the test image from its reconstruction coefficients. Experimental results on three challenging data sets show that the proposed method yields significant performance gains over existing methods.
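The dictionary learning step described above is driven by the Fisher discrimination criterion. As a rough illustration only (the notation D, X, A_c, S_W, S_B, lambda, eta is ours and follows the general Fisher discrimination dictionary learning formulation, not necessarily the paper's exact objective), learning label-specific dictionaries D = [D_1, ..., D_C] with coding matrix X over training features grouped by label can be sketched as

\min_{D,X} \sum_{c=1}^{C} \lVert A_c - D X_c \rVert_F^2 + \lambda_1 \lVert X \rVert_1 + \lambda_2 \big( \operatorname{tr} S_W(X) - \operatorname{tr} S_B(X) + \eta \lVert X \rVert_F^2 \big),

where A_c and X_c collect the samples and codes of label group c, and S_W(X), S_B(X) are the within-group and between-group scatter matrices of the codes. Shrinking tr S_W while enlarging tr S_B pulls codes of the same label together and pushes codes of different (exclusive) labels apart, which is the sense in which the per-label feature representations are decorrelated; the trailing eta-weighted Frobenius term stabilizes the scatter-based part of the objective.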

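For the annotation stage, the abstract describes recovering group-sparse reconstruction coefficients of a test image over the learned label-specific dictionaries and then scoring labels from those coefficients via label propagation. The minimal Python sketch below is a stand-in under simplifying assumptions, not the authors' implementation: it solves a plain group-lasso coding problem with ISTA and scores each label by the reconstruction residual of its sub-dictionary, omitting the exclusive label groups and the propagation step. All names (group_sparse_code, label_scores, and so on) are illustrative.

import numpy as np

# Illustrative sketch only: a plain group-lasso coder standing in for the
# paper's group-sparse coding; not the authors' implementation.

def group_soft_threshold(z, groups, tau):
    """Proximal operator of tau * sum_g ||x_g||_2 (block soft-thresholding)."""
    x = np.zeros_like(z)
    for g in groups:
        norm = np.linalg.norm(z[g])
        if norm > tau:
            x[g] = (1.0 - tau / norm) * z[g]
    return x

def group_sparse_code(D, y, groups, lam=0.1, n_iter=300):
    """ISTA for min_x 0.5*||y - D x||^2 + lam * sum_g ||x_g||_2.

    D: (d, k) stacked label-specific dictionaries (unit-norm atoms),
    y: (d,) test feature, groups: list of index arrays, one per dictionary.
    """
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)   # 1 / Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)
        x = group_soft_threshold(x - step * grad, groups, lam * step)
    return x

def label_scores(D, y, x, groups, labels):
    """Score each label by the residual of its sub-dictionary (lower residual, higher score)."""
    return {lab: -np.linalg.norm(y - D[:, g] @ x[g]) for lab, g in zip(labels, groups)}

# Toy example: two label-specific dictionaries of 8 atoms each over 32-d features.
rng = np.random.default_rng(0)
D = rng.standard_normal((32, 16))
D /= np.linalg.norm(D, axis=0)                  # unit-norm atoms
groups = [np.arange(0, 8), np.arange(8, 16)]
y = D[:, groups[0]] @ rng.standard_normal(8)    # feature generated from label 0's atoms
x = group_sparse_code(D, y, groups, lam=0.05)
print(label_scores(D, y, x, groups, ["label_0", "label_1"]))

Replacing the residual-based scoring with a propagation step over a label co-occurrence graph would bring this sketch closer to the scheme described in the abstract.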
