Deep-LIFT: Deep Label-Specific Feature Learning for Image Annotation.

Author Information

Li Junbing, Zhang Changqing, Zhou Joey Tianyi, Fu Huazhu, Xia Shuyin, Hu Qinghua

Publication Information

IEEE Trans Cybern. 2022 Aug;52(8):7732-7741. doi: 10.1109/TCYB.2021.3049630. Epub 2022 Jul 19.

Abstract

Image annotation aims to jointly predict multiple tags for an image. Although significant progress has been made, existing approaches usually overlook aligning specific labels with their corresponding regions because the supervision is weak (i.e., only a "bag of labels" is given for the regions), and thus fail to explicitly exploit the discriminative information of different classes. In this article, we propose the deep label-specific feature (Deep-LIFT) learning model to build an explicit and exact correspondence between each label and its local visual region, which improves the effectiveness of feature learning and enhances the interpretability of the model itself. Deep-LIFT extracts features for each label by aligning the label with its region; specifically, the label-specific features are obtained by learning multiple correlation maps between image convolutional features and label embeddings. Moreover, we construct two variant graph convolutional networks (GCNs) to further capture the interdependency among labels. Empirical studies on benchmark datasets show that the proposed model outperforms existing state-of-the-art methods on multilabel classification.
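The abstract names two components: label-specific features learned as correlation maps between convolutional features and label embeddings, and GCNs that model label interdependency. Below is a minimal PyTorch sketch of those two ideas as described in the abstract; it is not the authors' implementation, and the module names, dimensions, attention-style pooling, and the single plain GCN layer are all illustrative assumptions.

```python
# Minimal sketch (assumptions, not the authors' code): one correlation map per
# label between CNN features and a learned label embedding, attention-pooled
# into a label-specific feature, plus a plain one-layer GCN over a label graph
# as a stand-in for the paper's two GCN variants.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LabelSpecificFeatures(nn.Module):
    def __init__(self, feat_dim: int, embed_dim: int, num_labels: int):
        super().__init__()
        # One learnable embedding per label (dimension chosen arbitrarily here).
        self.label_embed = nn.Parameter(torch.randn(num_labels, embed_dim))
        # 1x1 conv projects backbone features into the label-embedding space.
        self.proj = nn.Conv2d(feat_dim, embed_dim, kernel_size=1)

    def forward(self, feat_map: torch.Tensor) -> torch.Tensor:
        # feat_map: (B, feat_dim, H, W) from a CNN backbone, e.g. ResNet conv5.
        x = self.proj(feat_map).flatten(2)                        # (B, E, H*W)
        # Correlation map: similarity of each label embedding to each location.
        corr = torch.einsum("le,beh->blh", self.label_embed, x)   # (B, L, H*W)
        attn = corr.softmax(dim=-1)                                # where each label "looks"
        # Label-specific feature: attention-weighted pooling over locations.
        return torch.einsum("blh,beh->ble", attn, x)               # (B, L, E)


class LabelGCN(nn.Module):
    """One standard GCN layer propagating label features over a label graph."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, label_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # adj: (L, L) normalized label relation matrix (assumed precomputed).
        return F.relu(self.linear(adj @ label_feats))


if __name__ == "__main__":
    feats = LabelSpecificFeatures(feat_dim=2048, embed_dim=300, num_labels=80)
    gcn = LabelGCN(in_dim=300, out_dim=300)
    fmap = torch.randn(2, 2048, 14, 14)      # dummy backbone output
    lift = feats(fmap)                        # (2, 80, 300)
    adj = torch.eye(80)                       # placeholder label graph
    refined = gcn(lift[0], adj)               # (80, 300) for one image
    print(lift.shape, refined.shape)
```

In the paper, the label graph would encode interdependency among labels (e.g., co-occurrence statistics); the identity matrix above is only a placeholder so the snippet runs.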

