Curiosity-Driven Salient Object Detection With Fragment Attention.

Publication information

IEEE Trans Image Process. 2022;31:5989-6001. doi: 10.1109/TIP.2022.3203605. Epub 2022 Sep 19.

Abstract

Recent deep-learning-based salient object detection methods that use attention mechanisms have achieved great success. Existing attention mechanisms, however, fall broadly into two categories, each with a drawback. The first computes weights indiscriminately over all positions, which introduces computational redundancy. The second, such as hard attention, attends randomly to a small part of the image and produces errors because its selection of a token subset is insufficiently targeted. To alleviate these problems, we design a Curiosity-driven Network (CNet) and a Curiosity-driven Learning Algorithm (CLA), both based on a fragment attention (FA) mechanism newly defined in this paper. FA imitates the cognitive perception process driven by human curiosity and divides the degree of curiosity into three levels: curious, a little curious, and not curious. These three levels correspond to five saliency degrees: salient, non-salient, likewise salient, likewise non-salient, and completely uncertain. As the network gains more knowledge, CLA updates the curiosity degree of each pixel to yield enhanced, detail-rich saliency maps. To extract more context-aware information about potential salient objects and to lay a better foundation for CLA, we further propose a high-level feature extraction module (HFEM). With the improved high-level features extracted by HFEM, FA can classify the curiosity degree of each pixel more reasonably and accurately. Extensive experiments on five popular datasets demonstrate that our method outperforms state-of-the-art approaches without any pre-processing or post-processing operations.
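
The relationship between the three curiosity levels and the five saliency degrees can be illustrated with a minimal sketch. The abstract does not say how FA assigns degrees to pixels, so everything here is an assumption made for illustration: the function name `fragment_levels`, the fixed probability thresholds, and the pairing of degrees with levels (confident pixels as "not curious", "likewise" pixels as "a little curious", uncertain pixels as "curious"). It is not the authors' implementation.

```python
import numpy as np

# Five saliency degrees named in the abstract, ordered by predicted probability.
DEGREES = (
    "non-salient",
    "likewise non-salient",
    "completely uncertain",
    "likewise salient",
    "salient",
)
# Three curiosity levels named in the abstract.
CURIOSITY = ("not curious", "a little curious", "curious")

# Assumed pairing: degree index -> curiosity index (not stated in the abstract).
_DEGREE_TO_CURIOSITY = np.array([0, 1, 2, 1, 0])


def fragment_levels(prob, thresholds=(0.1, 0.4, 0.6, 0.9)):
    """Bucket a saliency probability map into degrees and curiosity levels.

    `thresholds` are hypothetical cut points chosen only for this example.
    """
    prob = np.asarray(prob, dtype=float)
    # np.digitize returns, per pixel, the index of the bin it falls into (0..4).
    degree = np.digitize(prob, thresholds)
    curiosity = _DEGREE_TO_CURIOSITY[degree]
    return degree, curiosity


# Toy 2x2 "saliency map" with one pixel in four of the five degrees.
prob = np.array([[0.02, 0.30], [0.50, 0.95]])
degree, curiosity = fragment_levels(prob)
for p, d, c in zip(prob.ravel(), degree.ravel(), curiosity.ravel()):
    print(f"p={p:.2f}: {DEGREES[d]} ({CURIOSITY[c]})")
```

Under this reading, a learning procedure such as CLA would revisit the "curious" and "a little curious" pixels as the network improves, while leaving confident pixels alone; how the paper actually performs that update is not described in the abstract.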
