Liu Yun, Zhang Xin-Yu, Bian Jia-Wang, Zhang Le, Cheng Ming-Ming
IEEE Trans Image Process. 2021;30:3804-3814. doi: 10.1109/TIP.2021.3065239. Epub 2021 Mar 25.
Recent progress on salient object detection (SOD) mostly benefits from the explosive development of Convolutional Neural Networks (CNNs). However, much of the improvement comes with larger network sizes and heavier computation overhead, which, in our view, is not mobile-friendly and is thus difficult to deploy in practice. To promote more practical SOD systems, we introduce a novel Stereoscopically Attentive Multi-scale (SAM) module, which adopts a stereoscopic attention mechanism to adaptively fuse features of various scales. Building on this module, we propose an extremely lightweight network, namely SAMNet, for SOD. Extensive experiments on popular benchmarks demonstrate that the proposed SAMNet yields accuracy comparable to state-of-the-art methods while running at a GPU speed of 343 FPS and a CPU speed of 5 FPS for 336 × 336 inputs with only 1.33M parameters. Therefore, SAMNet paves a new path towards practical SOD. The source code is available on the project page https://mmcheng.net/SAMNet/.
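The abstract does not detail the SAM module's internals, so the following is only a generic illustration of the idea it names: adaptively fusing features of different scales with learned (here, pooled-descriptor-based) attention weights. The function names, the two-scale setup, and the softmax-over-scales weighting are all illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_two_scales(fine, coarse):
    """Illustrative attention-based fusion of two feature scales.

    fine:   (C, H, W) feature map at full resolution
    coarse: (C, H//2, W//2) feature map at half resolution
    Returns a (C, H, W) fused map. This is NOT the SAM module itself,
    just a sketch of scale-adaptive weighting.
    """
    # Nearest-neighbour upsample the coarse map to the fine resolution.
    up = coarse.repeat(2, axis=1).repeat(2, axis=2)
    # Global average pooling: one descriptor per channel per scale.
    desc = np.stack([fine.mean(axis=(1, 2)), up.mean(axis=(1, 2))])  # (2, C)
    # Softmax over the scale axis gives per-channel weights summing to 1.
    w = softmax(desc, axis=0)  # (2, C)
    # Weighted sum of the two scales, broadcasting weights over H and W.
    return w[0][:, None, None] * fine + w[1][:, None, None] * up
```

Because the per-channel weights sum to one across scales, the fusion is a convex combination: channels whose coarse-scale response dominates lean toward the upsampled features, and vice versa.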