IEEE Trans Cybern. 2022 Jul;52(7):6131-6142. doi: 10.1109/TCYB.2021.3051350. Epub 2022 Jul 4.
Recent progress in salient object detection has focused mainly on how to effectively integrate multiscale convolutional features in convolutional neural networks (CNNs). Many popular methods impose deep supervision to produce side-output predictions, which are then linearly aggregated into the final saliency prediction. In this article, we demonstrate theoretically and experimentally that linear aggregation of side-output predictions is suboptimal and makes only limited use of the side-output information obtained by deep supervision. To solve this problem, we propose deeply supervised nonlinear aggregation (DNA), which better leverages the complementary information of the various side-outputs. Compared with existing methods, it 1) aggregates side-output features rather than predictions and 2) adopts nonlinear instead of linear transformations. Experiments demonstrate that DNA can break through the bottleneck of current linear approaches. Specifically, the proposed saliency detector, a modified U-Net architecture with DNA, performs favorably against state-of-the-art methods on various datasets and evaluation metrics without bells and whistles.
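The contrast drawn in the abstract, linear aggregation of side-output predictions versus nonlinear aggregation of side-output features, can be illustrated with a minimal NumPy sketch. This is not the authors' DNA module, whose layer design is not given in the abstract. The function names, the per-pixel two-layer form (equivalent to two 1x1 convolutions with a ReLU between them), and all shapes are assumptions chosen only for illustration.

```python
import numpy as np


def sigmoid(x):
    """Map logits to saliency probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def linear_aggregation(side_logits, weights, bias=0.0):
    """Fuse K side-output prediction maps with a per-pixel weighted sum.

    side_logits: (K, H, W) array, one single-channel logit map per side-output.
    weights:     (K,) array; this is what a 1x1 convolution over K channels does.
    """
    fused = np.tensordot(weights, side_logits, axes=1) + bias  # shape (H, W)
    return sigmoid(fused)


def nonlinear_aggregation(side_features, w1, b1, w2, b2):
    """Fuse side-output feature maps with a small per-pixel nonlinear mapping.

    Hypothetical stand-in for a nonlinear aggregation module (assumption):
    concatenate features, apply a 1x1 conv, a ReLU, then a second 1x1 conv.

    side_features: list of (C_k, H, W) arrays already resized to a common H x W.
    w1: (hidden, sum of C_k), b1: (hidden,), w2: (hidden,), b2: scalar.
    """
    stacked = np.concatenate(side_features, axis=0)            # (C, H, W)
    hidden = np.einsum("hc,cxy->hxy", w1, stacked) + b1[:, None, None]
    hidden = np.maximum(hidden, 0.0)                           # ReLU nonlinearity
    fused = np.einsum("h,hxy->xy", w2, hidden) + b2            # (H, W)
    return sigmoid(fused)
```

In the linear case each side-output has already been collapsed to a single channel before fusion, so the fused map can only be a weighted sum of those maps. In the sketched nonlinear case the multi-channel features are still available at fusion time, which is the distinction the abstract highlights.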