Hong Yuanduo, Pan Huihui, Jia Yisong, Sun Weichao, Gao Huijun
IEEE Trans Neural Netw Learn Syst. 2025 Mar;36(3):3904-3915. doi: 10.1109/TNNLS.2022.3169779. Epub 2025 Feb 28.
Deep feature fusion plays a significant role in the strong learning ability of convolutional neural networks (CNNs) for computer vision tasks. Recent works have repeatedly demonstrated the advantages of efficient aggregation strategies, some of which involve multiscale representations. In this article, we describe a novel network architecture for high-level computer vision tasks in which densely connected feature fusion provides multiscale representations for the residual network. We term our method ResDNet: a simple and efficient backbone composed of sequential ResDNet modules containing variants of dense blocks named sliding dense blocks (SDBs). Compared with DenseNet, ResDNet enhances feature fusion and reduces redundancy through shallower densely connected architectures. Experimental results on three classification benchmarks, CIFAR-10, CIFAR-100, and ImageNet, demonstrate the effectiveness of ResDNet. On CIFAR-100, ResDNet consistently outperforms DenseNet while using far less computation. On ImageNet, ResDNet-B-129 achieves 1.94% and 0.89% top-1 accuracy improvements over ResNet-50 and DenseNet-201, respectively, at similar complexity. In addition, ResDNet with more than 1000 layers achieves remarkable accuracy on CIFAR compared with other state-of-the-art results. Based on the MMdetection implementation of RetinaNet, ResDNet-B-129 improves mAP from 36.3 to 39.5 over ResNet-50 on the COCO dataset.
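To illustrate why a "shallower densely connected architecture" can reduce redundancy, the sketch below contrasts the input-channel growth of a standard dense block with a hypothetical sliding variant that concatenates only a fixed window of recent feature maps. The window size, function names, and the sliding-window interpretation of SDBs are assumptions for illustration; the paper's exact SDB design may differ.

```python
def dense_block_in_channels(num_layers, c0, growth):
    # Standard DenseNet block: layer i concatenates the block input plus
    # every previous layer's output, so its input width grows linearly.
    return [c0 + i * growth for i in range(num_layers)]

def sliding_dense_block_in_channels(num_layers, c0, growth, window):
    # Hypothetical sliding dense block (assumption): each layer
    # concatenates only the `window` most recent feature maps,
    # which bounds the input width regardless of block depth.
    sizes = [c0]          # channel counts of tensors available so far
    channels = []
    for _ in range(num_layers):
        channels.append(sum(sizes[-window:]))  # keep only the recent window
        sizes.append(growth)                   # this layer adds `growth` channels
    return channels
```

With `c0=64`, `growth=32`, and six layers, the standard block's input width climbs from 64 to 224 channels, while a window of 3 caps it at 128; that bounded width is one plausible way a shallower dense connectivity pattern trades redundancy for efficiency.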