Pan Jing, Sun Hanqing, Song Zhanjie, Han Jungong
School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China.
School of Mathematics, Tianjin University, Tianjin 300072, China.
Sensors (Basel). 2019 Jul 14;19(14):3111. doi: 10.3390/s19143111.
Downsampling input images is a simple trick to speed up visual object-detection algorithms, especially in robotic vision and applied mobile vision systems. However, this trick comes with a significant decline in accuracy. In this paper, dual-resolution dual-path Convolutional Neural Networks (CNNs), named DualNets, are proposed to improve the accuracy of such detection applications. In contrast to previous methods that simply downsample the input images, DualNets explicitly take two inputs at different resolutions and extract complementary visual features from them using two CNN paths. The two paths in a DualNet are a backbone path and an auxiliary path; the auxiliary path accepts the larger input and rapidly downsamples it to relatively small feature maps. With the carefully designed auxiliary CNN paths in DualNets, auxiliary features are extracted from the larger input at a controllable computational cost. The auxiliary features are then fused with the backbone features using a proposed progressive residual fusion strategy to enrich the feature representation. This architecture, as the feature extractor, is further integrated with the Single Shot Detector (SSD) to accomplish latency-sensitive visual object-detection tasks. We evaluate the resulting detection pipeline on the Pascal VOC and MS COCO benchmarks. Results show that the proposed DualNets can raise the accuracy of CNN detection applications that are sensitive to computational load.
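The dual-path idea described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract gives no layer details, so average pooling stands in for strided convolutions, the fusion is assumed to be an element-wise addition applied at every stage, and all names (`avg_pool`, `residual_fuse`, `dual_path_features`, `ratio`, `strides`) are hypothetical. The sketch only shows how an auxiliary path that starts from a larger input can be downsampled quickly so that its feature maps line up with the backbone's and can be added to them stage by stage.

```python
def avg_pool(fmap, stride):
    """Downsample a 2D map by averaging non-overlapping stride x stride blocks.

    Stand-in (assumption) for a strided convolution in a real CNN.
    """
    h, w = len(fmap), len(fmap[0])
    return [
        [
            sum(fmap[i + di][j + dj]
                for di in range(stride)
                for dj in range(stride)) / (stride * stride)
            for j in range(0, w - stride + 1, stride)
        ]
        for i in range(0, h - stride + 1, stride)
    ]


def residual_fuse(backbone, auxiliary):
    """Add auxiliary features to backbone features element-wise (assumed fusion)."""
    if len(backbone) != len(auxiliary) or len(backbone[0]) != len(auxiliary[0]):
        raise ValueError("feature maps must have the same spatial size")
    return [[b + a for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(backbone, auxiliary)]


def dual_path_features(image_hi, ratio=2, strides=(2, 2)):
    """Return the fused feature map produced at each stage.

    image_hi: high-resolution input (list of rows of floats).
    ratio:    how much smaller the backbone input is than image_hi (assumption).
    strides:  per-stage downsampling factor of the backbone (assumption).
    """
    backbone = avg_pool(image_hi, ratio)  # low-resolution backbone input
    auxiliary = image_hi                  # auxiliary path sees the larger input
    fused_maps = []
    for stage, stride in enumerate(strides):
        backbone = avg_pool(backbone, stride)
        # The first auxiliary stage downsamples aggressively so that its output
        # matches the backbone's spatial size; later stages follow the backbone.
        aux_stride = ratio * stride if stage == 0 else stride
        auxiliary = avg_pool(auxiliary, aux_stride)
        # Progressive fusion: the fused map feeds the next backbone stage.
        backbone = residual_fuse(backbone, auxiliary)
        fused_maps.append(backbone)
    return fused_maps


if __name__ == "__main__":
    image = [[1.0] * 16 for _ in range(16)]
    for fmap in dual_path_features(image):
        print(len(fmap), "x", len(fmap[0]))
```

In a real detector the fused maps would be passed to the SSD detection heads; here they are only returned so their shapes can be inspected.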