Object detectors involving a NAS-gate convolutional module and capsule attention module.

Affiliation

Division of Mechanical and Biomedical Engineering, Graduate Program in System Health Science and Engineering, Ewha Womans University, Seoul, 03760, Republic of Korea.

Publication information

Sci Rep. 2022 Mar 10;12(1):3916. doi: 10.1038/s41598-022-07898-7.

Abstract

Several state-of-the-art object detectors have achieved outstanding performance by optimizing feature representation through modifications to the backbone architecture and exploitation of a feature pyramid. To assess the effectiveness of this approach, we explore modifying the backbone and feature pyramid of object detectors using Neural Architecture Search (NAS) and a capsule network. We introduce two modules: the NAS-gate convolutional module and the Capsule Attention module. The NAS-gate convolutional module optimizes the standard convolution in a backbone network using differentiable architecture search combined with multiple convolution conditions, in order to overcome the object scale variation problem. The Capsule Attention module exploits the capsule network's strong ability to encode spatial relationships to generate a spatial attention mask that emphasizes important features and suppresses unnecessary ones in the feature pyramid, thereby optimizing the feature representation and localization capability of the detectors. Experimental results indicate that the NAS-gate convolutional module can alleviate the object scale variation problem and that the Capsule Attention module can help avoid inaccurate localization. We then introduce NASGC-CapANet, which incorporates both modules. Comparisons against state-of-the-art object detectors on the MS COCO val-2017 dataset show that the NASGC-CapANet-based Faster R-CNN significantly outperforms the baseline Faster R-CNN with ResNet-50 and ResNet-101 backbones, by mAPs of 2.7% and 2.0%, respectively. Furthermore, the NASGC-CapANet-based Cascade R-CNN achieves a box mAP of 43.8% on the MS COCO test-dev dataset.
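
The two ideas named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that the "multiple convolution conditions" are candidate convolutions with different dilation rates, that the search uses a DARTS-style softmax over architecture parameters, and that the attention mask is a sigmoid of per-position scores. The capsule network that would produce those scores is not modeled here. All function names are hypothetical, and 1D signals stand in for feature maps.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D cross-correlation with an odd-length kernel."""
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k * dilation - half  # input index touched by kernel tap k
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def mixed_conv(signal, kernel, dilations, alphas):
    """Continuous relaxation: softmax(alphas)-weighted sum of candidate convolutions.

    Assumption: the candidates differ only in dilation rate.
    """
    weights = softmax(alphas)
    outputs = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [
        sum(w * o[i] for w, o in zip(weights, outputs))
        for i in range(len(signal))
    ]


def apply_attention_mask(features, scores):
    """Scale each position by sigmoid(score): high scores keep, low scores suppress.

    Assumption: `scores` stands in for the output of a capsule network.
    """
    mask = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    return [f * m for f, m in zip(features, mask)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = mixed_conv(x, [0.25, 0.5, 0.25], dilations=[1, 2], alphas=[0.0, 0.0])
    print(apply_attention_mask(y, scores=[-4.0, 0.0, 4.0, 0.0, -4.0]))
```

In a differentiable search the `alphas` would be learned by gradient descent together with the kernel weights, and the candidate with the largest weight would typically be retained afterwards. Here they are fixed inputs so that the example stays self-contained.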

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c59a/8913793/ab110f088b6e/41598_2022_7898_Fig1_HTML.jpg
