Suppr 超能文献


Object Detection Based on Swin Deformable Transformer-BiPAFPN-YOLOX.

Affiliation

School of Mechanical Engineering, Anhui Polytechnic University, Wuhu 241000, China.

Publication

Comput Intell Neurosci. 2023 Mar 9;2023:4228610. doi: 10.1155/2023/4228610. eCollection 2023.

DOI: 10.1155/2023/4228610
PMID: 36936669
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10019960/
Abstract

Object detection technology plays a crucial role in people's everyday lives, as well as in enterprise production and modern national defense. Most current object detection networks, such as YOLOX, employ convolutional neural networks rather than a Transformer as the backbone. However, these techniques lack a global understanding of the images and may lose meaningful information, such as the precise location of the most active feature detector. Recently, Transformers with larger receptive fields have shown superior performance to corresponding convolutional neural networks in computer vision tasks. The Transformer splits the image into patches and then feeds them to the Transformer in a sequence structure similar to word embeddings, enabling global modeling and a global understanding of entire images. However, simply using a Transformer with a larger receptive field raises several concerns. For example, the self-attention in the Swin Transformer backbone limits its ability to model long-range relations, resulting in poor feature extraction and slow convergence during training. To address these problems, we first propose an important-region-based Reconstructed Deformable Self-Attention that shifts attention to important regions for efficient global modeling. Second, based on the Reconstructed Deformable Self-Attention, we propose the Swin Deformable Transformer backbone, which improves feature extraction ability and convergence speed. Finally, based on the Swin Deformable Transformer backbone, we propose a novel object detection network, namely, Swin Deformable Transformer-BiPAFPN-YOLOX. Experimental results on the COCO dataset show that the training period is reduced by 55.4%, average precision is increased by 2.4%, average precision on small objects is increased by 3.7%, and inference speed is increased by 35%.
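The patch-splitting step described above (an image cut into patches that are fed to the Transformer as a token sequence, analogous to word embeddings) can be sketched in a few lines. The `patchify` helper below is an illustrative assumption, not code from the paper:

```python
import numpy as np

def patchify(image, patch_size):
    """Split an HxWxC image into non-overlapping, flattened patches.

    Returns an array of shape (num_patches, patch_size*patch_size*C),
    i.e. one token per patch, ready for a linear embedding layer.
    """
    H, W, C = image.shape
    P = patch_size
    assert H % P == 0 and W % P == 0, "image size must be divisible by patch size"
    # Carve the grid: (H//P, P, W//P, P, C) -> (H//P, W//P, P, P, C)
    patches = image.reshape(H // P, P, W // P, P, C)
    patches = patches.transpose(0, 2, 1, 3, 4)
    # Flatten each PxPxC patch into a single token vector.
    return patches.reshape(-1, P * P * C)

# A 224x224 RGB image with 4x4 patches yields (224/4)*(224/4) = 3136
# tokens, each of dimension 4*4*3 = 48.
img = np.zeros((224, 224, 3), dtype=np.float32)
tokens = patchify(img, 4)
print(tokens.shape)  # (3136, 48)
```

The same reshape-transpose-reshape pattern underlies patch embedding in ViT-style backbones; Swin additionally merges patches between stages, which is omitted here.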

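The paper's Reconstructed Deformable Self-Attention is not reproduced here, but the general deformable-attention idea it builds on (each query predicts sampling offsets and aggregates interpolated values from those locations, rather than attending to all positions) can be illustrated in one dimension. All names, weight shapes, and the single-head simplification below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def deformable_attention_1d(queries, values, W_off, W_attn, n_points=4):
    """Single-head, 1-D deformable attention sketch.

    Each query predicts n_points fractional offsets around its own
    position, samples values there by linear interpolation, and mixes
    them with learned attention weights -- so cost is O(L * n_points)
    instead of O(L^2) for full self-attention.
    """
    L, D = queries.shape
    ref = np.arange(L, dtype=np.float64)       # reference positions
    offsets = queries @ W_off                  # (L, n_points) predicted offsets
    attn = softmax(queries @ W_attn)           # (L, n_points) mixing weights
    pos = np.clip(ref[:, None] + offsets, 0, L - 1)  # sampling locations
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, L - 1)
    frac = pos - lo
    # Linear interpolation between the two nearest value vectors.
    sampled = (1 - frac)[..., None] * values[lo] + frac[..., None] * values[hi]
    return (attn[..., None] * sampled).sum(axis=1)  # (L, D)

L, D, P = 6, 8, 4
q = rng.normal(size=(L, D))
v = rng.normal(size=(L, D))
out = deformable_attention_1d(q, v, rng.normal(size=(D, P)), rng.normal(size=(D, P)), P)
print(out.shape)  # (6, 8)
```

In 2-D image attention the interpolation becomes bilinear over a feature map, but the mechanism — attending only at a few predicted "important" locations — is the same one the abstract credits for the improved convergence speed.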

Figures (PMC, duplicates removed)

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/60d2d7d9b19e/CIN2023-4228610.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/e20d3e1d3af6/CIN2023-4228610.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/d17c6bcbb88d/CIN2023-4228610.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/d1cfea149284/CIN2023-4228610.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/a24effa9e81b/CIN2023-4228610.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/dfdfe294eb98/CIN2023-4228610.006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/cc5c18a2610b/CIN2023-4228610.007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/604565c5f7b2/CIN2023-4228610.008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/1a3c6951c17d/CIN2023-4228610.009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/5d4cafcc08a8/CIN2023-4228610.010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/e2b5f5ba367d/CIN2023-4228610.011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/a77b0e15c46f/CIN2023-4228610.012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/366110275218/CIN2023-4228610.013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53a3/10019960/afc3646e5cbe/CIN2023-4228610.014.jpg

Similar Articles

1. Object Detection Based on Swin Deformable Transformer-BiPAFPN-YOLOX.
Comput Intell Neurosci. 2023 Mar 9;2023:4228610. doi: 10.1155/2023/4228610. eCollection 2023.
2. Small object detection algorithm incorporating swin transformer for tea buds.
PLoS One. 2024 Mar 21;19(3):e0299902. doi: 10.1371/journal.pone.0299902. eCollection 2024.
3. A Swin Transformer-Based Model for Thyroid Nodule Detection in Ultrasound Images.
J Vis Exp. 2023 Apr 21;(194). doi: 10.3791/64480.
4. FEA-Swin: Foreground Enhancement Attention Swin Transformer Network for Accurate UAV-Based Dense Object Detection.
Sensors (Basel). 2022 Sep 15;22(18):6993. doi: 10.3390/s22186993.
5. Transformer-Based Model with Dynamic Attention Pyramid Head for Semantic Segmentation of VHR Remote Sensing Imagery.
Entropy (Basel). 2022 Nov 6;24(11):1619. doi: 10.3390/e24111619.
6. Swin-HSTPS: Research on Target Detection Algorithms for Multi-Source High-Resolution Remote Sensing Images.
Sensors (Basel). 2021 Dec 4;21(23):8113. doi: 10.3390/s21238113.
7. Enhancing medical image segmentation with a multi-transformer U-Net.
PeerJ. 2024 Feb 29;12:e17005. doi: 10.7717/peerj.17005. eCollection 2024.
8. Multiple Attention Mechanism Enhanced YOLOX for Remote Sensing Object Detection.
Sensors (Basel). 2023 Jan 22;23(3):1261. doi: 10.3390/s23031261.
9. Swin-Transformer-Based YOLOv5 for Small-Object Detection in Remote Sensing Images.
Sensors (Basel). 2023 Mar 31;23(7):3634. doi: 10.3390/s23073634.
10. Face-based age estimation using improved Swin Transformer with attention-based convolution.
Front Neurosci. 2023 Apr 12;17:1136934. doi: 10.3389/fnins.2023.1136934. eCollection 2023.

Cited By

1. Exploring graph-based models for predicting active compounds against triple-negative breast cancer.
Mol Divers. 2025 Jul 9. doi: 10.1007/s11030-025-11283-7.
2. XAI-driven CatBoost multi-layer perceptron neural network for analyzing breast cancer.
Sci Rep. 2024 Nov 19;14(1):28674. doi: 10.1038/s41598-024-79620-8.