
Temporally Identity-Aware SSD With Attentional LSTM.

Publication information

IEEE Trans Cybern. 2020 Jun;50(6):2674-2686. doi: 10.1109/TCYB.2019.2894261. Epub 2019 Feb 11.

DOI: 10.1109/TCYB.2019.2894261
PMID: 30762576

Abstract

Temporal object detection has attracted significant attention, but most popular detection methods cannot leverage the rich temporal information in videos. Many algorithms have recently been developed for the video detection task, yet very few approaches achieve real-time online object detection in videos. In this paper, based on the attention mechanism and convolutional long short-term memory (ConvLSTM), we propose a temporal single-shot detector (TSSD) for real-world detection. Unlike previous methods, we aim to temporally integrate the pyramidal feature hierarchy using ConvLSTM, and design a novel structure comprising a low-level temporal unit and a high-level one for multiscale feature maps. Moreover, we develop a temporal analysis unit, namely attentional ConvLSTM, in which a temporal attention mechanism is tailored for background suppression and scale suppression, while a ConvLSTM integrates attention-aware features across time. An association loss and a multistep training procedure are designed for temporal coherence. In addition, online tubelet analysis (OTA) is used for identification. Our framework is evaluated on the ImageNet VID and 2DMOT15 datasets. Extensive comparisons of detection and tracking capability validate the superiority of the proposed approach. The resulting TSSD-OTA achieves fast speed and overall competitive performance in detection and tracking. Finally, a real-world underwater object grasping maneuver is conducted.
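The abstract mentions linking detections over time into tubelets so that each object keeps an identity. As a rough illustration of that general idea only, the sketch below links per-frame boxes online with a greedy IoU match. This is an assumption-laden simplification, not the paper's OTA: the names `iou`, `Tubelet`, `OnlineLinker`, and the `iou_threshold` parameter are hypothetical, and the abstract does not say which cues the actual method uses for association.

```python
from dataclasses import dataclass, field


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tubelet:
    """A sequence of boxes that share one identity (hypothetical structure)."""
    identity: int
    boxes: list = field(default_factory=list)


class OnlineLinker:
    """Greedy frame-by-frame linker; an illustrative stand-in, not the paper's OTA."""

    def __init__(self, iou_threshold=0.5):
        self.iou_threshold = iou_threshold  # assumed value, not from the paper
        self.tubelets = []
        self._next_id = 0

    def update(self, detections):
        """Assign an identity to each box of the current frame and return the ids."""
        ids = []
        used = set()  # identities already taken in this frame
        for box in detections:
            best, best_iou = None, self.iou_threshold
            for tubelet in self.tubelets:
                if tubelet.identity in used:
                    continue
                overlap = iou(tubelet.boxes[-1], box)
                if overlap >= best_iou:
                    best, best_iou = tubelet, overlap
            if best is None:
                # No existing tubelet overlaps enough: start a new identity.
                best = Tubelet(self._next_id)
                self._next_id += 1
                self.tubelets.append(best)
            best.boxes.append(box)
            used.add(best.identity)
            ids.append(best.identity)
        return ids
```

Calling `update` once per frame with that frame's boxes returns one identity per box; a box that overlaps the last box of an existing tubelet by at least the threshold inherits that tubelet's identity, and any other box starts a new tubelet. The sketch processes frames strictly in order and never looks ahead, which is what makes it online.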


Similar articles

1
Temporally Identity-Aware SSD With Attentional LSTM.
IEEE Trans Cybern. 2020 Jun;50(6):2674-2686. doi: 10.1109/TCYB.2019.2894261. Epub 2019 Feb 11.
2
Deep Temporal Model-Based Identity-Aware Hand Detection for Space Human-Robot Interaction.
IEEE Trans Cybern. 2022 Dec;52(12):13738-13751. doi: 10.1109/TCYB.2021.3114031. Epub 2022 Nov 18.
3
Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos.
Int J Comput Assist Radiol Surg. 2019 Jun;14(6):1059-1067. doi: 10.1007/s11548-019-01958-6. Epub 2019 Apr 9.
4
Deep Spatial-Temporal Joint Feature Representation for Video Object Detection.
Sensors (Basel). 2018 Mar 4;18(3):774. doi: 10.3390/s18030774.
5
Object Detection in Videos by High Quality Object Linking.
IEEE Trans Pattern Anal Mach Intell. 2020 May;42(5):1272-1278. doi: 10.1109/TPAMI.2019.2910529. Epub 2019 Apr 11.
6
Attention-Guided Disentangled Feature Aggregation for Video Object Detection.
Sensors (Basel). 2022 Nov 7;22(21):8583. doi: 10.3390/s22218583.
7
Redundancy and Attention in Convolutional LSTM for Gesture Recognition.
IEEE Trans Neural Netw Learn Syst. 2019 Jun 28. doi: 10.1109/TNNLS.2019.2919764.
8
SV-RCNet: Workflow Recognition From Surgical Videos Using Recurrent Convolutional Network.
IEEE Trans Med Imaging. 2018 May;37(5):1114-1126. doi: 10.1109/TMI.2017.2787657.
9
Spherical DNNs and Their Applications in 360 Images and Videos.
IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):7235-7252. doi: 10.1109/TPAMI.2021.3100259. Epub 2022 Sep 14.
10
MINet: Meta-Learning Instance Identifiers for Video Object Detection.
IEEE Trans Image Process. 2021;30:6879-6891. doi: 10.1109/TIP.2021.3099409. Epub 2021 Aug 4.

Cited by

1
Backlight and dim space object detection based on a novel event camera.
PeerJ Comput Sci. 2024 Jul 12;10:e2192. doi: 10.7717/peerj-cs.2192. eCollection 2024.
2
Research Challenges, Recent Advances, and Popular Datasets in Deep Learning-Based Underwater Marine Object Detection: A Review.
Sensors (Basel). 2023 Feb 10;23(4):1990. doi: 10.3390/s23041990.