
Multi-modal remote perception learning for object sensory data.

Authors

Almujally Nouf Abdullah, Rafique Adnan Ahmed, Al Mudawi Naif, Alazeb Abdulwahab, Alonazi Mohammed, Algarni Asaad, Jalal Ahmad, Liu Hui

Affiliations

Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia.

Department of Computer Science and IT, University of Poonch Rawalakot, Rawalakot, Pakistan.

Publication

Front Neurorobot. 2024 Sep 19;18:1427786. doi: 10.3389/fnbot.2024.1427786. eCollection 2024.

DOI: 10.3389/fnbot.2024.1427786
PMID: 39377028
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11457376/
Abstract

INTRODUCTION

When interpreting visual input, intelligent systems use contextual scene learning, which significantly improves both robustness and context awareness. The need to manage enormous amounts of data drives growing interest in such computational frameworks, particularly for autonomous vehicles.

METHOD

The purpose of this study is to introduce a novel approach known as Deep Fused Networks (DFN), which improves contextual scene comprehension by merging multi-object detection and semantic analysis.
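The abstract does not specify DFN's internals beyond merging multi-object detection with semantic analysis through fusion. As a minimal sketch of the general idea only (feature-level fusion of RGB and depth streams, a common pattern for the SUN-RGB-D and NYU-Dv2 benchmarks the paper evaluates on), the following uses illustrative linear "backbones" and dimensions that are assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(x, w):
    # Stand-in for a per-modality deep backbone: a linear map + ReLU.
    return np.maximum(x @ w, 0.0)

# Hypothetical dimensions: 128-dim inputs, 64-dim per-modality features, 10 classes.
rgb = rng.normal(size=(1, 128))      # flattened RGB input (illustrative)
depth = rng.normal(size=(1, 128))    # flattened depth input (illustrative)
w_rgb = rng.normal(size=(128, 64))
w_depth = rng.normal(size=(128, 64))
w_cls = rng.normal(size=(128, 10))   # classifier over the fused 128-dim vector

f_rgb = extract_features(rgb, w_rgb)
f_depth = extract_features(depth, w_depth)

# Feature-level fusion: concatenate the two modality streams into one vector,
# so the classifier can exploit cues from both modalities jointly.
fused = np.concatenate([f_rgb, f_depth], axis=1)   # shape (1, 128)

logits = fused @ w_cls
pred = int(np.argmax(logits, axis=1)[0])
print(fused.shape, pred)
```

Concatenation is only one fusion choice; the paper's actual fusion operator, backbone networks, and how detection and semantic branches interact are described in the full text.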

RESULTS

To enhance accuracy and comprehension in complex scenes, DFN combines deep learning with fusion techniques, achieving a minimum accuracy gain of 6.4% on the SUN-RGB-D dataset and 3.6% on the NYU-Dv2 dataset.

DISCUSSION

Findings demonstrate considerable improvements in object detection and semantic analysis compared with existing methods.


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/536d/11457376/93ed062d1603/fnbot-18-1427786-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/536d/11457376/369ce3d42250/fnbot-18-1427786-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/536d/11457376/2c16baa70c98/fnbot-18-1427786-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/536d/11457376/ec0bf9682943/fnbot-18-1427786-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/536d/11457376/d10262e0dd7d/fnbot-18-1427786-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/536d/11457376/00c6ad7262b1/fnbot-18-1427786-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/536d/11457376/1b1bf636ea17/fnbot-18-1427786-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/536d/11457376/8962b50f32a4/fnbot-18-1427786-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/536d/11457376/3c2f0d092e1a/fnbot-18-1427786-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/536d/11457376/baf9ade0a7b5/fnbot-18-1427786-g010.jpg

Similar articles

1
Multi-modal remote perception learning for object sensory data.
Front Neurorobot. 2024 Sep 19;18:1427786. doi: 10.3389/fnbot.2024.1427786. eCollection 2024.
2
Remote intelligent perception system for multi-object detection.
Front Neurorobot. 2024 May 20;18:1398703. doi: 10.3389/fnbot.2024.1398703. eCollection 2024.
3
Cross-Modal Attentional Context Learning for RGB-D Object Detection.
IEEE Trans Image Process. 2019 Apr;28(4):1591-1601. doi: 10.1109/TIP.2018.2878956. Epub 2018 Oct 31.
4
Enhanced Perception for Autonomous Driving Using Semantic and Geometric Data Fusion.
Sensors (Basel). 2022 Jul 5;22(13):5061. doi: 10.3390/s22135061.
5
A Multi-Modal, Discriminative and Spatially Invariant CNN for RGB-D Object Labeling.
IEEE Trans Pattern Anal Mach Intell. 2018 Sep;40(9):2051-2065. doi: 10.1109/TPAMI.2017.2747134. Epub 2017 Aug 30.
6
SLMSF-Net: A Semantic Localization and Multi-Scale Fusion Network for RGB-D Salient Object Detection.
Sensors (Basel). 2024 Feb 8;24(4):1117. doi: 10.3390/s24041117.
7
Semantic segmentation of autonomous driving scenes based on multi-scale adaptive attention mechanism.
Front Neurosci. 2023 Oct 19;17:1291674. doi: 10.3389/fnins.2023.1291674. eCollection 2023.
8
Multi-modal deep learning networks for RGB-D pavement waste detection and recognition.
Waste Manag. 2024 Apr 1;177:125-134. doi: 10.1016/j.wasman.2024.01.047. Epub 2024 Feb 6.
9
RGB-D Object Recognition Using Multi-Modal Deep Neural Network and DS Evidence Theory.
Sensors (Basel). 2019 Jan 27;19(3):529. doi: 10.3390/s19030529.
10
Real-Time 3D Multi-Object Detection and Localization Based on Deep Learning for Road and Railway Smart Mobility.
J Imaging. 2021 Aug 12;7(8):145. doi: 10.3390/jimaging7080145.
