Multi-modal deep learning networks for RGB-D pavement waste detection and recognition.

Affiliation

School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China.

Publication information

Waste Manag. 2024 Apr 1;177:125-134. doi: 10.1016/j.wasman.2024.01.047. Epub 2024 Feb 6.

Abstract

To keep living environments clean, governments around the world employ large numbers of workers to remove waste from pavements, which is an inefficient approach to waste management. To alleviate this problem, researchers have proposed several deep learning methods that detect and recognize waste from RGB images. Considering the limitations of color images, we propose an efficient multi-modal learning solution for pavement waste detection and recognition. Specifically, we construct a high-quality outdoor pavement waste dataset, OPWaste, that is more in line with real-world needs. Compared with other waste datasets, the OPWaste dataset offers rich backgrounds and high diversity, and it provides both color and depth images. We also explore six multi-modal fusion methods and propose a novel multi-modal multi-scale network (MM-Net) for RGB-D waste detection and recognition. MM-Net introduces a novel multi-scale refinement module (MRM) and a multi-scale interaction module (MIM): MRM refines critical features using attention mechanisms, and MIM gradually realizes information interaction between hierarchical features. In comparative experiments against several representative methods, MM-Net with the image addition fusion method outperforms the other deep learning models, reaching 97.3% mAP and 84.4% AR. Multi-modal learning thus plays an important role in intelligent waste recycling. As a promising auxiliary tool, our solution can be applied to intelligent cleaning robots for automatic outdoor waste management.
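
The abstract names image addition as the best-performing fusion method but does not describe how it is carried out. As a rough illustration only, the sketch below shows one plausible reading: an aligned depth map is normalized and added element-wise to each color channel before the fused image is passed to a detector. The function name `fuse_by_addition`, the `depth_max` parameter, the [0, 1] scaling, and the broadcast of depth across all three channels are assumptions made for this example, not details taken from the paper.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]


def fuse_by_addition(
    rgb: List[List[Pixel]],
    depth: List[List[float]],
    depth_max: float,
) -> List[List[Pixel]]:
    """Fuse an RGB image and an aligned depth map by element-wise addition.

    Assumptions (not from the abstract): RGB values are already scaled to
    [0, 1], depth is divided by depth_max to reach the same range, and the
    single depth channel is added to each of the three color channels.
    """
    if depth_max <= 0:
        raise ValueError("depth_max must be positive")
    if len(rgb) != len(depth):
        raise ValueError("rgb and depth must have the same height")

    fused = []
    for rgb_row, depth_row in zip(rgb, depth):
        if len(rgb_row) != len(depth_row):
            raise ValueError("rgb and depth must have the same width")
        fused_row = []
        for (r, g, b), d in zip(rgb_row, depth_row):
            # Normalize depth and clamp it to [0, 1].
            d_norm = min(max(d / depth_max, 0.0), 1.0)
            fused_row.append((r + d_norm, g + d_norm, b + d_norm))
        fused.append(fused_row)
    return fused


# Usage: a 1x2 image with a hypothetical 4 m maximum sensor range.
rgb = [[(0.25, 0.5, 0.75), (0.0, 0.0, 0.0)]]
depth = [[2.0, 4.0]]
fused = fuse_by_addition(rgb, depth, depth_max=4.0)
```

Under these assumptions the fused image keeps three channels, so it has the same shape as an ordinary RGB input. Whether MM-Net fuses at exactly this point, or rescales the sum afterwards, is not stated in the abstract.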
