

Multimodal perception-driven decision-making for human-robot interaction: a survey.

Author information

Zhao Wenzheng, Gangaraju Kruthika, Yuan Fengpei

Affiliations

Department of Robotics Engineering, Worcester Polytechnic Institute, Worcester, MA, United States.

Publication information

Front Robot AI. 2025 Aug 22;12:1604472. doi: 10.3389/frobt.2025.1604472. eCollection 2025.

Abstract

Multimodal perception is essential for enabling robots to understand and interact with complex environments and human users by integrating diverse sensory data, such as vision, language, and tactile information. This capability plays a crucial role in decision-making in dynamic, complex environments. This survey provides a comprehensive review of advancements in multimodal perception and its integration with decision-making in robotics from 2004 to 2024. We systematically summarize existing multimodal perception-driven decision-making (MPDDM) frameworks, highlighting their advantages in dynamic environments and the methodologies employed in human-robot interaction (HRI). Beyond reviewing these frameworks, we analyze key challenges in multimodal perception and decision-making, focusing on technical integration and sensor noise, adaptation, domain generalization, and safety and robustness. Finally, we outline future research directions, emphasizing the need for adaptive multimodal fusion techniques, more efficient learning paradigms, and human-trusted decision-making frameworks to advance the HRI field.
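For readers new to the topic, the sketch below illustrates one common MPDDM pattern found in this literature: late fusion, where vision, language, and tactile inputs are embedded separately, concatenated into a single state representation, and passed to a policy head that scores candidate actions. The modality dimensions, the random stand-in "encoders", and the action set are illustrative assumptions for this sketch, not details taken from the survey.

```python
# Minimal, illustrative late-fusion MPDDM sketch (assumptions, not the
# survey's method): embed each modality, concatenate the embeddings,
# and map the fused state to a discrete action distribution.
import numpy as np

rng = np.random.default_rng(0)

def encode(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Project a raw modality vector into a shared embedding space (one tanh layer)."""
    return np.tanh(x @ w)

# Hypothetical raw inputs for one time step.
vision = rng.normal(size=128)    # e.g. pooled image features
language = rng.normal(size=64)   # e.g. embedding of a spoken user command
tactile = rng.normal(size=16)    # e.g. fingertip pressure readings

# Per-modality projection weights (random stand-ins for trained encoders).
w_v = rng.normal(scale=0.1, size=(128, 32))
w_l = rng.normal(scale=0.1, size=(64, 32))
w_t = rng.normal(scale=0.1, size=(16, 32))

# Late fusion: concatenate modality embeddings into one state representation.
fused = np.concatenate(
    [encode(vision, w_v), encode(language, w_l), encode(tactile, w_t)]
)

# Decision-making: a linear policy head scores a hypothetical HRI action set.
actions = ["approach", "grasp", "hand_over", "wait"]
w_policy = rng.normal(scale=0.1, size=(fused.size, len(actions)))
logits = fused @ w_policy
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(dict(zip(actions, probs.round(3))))
```

In practice the random projections above would be replaced by trained encoders (e.g., a vision backbone, a language model, a tactile network), and the policy head by a learned decision module; the sketch only shows how the fusion step ties the modalities to a single decision.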


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22c7/12411148/4bfe8d15100f/frobt-12-1604472-g001.jpg
