Redesigning Multimodal Interaction: Adaptive Signal Processing and Cross-Modal Interaction for Hands-Free Computer Interaction.

Authors

Quan Bui Hong, Anh Nguyen Dinh Tuan, Phi Hoang Van, Thanh Bui Trung

Affiliations

Faculty of Information Technology, VNU-University of Engineering and Technology (VNU-UET), Hanoi 10000, Vietnam.

Faculty of Mechanical Engineering, Hung Yen University of Technology and Education, Hungyen 16000, Vietnam.

Publication information

Sensors (Basel). 2025 Sep 2;25(17):5411. doi: 10.3390/s25175411.

DOI:10.3390/s25175411
PMID:40942843
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12431494/
Abstract

Hands-free computer interaction is a key topic in assistive technology, with camera-based and voice-based systems being the most common methods. Recent camera-based solutions leverage facial expressions or head movements to simulate mouse clicks or key presses, while voice-based systems enable control via speech commands, wake-word detection, and vocal gestures. However, existing systems often suffer from limitations in responsiveness and accuracy, especially under real-world conditions. In this paper, we present 3-Modal Human-Computer Interaction (3M-HCI), a novel interaction system that dynamically integrates facial, vocal, and eye-based inputs through a new signal processing pipeline and a cross-modal coordination mechanism. This approach not only enhances recognition accuracy but also reduces interaction latency. Experimental results demonstrate that 3M-HCI outperforms several recent hands-free interaction solutions in both speed and precision, highlighting its potential as a robust assistive interface.
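The abstract does not describe the internals of the 3M-HCI signal processing pipeline or its cross-modal coordination mechanism, so the following is only a generic illustration of the two ideas it names, not the authors' method. It assumes a simple exponential moving average for damping jitter in a tracked signal and a weighted late-fusion rule for combining candidate commands from several modalities. All names here (`ModalityEvent`, `smooth`, `fuse`, the modality labels, the weights, and the threshold) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModalityEvent:
    """One candidate command reported by a single input modality."""
    modality: str      # hypothetical labels: "face", "voice", "eye"
    command: str       # hypothetical command name, e.g. "left_click"
    confidence: float  # recognizer score in [0, 1]


def smooth(samples, alpha=0.3):
    """Exponential moving average: a generic way to damp jitter in a signal.

    This is an assumed stand-in, not the filter used in the paper.
    """
    smoothed = []
    previous = None
    for x in samples:
        previous = x if previous is None else alpha * x + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


def fuse(events, weights, threshold=0.5):
    """Return the command with the highest weighted score, or None if too weak.

    Each event contributes weight[modality] * confidence to its command.
    The weights and threshold are illustrative assumptions.
    """
    scores = {}
    for event in events:
        weight = weights.get(event.modality, 0.0)
        scores[event.command] = scores.get(event.command, 0.0) + weight * event.confidence
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


if __name__ == "__main__":
    weights = {"face": 0.5, "eye": 0.25, "voice": 0.25}
    events = [
        ModalityEvent("face", "left_click", 0.9),
        ModalityEvent("eye", "left_click", 0.6),
        ModalityEvent("voice", "scroll", 0.4),
    ]
    print(smooth([0.0, 1.0, 1.0], alpha=0.5))
    print(fuse(events, weights))
```

In this sketch, two modalities agreeing on the same command reinforce each other, while a single low-confidence detection falls below the threshold and is ignored. A real system would also need timing windows and per-user calibration, which the abstract does not detail.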

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bad5/12431494/377878ab67ad/sensors-25-05411-g001.jpg
