The human voice stands out for its rich information transmission capabilities. However, voice communication is susceptible to interference from noisy environments and obstacles. Here, we propose a wearable, wireless, flexible skin-attached acoustic sensor (SAAS) that captures the vibrations of the vocal organs and skin movements, thereby enabling voice recognition and human-machine interaction (HMI) in harsh acoustic environments. The system uses a piezoelectric micromachined ultrasonic transducer (PMUT), which features high sensitivity (-198 dB), wide bandwidth (10 Hz-20 kHz), and excellent flatness (±0.5 dB). Flexible packaging enhances comfort and adaptability during wear, while integration with the Residual Network (ResNet) architecture significantly improves the classification of laryngeal speech features, achieving an accuracy exceeding 96%. We also demonstrate SAAS's data collection and intelligent classification capabilities in multiple HMI scenarios. Finally, using a deep learning model, the speech recognition system recognizes everyday sentences spoken by participants with an accuracy of 99.8%. With advantages including a simple fabrication process, stable performance, easy integration, and low cost, SAAS presents a compelling solution for applications in voice control, HMI, and wearable electronics.

Machine learning-assisted wearable sensing systems for speech recognition and interaction.

Author information

Liu Tao, Zhang Mingyang, Li Zhihao, Dou Hanjie, Zhang Wangyang, Yang Jiaqian, Wu Pengfan, Li Dongxiao, Mu Xiaojing

Affiliations

Key Laboratory of Optoelectronic Technology & Systems of Ministry of Education, International R & D Center of Micro-nano Systems and New Materials Technology, Chongqing University, Chongqing, 400044, China.

Publication information

Nat Commun. 2025 Mar 10;16(1):2363. doi: 10.1038/s41467-025-57629-5.
