
Surface Electromyography-Based Recognition, Synthesis, and Perception of Prosodic Subvocal Speech.

Author information

Vojtech Jennifer M, Chan Michael D, Shiwani Bhawna, Roy Serge H, Heaton James T, Meltzner Geoffrey S, Contessa Paola, De Luca Gianluca, Patel Rupal, Kline Joshua C

Affiliations

Delsys/Altec, Inc., Natick, MA.

Boston University, MA.

Publication information

J Speech Lang Hear Res. 2021 Jun 18;64(6S):2134-2153. doi: 10.1044/2021_JSLHR-20-00257. Epub 2021 May 12.

DOI:10.1044/2021_JSLHR-20-00257
PMID:33979177
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8740708/
Abstract

Purpose: This study aimed to evaluate a novel communication system designed to translate surface electromyographic (sEMG) signals from articulatory muscles into speech using a personalized, digital voice. The system was evaluated for word recognition, prosodic classification, and listener perception of synthesized speech.

Method: sEMG signals were recorded from the face and neck as speakers with (n = 4) and without (n = 4) laryngectomy subvocally recited (silently mouthed) a speech corpus comprising 750 phrases (150 phrases with variable phrase-level stress). Corpus tokens were then translated into speech via personalized voice synthesis (n = 8 synthetic voices) and compared against phrases produced by each speaker when using their typical mode of communication (n = 4 natural voices, n = 4 electrolaryngeal [EL] voices). Naïve listeners (n = 12) evaluated synthetic, natural, and EL speech for acceptability and intelligibility in a visual sort-and-rate task, as well as phrasal stress discriminability via a classification mechanism.

Results: Recorded sEMG signals were processed to translate sEMG muscle activity into lexical content and categorize variations in phrase-level stress, achieving a mean accuracy of 96.3% (SD = 3.10%) and 91.2% (SD = 4.46%), respectively. Synthetic speech was significantly higher in acceptability and intelligibility than EL speech, also leading to greater phrasal stress classification accuracy, whereas natural speech was rated as the most acceptable and intelligible, with the greatest phrasal stress classification accuracy.

Conclusion: This proof-of-concept study establishes the feasibility of using subvocal sEMG-based alternative communication not only for lexical recognition but also for prosodic communication in healthy individuals, as well as those living with vocal impairments and residual articulatory function.

Supplemental Material: https://doi.org/10.23641/asha.14558481
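The abstract reports recognition accuracy as a mean with a standard deviation. As a minimal sketch of how such a summary could be computed across speakers, the snippet below averages one accuracy value per speaker and takes the sample standard deviation. The function name `summarize_accuracy` and the eight input values are hypothetical placeholders, not data from the study, and the abstract does not say whether the authors used the sample or the population form of the standard deviation.

```python
from statistics import mean, stdev


def summarize_accuracy(per_speaker_accuracy):
    """Return (mean, sample SD) of per-speaker accuracies, in percent.

    Assumption: one accuracy value per speaker and the sample (n - 1)
    standard deviation. The source abstract does not specify either.
    """
    return mean(per_speaker_accuracy), stdev(per_speaker_accuracy)


# Hypothetical placeholder values for eight speakers; NOT data from the study.
example_accuracies = [90.0, 92.0, 94.0, 96.0, 96.0, 98.0, 100.0, 98.0]

avg, sd = summarize_accuracy(example_accuracies)
print(f"mean = {avg:.1f}%, SD = {sd:.2f}%")
```

With only eight speakers, the choice between `statistics.stdev` (sample) and `statistics.pstdev` (population) changes the reported spread noticeably, so it is worth stating which one is used.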


