Prosodic Preferences of Surface Electromyography-based Subvocal Speech for People With Laryngectomy.

Author information

Raiff Laura, Turashvili Dea, Heaton James T, De Luca Gianluca, Kline Joshua C, Vojtech Jenny

Affiliations

Delsys, Inc., Natick, Massachusetts 01760; Altec, Inc., Natick, Massachusetts 01760.

Delsys, Inc., Natick, Massachusetts 01760; Altec, Inc., Natick, Massachusetts 01760; Department of Biomedical Engineering, Boston University, Boston, Massachusetts, 02215.

Publication information

J Voice. 2024 Dec 5. doi: 10.1016/j.jvoice.2024.10.024.

Abstract

INTRODUCTION

People who undergo a total laryngectomy lose their natural voice and depend on alaryngeal technologies for communication. However, these technologies are often difficult to use and lack prosody. Surface electromyography-based silent speech interfaces are novel communication systems that overcome many of the shortcomings of traditional alaryngeal speech and have the potential to seamlessly incorporate individualized prosody. The purpose of this study was to (1) validate the ability of alaryngeal silent speech to effectively incorporate pitch modulations (a key prosodic element in natural speech) into synthesized speech, as assessed through listening experiments, and (2) determine the key features of these communication devices according to core users.

METHODOLOGY

People with laryngectomy (n = 15) and their primary communication partners (n = 5) listened to synthesized sentences with differing prosodic content generated from deep regression neural networks developed in our prior work. Specifically, the fundamental frequency (f0) contour of each sentence was manipulated in four ways: (1) flattened to the average f0, (2) altered to discrete sentence-level classification of muscle activity, (3) altered to continuous mapping of muscle activity, and (4) filtered to emulate speech from an electrolarynx (EL). Listeners ranked the f0 contours of each sentence by speech naturalness and rated the importance of various speech aid features.
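As a rough illustration of the first manipulation only (flattening a fundamental frequency contour to its average), the following Python sketch may help. It is not the authors' implementation: the function name `flatten_f0`, the per-frame list representation, and the convention that unvoiced frames are stored as 0.0 Hz are all assumptions made for this example.

```python
from statistics import mean


def flatten_f0(contour_hz):
    """Return a copy of the contour with every voiced frame set to the mean f0.

    Assumption for this sketch: the contour is a list of per-frame values in Hz,
    and unvoiced frames are marked with 0.0 and left unchanged.
    """
    voiced = [f for f in contour_hz if f > 0]
    if not voiced:
        # Nothing to flatten when the whole contour is unvoiced.
        return list(contour_hz)
    average = mean(voiced)
    return [average if f > 0 else 0.0 for f in contour_hz]


# Hypothetical five-frame contour with unvoiced frames at both ends.
example = [0.0, 110.0, 120.0, 130.0, 0.0]
flattened = flatten_f0(example)
print(flattened)
```

The result keeps the voicing pattern of the input but removes all pitch movement, which is the property that distinguishes a flat contour from the discrete and continuous conditions.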

RESULTS

Continuous contours were rated higher than all other types of contours, and monotonic EL contours were rated the lowest. Speech aid features were rated from highest to lowest in the following order: sound quality, intelligibility, pitch, delay, volume, hands-free, maintenance, cost, wearability, training, and visibility.

CONCLUSION

These results will help inform future development of silent speech interfaces and shape priorities of communication devices toward the preferences of their users.
