Towards personalized speech synthesis for augmentative and alternative communication.

Author information

Mills Timothy, Bunnell H Timothy, Patel Rupal

Affiliation

Department of Speech Language Pathology and Audiology, Northeastern University, Boston, MA, USA.

Publication information

Augment Altern Commun. 2014 Sep;30(3):226-36. doi: 10.3109/07434618.2014.924026. Epub 2014 Jul 15.

DOI:10.3109/07434618.2014.924026

PMID:25025818

Abstract

Text-to-speech options on augmentative and alternative communication (AAC) devices are limited. Often, several individuals in a group setting use the same synthetic voice. This lack of customization may limit technology adoption and social integration. This paper describes our efforts to generate personalized synthesis for users with profoundly limited speech motor control. Existing voice banking and voice conversion techniques rely on recordings of clearly articulated speech from the target talker, which cannot be obtained from this population. Our VocaliD approach extracts prosodic properties from the target talker's source function and applies these features to a surrogate talker's database, generating a synthetic voice with the vocal identity of the target talker and the clarity of the surrogate talker. Promising intelligibility results suggest areas of further development for improved personalization.

Similar articles

Voice banking for people living with motor neurone disease: Views and expectations.

Int J Lang Commun Disord. 2021 Jan;56(1):116-129. doi: 10.1111/1460-6984.12588. Epub 2020 Dec 22.

A recent survey of augmentative and alternative communication use and service delivery experiences of people with amyotrophic lateral sclerosis in the United States.

Disabil Rehabil Assist Technol. 2024 May;19(4):1121-1134. doi: 10.1080/17483107.2022.2149866. Epub 2022 Nov 30.

Voice Conversion for Persons with Amyotrophic Lateral Sclerosis.

IEEE J Biomed Health Inform. 2020 Oct;24(10):2942-2949. doi: 10.1109/JBHI.2019.2961844. Epub 2019 Dec 25.

Optimizing Communication in Ataxia: A Multifaceted Approach to Alternative and Augmentative Communication (AAC).

Cerebellum. 2024 Oct;23(5):2142-2151. doi: 10.1007/s12311-024-01675-0. Epub 2024 Mar 7.

Joint population coding and temporal coherence link an attended talker's voice and location features in naturalistic multi-talker scenes.

bioRxiv. 2025 Feb 12:2024.05.13.593814. doi: 10.1101/2024.05.13.593814.

Perceptions of the design of voice output communication aids.

Int J Lang Commun Disord. 2013 Jul-Aug;48(4):366-81. doi: 10.1111/1460-6984.12012. Epub 2013 Apr 17.

Native voice, self-concept and the moral case for personalized voice technology.

Disabil Rehabil. 2017 Jan;39(1):73-81. doi: 10.3109/09638288.2016.1139193. Epub 2016 Feb 16.

A large-scale comparison of two voice synthesis techniques on intelligibility, naturalness, preferences, and attitudes toward voices banked by individuals with amyotrophic lateral sclerosis.

Augment Altern Commun. 2024 Mar;40(1):31-45. doi: 10.1080/07434618.2023.2262032. Epub 2023 Oct 4.

Rehabilitation of communication impairment in dystonia musculorum deformans.

Pediatr Neurol. 1987 Mar-Apr;3(2):97-100. doi: 10.1016/0887-8994(87)90036-1.

Cited by

Do you like my voice? Stakeholder perspectives about the acceptability of synthetic child voices in three South African languages.

Int J Lang Commun Disord. 2025 Jan-Feb;60(1):e13152. doi: 10.1111/1460-6984.13152.

Optimizing Communication in Ataxia: A Multifaceted Approach to Alternative and Augmentative Communication (AAC).

Cerebellum. 2024 Oct;23(5):2142-2151. doi: 10.1007/s12311-024-01675-0. Epub 2024 Mar 7.

AAC and Artificial Intelligence (AI).

Top Lang Disord. 2019 Oct-Dec;39(4):389-403. doi: 10.1097/tld.0000000000000197.

Communication Matters-Pitfalls and Promise of Hightech Communication Devices in Palliative Care of Severely Physically Disabled Patients With Amyotrophic Lateral Sclerosis.

Front Neurol. 2018 Jul 27;9:603. doi: 10.3389/fneur.2018.00603. eCollection 2018.

Silent Speech Recognition as an Alternative Communication Device for Persons with Laryngectomy.

IEEE/ACM Trans Audio Speech Lang Process. 2017 Dec;25(12):2386-2398. doi: 10.1109/TASLP.2017.2740000. Epub 2017 Nov 28.

DAVID: An open-source platform for real-time transformation of infra-segmental emotional cues in running speech.

Behav Res Methods. 2018 Feb;50(1):323-343. doi: 10.3758/s13428-017-0873-y.

Supportive and symptomatic management of amyotrophic lateral sclerosis.

Nat Rev Neurol. 2016 Sep;12(9):526-38. doi: 10.1038/nrneurol.2016.111. Epub 2016 Aug 12.

What is the Value of Embedding Artificial Emotional Prosody in Human-Computer Interactions? Implications for Theory and Design in Psychological Science.

Front Psychol. 2015 Nov 12;6:1750. doi: 10.3389/fpsyg.2015.01750. eCollection 2015.
