Mills Timothy, Bunnell H Timothy, Patel Rupal
Department of Speech Language Pathology and Audiology, Northeastern University, Boston, MA, USA.
Augment Altern Commun. 2014 Sep;30(3):226-36. doi: 10.3109/07434618.2014.924026. Epub 2014 Jul 15.
Text-to-speech options on augmentative and alternative communication (AAC) devices are limited. Often, several individuals in a group setting use the same synthetic voice. This lack of customization may limit technology adoption and social integration. This paper describes our efforts to generate personalized synthesis for users with profoundly limited speech motor control. Existing voice banking and voice conversion techniques rely on recordings of clearly articulated speech from the target talker, which cannot be obtained from this population. Our VocaliD approach extracts prosodic properties from the target talker's source function and applies these features to a surrogate talker's database, generating a synthetic voice with the vocal identity of the target talker and the clarity of the surrogate talker. Promising intelligibility results suggest areas of further development for improved personalization.