Department of Linguistics, University of California, Davis, 469 Kerr Hall, One Shields Avenue, Davis, California 95616, USA
JASA Express Lett. 2022 Apr;2(4):045204. doi: 10.1121/10.0010274.
This study examined how speaking style and guise influence the intelligibility of text-to-speech (TTS) and naturally produced human voices. Results showed that TTS voices were less intelligible overall. Although using a clear speech style improved intelligibility for both human and TTS voices (using "newscaster" neural TTS), the clear speech effect was stronger for TTS voices. Finally, a visual device guise decreased intelligibility, regardless of voice type. The results suggest that both speaking style and visual guise affect intelligibility of human and TTS voices. Findings are discussed in terms of theories about the role of social information in speech perception.