The clear speech intelligibility benefit for text-to-speech voices: Effects of speaking style and visual guise.

Affiliation

Department of Linguistics, University of California, Davis, 469 Kerr Hall, One Shields Avenue, Davis, California 95616, USA

Publication information

JASA Express Lett. 2022 Apr;2(4):045204. doi: 10.1121/10.0010274.

DOI: 10.1121/10.0010274
PMID: 36154231

Abstract

This study examined how speaking style and guise influence the intelligibility of text-to-speech (TTS) and naturally produced human voices. Results showed that TTS voices were less intelligible overall. Although using a clear speech style improved intelligibility for both human and TTS voices (using "newscaster" neural TTS), the clear speech effect was stronger for TTS voices. Finally, a visual device guise decreased intelligibility, regardless of voice type. The results suggest that both speaking style and visual guise affect intelligibility of human and TTS voices. Findings are discussed in terms of theories about the role of social information in speech perception.

Similar articles

1
The clear speech intelligibility benefit for text-to-speech voices: Effects of speaking style and visual guise.
JASA Express Lett. 2022 Apr;2(4):045204. doi: 10.1121/10.0010274.
2
How Long Does It Take for a Voice to Become Familiar? Speech Intelligibility and Voice Recognition Are Differentially Sensitive to Voice Training.
Psychol Sci. 2021 Jun;32(6):903-915. doi: 10.1177/0956797621991137. Epub 2021 May 12.
3
The Effects of Dysphonic Voice on Speech Intelligibility in Cantonese-Speaking Adults.
J Speech Lang Hear Res. 2021 Jan 14;64(1):16-29. doi: 10.1044/2020_JSLHR-19-00190. Epub 2020 Dec 11.
4
Intelligibility benefit for familiar voices is not accompanied by better discrimination of fundamental frequency or vocal tract length.
Hear Res. 2023 Mar 1;429:108704. doi: 10.1016/j.heares.2023.108704. Epub 2023 Jan 20.
5
Intelligibility of naturally produced and synthesized Mandarin speech by cochlear implant listeners.
J Acoust Soc Am. 2018 May;143(5):2886. doi: 10.1121/1.5037590.
6
Familiar Voices Are More Intelligible, Even if They Are Not Recognized as Familiar.
Psychol Sci. 2018 Oct;29(10):1575-1583. doi: 10.1177/0956797618779083. Epub 2018 Aug 10.
7
Effect of Dysphonia and Cognitive-Perceptual Listener Strategies on Speech Intelligibility.
J Voice. 2020 Sep;34(5):806.e7-806.e18. doi: 10.1016/j.jvoice.2019.03.013. Epub 2019 Apr 25.
8
A large-scale comparison of two voice synthesis techniques on intelligibility, naturalness, preferences, and attitudes toward voices banked by individuals with amyotrophic lateral sclerosis.
Augment Altern Commun. 2024 Mar;40(1):31-45. doi: 10.1080/07434618.2023.2262032. Epub 2023 Oct 4.
9
Voice banking to support individuals who use speech-generating devices: development and evaluation of Singaporean-accented English synthetic voices and a Singapore Colloquial English recording inventory.
Augment Altern Commun. 2023 Dec;39(4):208-218. doi: 10.1080/07434618.2023.2181213. Epub 2023 Mar 27.
10
Intelligibility of face-masked speech depends on speaking style: Comparing casual, clear, and emotional speech.
Cognition. 2021 May;210:104570. doi: 10.1016/j.cognition.2020.104570. Epub 2021 Jan 12.

Cited by

1
Impaired Prosodic Processing but Not Hearing Function Is Associated with an Age-Related Reduction in AI Speech Recognition.
Audiol Res. 2025 Feb 8;15(1):14. doi: 10.3390/audiolres15010014.
2
Neural Dynamics of the Processing of Speech Features: Evidence for a Progression of Features from Acoustic to Sentential Processing.
J Neurosci. 2025 Mar 12;45(11):e1143242025. doi: 10.1523/JNEUROSCI.1143-24.2025.
3
Neural Dynamics of the Processing of Speech Features: Evidence for a Progression of Features from Acoustic to Sentential Processing.
bioRxiv. 2024 Dec 10:2024.02.02.578603. doi: 10.1101/2024.02.02.578603.