Perception in Black and White: Effects of Intonational Variables and Filtering Conditions on Sociolinguistic Judgments With Implications for ASR.

Author information

Holliday Nicole R

Affiliation

University of Pennsylvania, Philadelphia, PA, United States.

Publication information

Front Artif Intell. 2021 Jul 15;4:642783. doi: 10.3389/frai.2021.642783. eCollection 2021.

Abstract

This study tests the effects of intonational contours and filtering conditions on listener judgments of ethnicity to arrive at a more comprehensive understanding of how prosody influences these judgments, with implications for automatic speech recognition systems as well as speech synthesis. In a perceptual experiment, 40 American English listeners heard phrase-long clips which were controlled for pitch accent type and focus marking. Each clip contained either two H* (high) or two L+H* (low-high) pitch accents and an L-L% (falling) boundary tone, and had previously been labelled for broad or narrow focus. Listeners rated clips in two tasks, one with unmodified stimuli and one with stimuli low-pass filtered at 400 Hz, and were asked to judge whether the speaker was "Black" or "White". In the filtered condition, tokens with the L+H* pitch accent were more likely to be rated as "Black", with an interaction such that broad focus enhanced this pattern, supporting earlier findings that listeners may perceive African American Language as having more variation in possible pitch accent meanings. In the unfiltered condition, tokens with the L+H* pitch accent were less likely to be rated as "Black", with no effect of focus, likely because listeners relied more heavily on available segmental information in this condition. These results enhance our understanding of the cues listeners rely on in making social judgments about speakers, especially in ethnic identification and linguistic profiling, by highlighting perceptual differences due to listening environment as well as the predicted meaning of specific intonational contours. They also contribute to our understanding of how human listeners interpret meaning within a holistic context, which has implications for the construction of computational systems designed to replicate the properties of natural language. In particular, they are applicable to speech synthesis and speech recognition programs, which are often limited in their capacities because they do not make such holistic sociolinguistic considerations of the meanings of input or output speech.
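
As an illustration of the filtered condition, the sketch below low-pass filters a signal at 400 Hz, which keeps the fundamental-frequency (pitch) range while removing most segmental detail. The abstract does not say which tool or filter design was used, so everything here is an assumption for illustration only: the Hamming-windowed sinc FIR design, the 8000 Hz sample rate, the tap count, and the function names `lowpass_fir` and `apply_fir` are all hypothetical.

```python
import math

def lowpass_fir(cutoff_hz, sample_rate, num_taps=101):
    """Return Hamming-windowed sinc low-pass coefficients (num_taps must be odd).

    Assumed design for illustration; not the filter used in the study.
    """
    fc = cutoff_hz / sample_rate          # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal low-pass impulse response (sinc), with the k == 0 limit handled.
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        # Hamming window to reduce ripple from truncating the sinc.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]      # normalize to unity gain at 0 Hz

def apply_fir(signal, taps):
    """Convolve signal with taps; output has the same length, delay compensated."""
    mid = (len(taps) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = i + mid - j
            if 0 <= idx < len(signal):    # treat samples outside the clip as zero
                acc += t * signal[idx]
        out.append(acc)
    return out

def rms(x):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))

# Synthetic stand-ins for a clip: a 100 Hz tone (pitch range) and a 2000 Hz tone.
SAMPLE_RATE = 8000
N = 2000
low_tone = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(N)]
high_tone = [math.sin(2 * math.pi * 2000 * n / SAMPLE_RATE) for n in range(N)]

taps = lowpass_fir(400, SAMPLE_RATE)
low_filtered = apply_fir(low_tone, taps)    # should pass nearly unchanged
high_filtered = apply_fir(high_tone, taps)  # should be strongly attenuated
```

With real stimuli, the same two functions would be applied to the clip's sample values; a filter like this leaves the pitch contour audible while making the words largely unintelligible, which is why the filtered condition isolates intonational cues.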
