Acoustic source characteristics, across-formant integration, and speech intelligibility under competitive conditions.

Author information

Roberts Brian, Summers Robert J, Bailey Peter J

Affiliations

Psychology, School of Life and Health Sciences, Aston University.

Department of Psychology, University of York.

Publication information

J Exp Psychol Hum Percept Perform. 2015 Jun;41(3):680-91. doi: 10.1037/xhp0000038. Epub 2015 Mar 9.

Abstract

An important aspect of speech perception is the ability to group or select formants using cues in the acoustic source characteristics; for example, fundamental frequency (F0) differences between formants promote their segregation. This study explored the role of more radical differences in source characteristics. Three-formant (F1+F2+F3) synthetic speech analogues were derived from natural sentences. In Experiment 1, F1+F3 were generated by passing a harmonic glottal source (F0 = 140 Hz) through second-order resonators (H1+H3); in Experiment 2, F1+F3 were tonal (sine-wave) analogues (T1+T3). F2 could take either form (H2 or T2). In some conditions, the target formants were presented alone, either monaurally or dichotically (left ear = F1+F3; right ear = F2). In others, they were accompanied by a competitor for F2 (F1+F2C+F3; F2), which listeners must reject to optimize recognition. Competitors (H2C or T2C) were created using the time-reversed frequency and amplitude contours of F2. Dichotic presentation of F2 and F2C ensured that the impact of the competitor arose primarily through informational masking. In the absence of F2C, the effect of a source mismatch between F1+F3 and F2 was relatively modest. When F2C was present, intelligibility was lowest when F2 was tonal and F2C was harmonic, irrespective of which type matched F1+F3. This finding suggests that source type and context, rather than similarity, govern the phonetic contribution of a formant. It is proposed that wideband harmonic analogues are more effective informational maskers than narrowband tonal analogues, and so become dominant in across-frequency integration of phonetic information when placed in competition.
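As a rough illustration of two ideas in the abstract, a tonal (sine-wave) formant analogue and a competitor built from time-reversed frequency and amplitude contours, the following minimal Python sketch may help. It is not the authors' synthesis procedure: the sample rate, frame duration, contour values, and the function names `synthesize_tone` and `make_competitor_contours` are all hypothetical and chosen only for illustration.

```python
import math


def synthesize_tone(freqs_hz, amps, sample_rate=16000, frame_s=0.01):
    """Sine-wave analogue of one formant from per-frame contours.

    Each frame holds one frequency and one amplitude value
    (a piecewise-constant simplification, assumed for this sketch).
    """
    samples_per_frame = int(round(sample_rate * frame_s))
    phase = 0.0
    out = []
    for f, a in zip(freqs_hz, amps):
        step = 2.0 * math.pi * f / sample_rate
        for _ in range(samples_per_frame):
            out.append(a * math.sin(phase))
            phase += step  # running phase avoids jumps when frequency changes
    return out


def make_competitor_contours(freqs_hz, amps):
    """Reverse the contours in time; the waveform itself is not reversed."""
    return list(reversed(freqs_hz)), list(reversed(amps))


# Hypothetical F2 contours (four 10-ms frames), not taken from the paper.
f2_freqs = [1200.0, 1350.0, 1500.0, 1400.0]
f2_amps = [0.2, 0.5, 0.4, 0.1]

f2c_freqs, f2c_amps = make_competitor_contours(f2_freqs, f2_amps)

t2 = synthesize_tone(f2_freqs, f2_amps)     # tonal target analogue (T2)
t2c = synthesize_tone(f2c_freqs, f2c_amps)  # tonal competitor analogue (T2C)
```

Because only the contours are reversed, the competitor covers the same frequency range and has the same overall level pattern as F2, but its trajectory over time no longer matches the other formants of the utterance. A harmonic analogue (H2, H2C) would instead pass a periodic source through a resonator that follows the same contours, which this sketch does not attempt.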

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcb3/4445382/81ada86ac489/xhp_41_3_680_fig1a.jpg
