Impact of High- and Low-Pass Acoustic Filtering on Audiovisual Speech Redundancy and Benefit in Children.

Authors

Lalonde Kaylah, Dwyer Grace, Bosen Adam, Pitts Abby

Affiliation

Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska, USA.

Publication information

Ear Hear. 2025;46(3):735-746. doi: 10.1097/AUD.0000000000001622. Epub 2025 Jan 31.

Abstract

OBJECTIVES

To investigate the influence of frequency-specific audibility on audiovisual benefit in children, this study examined the impact of high- and low-pass acoustic filtering on auditory-only and audiovisual word and sentence recognition in children with typical hearing. Previous studies show that visual speech provides greater access to consonant place of articulation than to other consonant features and that low-pass filtering strongly impairs perception of acoustic consonant place of articulation. This suggests that visual speech may be particularly useful when acoustic speech is low-pass filtered, because it provides complementary information about consonant place of articulation. Therefore, we hypothesized that audiovisual benefit would be greater for low-pass filtered words than for high-pass filtered words. We also assessed whether this pattern of results would translate to sentence recognition.

DESIGN

Children with typical hearing completed auditory-only and audiovisual tests of consonant-vowel-consonant word recognition and sentence recognition in two conditions that differed in acoustic frequency content: a low-pass filtered condition, in which children could access only acoustic content below 2 kHz, and a high-pass filtered condition, in which children could access only acoustic content above 2 kHz. They also completed a visual-only test of consonant-vowel-consonant word recognition. We analyzed word, consonant, and keyword-in-sentence recognition and consonant feature (place, voice/manner of articulation) transmission accuracy across modalities and filter conditions using binomial generalized linear mixed models. To assess the degree to which visual speech is complementary to, rather than redundant with, acoustic speech, we calculated the proportion of auditory-only target and response consonant pairs that can be distinguished using visual speech alone and compared these values between the high-pass and low-pass filter conditions.
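The complementarity measure described above can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes a hypothetical grouping of consonants into visually similar classes (`VISEME_GROUPS`), counts only auditory-only errors (target differs from response), and uses made-up toy trials; the study may have defined visual distinguishability differently, for example from its visual-only confusion data.

```python
import math

# Hypothetical viseme classes (an assumption for illustration only):
# consonants in the same set are treated as looking alike on the face.
VISEME_GROUPS = [
    {"p", "b", "m"},
    {"f", "v"},
    {"t", "d", "n", "s", "z"},
    {"k", "g"},
]
VISEME_OF = {c: i for i, group in enumerate(VISEME_GROUPS) for c in group}


def visually_distinct_proportion(trials):
    """Proportion of auditory-only errors that vision alone could resolve.

    trials: iterable of (target, response) consonant pairs.
    Only error trials (target != response) are counted; a pair is
    "visually distinct" when the two consonants fall in different
    viseme classes. Returns NaN when there are no errors.
    """
    errors = [(t, r) for t, r in trials if t != r]
    if not errors:
        return math.nan
    distinct = sum(VISEME_OF[t] != VISEME_OF[r] for t, r in errors)
    return distinct / len(errors)


# Toy data (fabricated for the example, not results from the study).
low_pass_trials = [("p", "t"), ("p", "k"), ("b", "d"), ("m", "m"), ("t", "k")]
high_pass_trials = [("p", "b"), ("t", "d"), ("m", "n"), ("s", "s")]

low_pass_score = visually_distinct_proportion(low_pass_trials)
high_pass_score = visually_distinct_proportion(high_pass_trials)
```

A higher value means that more of the auditory confusions involve consonants that look different, so visual speech has more opportunity to supply the missing information.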

RESULTS

In auditory-only conditions, recognition accuracy was lower for low-pass filtered consonants and consonant features than for high-pass filtered consonants and consonant features, especially for consonant place of articulation. In visual-only conditions, recognition accuracy was greater for consonant place of articulation than for consonant voice/manner of articulation. In addition, auditory consonants in the low-pass filtered condition were more likely to be confused with visually distinct consonants, meaning that there was more opportunity to use visual cues to supplement missing auditory information in the low-pass filtered condition. Audiovisual benefit for isolated whole words was greater for low-pass filtered speech than for high-pass filtered speech. No difference in audiovisual benefit between filter conditions was observed for phonemes, features, or words in sentences. Ceiling effects limit the interpretation of these nonsignificant interactions.

CONCLUSIONS

For isolated word recognition, visual speech is more complementary with the acoustic speech cues children can access when high-frequency acoustic content is eliminated by low-pass filtering than when low-frequency acoustic content is eliminated by high-pass filtering. This decreased auditory-visual phonetic redundancy is accompanied by larger audiovisual benefit. In contrast, audiovisual benefit for sentence recognition did not differ between low-pass and high-pass filtered speech. This might reflect ceiling effects in audiovisual conditions or a smaller contribution of auditory-visual phonetic redundancy to audiovisual benefit for connected speech. These results from children with typical hearing suggest that some of the variance in audiovisual benefit among children who are hard of hearing may depend on frequency-specific audibility.
