Sun Wenhui, Zou Jiajie, Zhu Tianyi, Sun Zhoujian, Ding Nai
Research Center for Life Sciences Computing, Zhejiang Lab, Hangzhou 311121, China.
Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China.
iScience. 2024 May 22;27(6):110055. doi: 10.1016/j.isci.2024.110055. eCollection 2024 Jun 21.
Humans can quickly adapt to recognize acoustically degraded speech, and here we hypothesize that this quick adaptation is enabled by internal linguistic feedback: listeners use partially recognized sentences to adapt the mapping between acoustic features and phonetic labels. We test this hypothesis by quantifying how quickly humans adapt to degraded speech and analyzing whether the adaptation process can be simulated by adapting an automatic speech recognition (ASR) system based on its own speech recognition results. We consider three types of acoustic degradation, i.e., noise vocoding, time compression, and local time-reversal. The human speech recognition rate can increase by >20% after exposure to just a few acoustically degraded sentences. Critically, the ASR system with internal linguistic feedback can adapt to degraded speech with human-level speed and accuracy. These results suggest that self-supervised learning based on linguistic feedback is a plausible strategy for human adaptation to acoustically degraded speech.
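The abstract describes adapting an ASR system on its own recognition output (pseudo-labels) after exposure to degraded sentences. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' implementation: it assumes a PyTorch ASR model and user-supplied `transcribe` and `loss_fn` callables, and it includes an illustrative local time-reversal degradation with an assumed segment length.

```python
# Hypothetical sketch of self-supervised adaptation driven by the model's own
# transcripts (pseudo-labels). Model, tokenizer, and loss are placeholders;
# any pretrained ASR model with a differentiable loss could be substituted.

import numpy as np
import torch


def local_time_reversal(signal: np.ndarray, sr: int, segment_ms: float = 60.0) -> np.ndarray:
    """One of the three degradations named in the abstract: reverse the
    waveform within short consecutive segments (the segment length here is
    an assumed parameter, not taken from the paper)."""
    seg = max(1, int(sr * segment_ms / 1000.0))
    out = signal.copy()
    for start in range(0, len(signal), seg):
        out[start:start + seg] = out[start:start + seg][::-1]
    return out


def adapt_on_own_transcripts(model, degraded_utterances, transcribe, loss_fn,
                             lr: float = 1e-5, passes: int = 1):
    """Pseudo-label adaptation loop: transcribe each degraded sentence with the
    current model, treat that (partially correct) transcript as the target,
    and take a gradient step. `transcribe` and `loss_fn` are assumed callables
    (e.g., greedy decoding and a CTC loss)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(passes):
        for audio in degraded_utterances:
            with torch.no_grad():
                pseudo_label = transcribe(model, audio)   # internal linguistic feedback
            loss = loss_fn(model, audio, pseudo_label)    # supervise on the model's own output
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

The key design point this illustrates is that no ground-truth transcripts are needed: the adaptation signal comes entirely from the recognizer's own (imperfect) output on the degraded sentences.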