O'Hanlon Brandon, Plack Christopher J, Nuttall Helen E
Department of Psychology, Lancaster University, United Kingdom.
Manchester Centre for Audiology and Deafness, The University of Manchester, United Kingdom.
J Speech Lang Hear Res. 2025 Jan 2;68(1):26-39. doi: 10.1044/2024_JSLHR-24-00162. Epub 2024 Dec 2.
In difficult listening conditions, the visual system assists with speech perception through lipreading. Stimulus onset asynchrony (SOA) is used to investigate the interaction between the two modalities in speech perception. Previous estimates of audiovisual benefit and of the SOA integration period differ widely. A limitation of previous research is a lack of consideration of visemes (categories of phonemes defined by similar lip movements when produced by a speaker) to ensure that selected phonemes are visually distinct. This study aimed to reassess the benefit of audiovisual lipreading to speech perception when stimuli are selected from different viseme categories and presented in noise. The study also aimed to investigate the effects of SOA on these stimuli.
Sixty participants were tested online and presented with audio-only stimuli and audiovisual stimuli that included video of the speaker's lip movements. The speech was presented either with or without noise at six different SOAs (0, 200, 216.6, 233.3, 250, and 266.6 ms). Participants discriminated between speech syllables with button presses.
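The nonzero SOA values are consistent with whole-frame audio delays of a 60 fps video stream, where one frame lasts 1000/60 ≈ 16.67 ms. The following Python sketch is an illustrative assumption rather than a procedure described by the authors: it shows how frame-locked delays of 12 to 16 frames reproduce the reported SOA conditions (the abstract appears to truncate the repeating decimals, e.g., 216.67 to 216.6).

# Illustrative sketch (assumption): derive the study's nonzero SOAs as
# whole-frame audio delays of a 60 fps video. Not the authors' code.
FPS = 60
FRAME_MS = 1000 / FPS  # one video frame, about 16.67 ms

# Delays of 12-16 frames reproduce the reported SOA conditions.
for n_frames in range(12, 17):
    soa_ms = n_frames * FRAME_MS
    print(f"{n_frames:2d} frames -> {soa_ms:6.2f} ms")
# 12 frames -> 200.00 ms
# 13 frames -> 216.67 ms
# 14 frames -> 233.33 ms
# 15 frames -> 250.00 ms
# 16 frames -> 266.67 ms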
The benefit of visual information was weaker than in previous studies. Reaction times increased significantly once an SOA was introduced, but SOA had no significant effect on accuracy. Furthermore, exploratory analyses suggest that the effect was not equal across viseme categories: "ba" was more difficult to recognize than "ka" in noise.
In summary, the findings suggest that the contribution of audiovisual integration to speech processing is weaker when viseme categories are taken into account, and that the present results are not sufficient to identify a full integration period.