Suppr 超能文献


Performance in an Audiovisual Selective Attention Task Using Speech-Like Stimuli Depends on the Talker Identities, But Not Temporal Coherence.

Affiliations

Biomedical Engineering, University of Rochester, Rochester, NY, USA.

Center for Visual Science, University of Rochester, Rochester, NY, USA.

Publication

Trends Hear. 2023 Jan-Dec;27:23312165231207235. doi: 10.1177/23312165231207235.

DOI: 10.1177/23312165231207235
PMID: 37847849
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10586009/
Abstract

Audiovisual integration of speech can benefit the listener by not only improving comprehension of what a talker is saying but also helping a listener select a particular talker's voice from a mixture of sounds. Binding, an early integration of auditory and visual streams that helps an observer allocate attention to a combined audiovisual object, is likely involved in processing audiovisual speech. Although temporal coherence of stimulus features across sensory modalities has been implicated as an important cue for non-speech stimuli (Maddox et al., 2015), the specific cues that drive binding in speech are not fully understood due to the challenges of studying binding in natural stimuli. Here we used speech-like artificial stimuli that allowed us to isolate three potential contributors to binding: temporal coherence (are the face and the voice changing synchronously?), articulatory correspondence (do visual faces represent the correct phones?), and talker congruence (do the face and voice come from the same person?). In a trio of experiments, we examined the relative contributions of each of these cues. Normal hearing listeners performed a dual task in which they were instructed to respond to events in a target auditory stream while ignoring events in a distractor auditory stream (auditory discrimination) and detecting flashes in a visual stream (visual detection). We found that viewing the face of a talker who matched the attended voice (i.e., talker congruence) offered a performance benefit. We found no effect of temporal coherence on performance in this task, prompting an important recontextualization of previous findings.


Figures

Fig 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9da3/10586009/c2331f54387a/10.1177_23312165231207235-fig1.jpg
Fig 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9da3/10586009/e57e1e9b4dea/10.1177_23312165231207235-fig2.jpg
Fig 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9da3/10586009/d77f7d7d99ce/10.1177_23312165231207235-fig3.jpg
Fig 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9da3/10586009/8d2b8caa0837/10.1177_23312165231207235-fig4.jpg
Fig 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9da3/10586009/7b0fdb79a52a/10.1177_23312165231207235-fig5.jpg
Fig 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9da3/10586009/c5ed00500d55/10.1177_23312165231207235-fig6.jpg
Fig 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9da3/10586009/dbbd9b3a2a11/10.1177_23312165231207235-fig7.jpg

Similar Articles

1. Performance in an Audiovisual Selective Attention Task Using Speech-Like Stimuli Depends on the Talker Identities, But Not Temporal Coherence.
Trends Hear. 2023 Jan-Dec;27:23312165231207235. doi: 10.1177/23312165231207235.
2. Joint population coding and temporal coherence link an attended talker's voice and location features in naturalistic multi-talker scenes.
bioRxiv. 2025 Feb 12:2024.05.13.593814. doi: 10.1101/2024.05.13.593814.
3. Independent mechanisms of temporal and linguistic cue correspondence benefiting audiovisual speech processing.
Atten Percept Psychophys. 2022 Aug;84(6):2016-2026. doi: 10.3758/s13414-022-02440-3. Epub 2022 Feb 24.
4. Dissociable Neural Correlates of Multisensory Coherence and Selective Attention.
J Neurosci. 2023 Jun 21;43(25):4697-4708. doi: 10.1523/JNEUROSCI.1310-22.2023. Epub 2023 May 23.
5. Psychobiological Responses Reveal Audiovisual Noise Differentially Challenges Speech Recognition.
Ear Hear. 2020 Mar/Apr;41(2):268-277. doi: 10.1097/AUD.0000000000000755.
6. Congruent audiovisual speech enhances auditory attention decoding with EEG.
J Neural Eng. 2019 Nov 6;16(6):066033. doi: 10.1088/1741-2552/ab4340.
7. Word Learning in Deaf Adults Who Use Cochlear Implants: The Role of Talker Variability and Attention to the Mouth.
Ear Hear. 2024;45(2):337-350. doi: 10.1097/AUD.0000000000001432. Epub 2023 Sep 11.
8. A Causal Inference Model Explains Perception of the McGurk Effect and Other Incongruent Audiovisual Speech.
PLoS Comput Biol. 2017 Feb 16;13(2):e1005229. doi: 10.1371/journal.pcbi.1005229. eCollection 2017 Feb.
9. Mouth and Voice: A Relationship between Visual and Auditory Preference in the Human Superior Temporal Sulcus.
J Neurosci. 2017 Mar 8;37(10):2697-2708. doi: 10.1523/JNEUROSCI.2914-16.2017. Epub 2017 Feb 8.
10. Integrating speech information across talkers, gender, and sensory modality: female faces and male voices in the McGurk effect.
Percept Psychophys. 1991 Dec;50(6):524-36. doi: 10.3758/bf03207536.

Cited By

1. Dissociable Neural Correlates of Multisensory Coherence and Selective Attention.
J Neurosci. 2023 Jun 21;43(25):4697-4708. doi: 10.1523/JNEUROSCI.1310-22.2023. Epub 2023 May 23.

References

1. Face Masks Impact Auditory and Audiovisual Consonant Recognition in Children With and Without Hearing Loss.
Front Psychol. 2022 May 13;13:874345. doi: 10.3389/fpsyg.2022.874345. eCollection 2022.
2. Independent mechanisms of temporal and linguistic cue correspondence benefiting audiovisual speech processing.
Atten Percept Psychophys. 2022 Aug;84(6):2016-2026. doi: 10.3758/s13414-022-02440-3. Epub 2022 Feb 24.
3. Face mask type affects audiovisual speech intelligibility and subjective listening effort in young and older adults.
Cogn Res Princ Implic. 2021 Jul 18;6(1):49. doi: 10.1186/s41235-021-00314-0.
4. Training enhances the ability of listeners to exploit visual information for auditory scene analysis.
Cognition. 2021 Mar;208:104529. doi: 10.1016/j.cognition.2020.104529. Epub 2020 Dec 26.
5. Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche.
Sci Adv. 2019 Sep 4;5(9):eaaw2594. doi: 10.1126/sciadv.aaw2594. eCollection 2019 Sep.
6. Audiovisual Enhancement of Speech Perception in Noise by School-Age Children Who Are Hard of Hearing.
Ear Hear. 2020 Jul/Aug;41(4):705-719. doi: 10.1097/AUD.0000000000000830.
7. Task-uninformative visual stimuli improve auditory spatial discrimination in humans but not the ideal observer.
PLoS One. 2019 Sep 9;14(9):e0215417. doi: 10.1371/journal.pone.0215417. eCollection 2019.
8. Integration of Visual Information in Auditory Cortex Promotes Auditory Scene Analysis through Multisensory Binding.
Neuron. 2018 Feb 7;97(3):640-655.e4. doi: 10.1016/j.neuron.2017.12.034. Epub 2018 Jan 26.
9. Defining Auditory-Visual Objects: Behavioral Tests and Physiological Mechanisms.
Trends Neurosci. 2016 Feb;39(2):74-85. doi: 10.1016/j.tins.2015.12.007. Epub 2016 Jan 15.
10. Auditory selective attention is enhanced by a task-irrelevant temporally coherent visual stimulus in human listeners.
Elife. 2015 Feb 5;4:e04995. doi: 10.7554/eLife.04995.