Learning to recognize unfamiliar faces from fine-phonetic detail in visual speech.

Author information

Jesse Alexandra

Affiliation

Department of Psychological and Brain Sciences, University of Massachusetts, 135 Hicks Way, Amherst, MA, 01003, USA.

Publication information

Atten Percept Psychophys. 2025 Apr;87(3):936-951. doi: 10.3758/s13414-025-03049-y. Epub 2025 Mar 20.

DOI: 10.3758/s13414-025-03049-y
PMID: 40113736

Abstract

How speech is realized varies across talkers but can be somewhat consistent within a talker. Humans are sensitive to these idiosyncrasies when perceiving auditory speech, but also, in face-to-face communications, when perceiving their visual speech. Our recent work has shown that humans can also use talker idiosyncrasies seen in how talkers produce sentences to rapidly learn to recognize unfamiliar talkers, suggesting that visual speech information can be used for speech perception and talker recognition. However, in learning from sentences, learners may focus only on global information about the talker, such as talker-specific realizations of prosody and rate. The present study tested whether human perceivers can learn the identity of the talker based solely on fine-phonetic detail in the dynamic realization of visual speech alone. Participants learned to identify talkers from point-light displays showing them uttering isolated words. These point-light displays isolated the dynamic speech information, while discarding static information about the talker's face. No sound was presented. Feedback was given only during training. Test included point-light displays of familiar words from training and of novel words. Participants learned to recognize two and four talkers from the word-level dynamics of visual speech from very little exposure. The established representations allowed talker recognition independent of linguistic content, that is, even from novel words. Spoken words therefore contain sufficient indexical information in their fine-phonetic detail for perceivers to acquire dynamic facial representations for unfamiliar talkers that allow generalization across words. Dynamic representations of talking faces are formed for the recognition of unfamiliar faces.
