Specification of cross-modal source information in isolated kinematic displays of speech.

Author Information

Lachs Lorin, Pisoni David B

Affiliation

Department of Psychology, 5310 North Campus Drive, California State University, Fresno, California 93740, USA.

Publication Information

J Acoust Soc Am. 2004 Jul;116(1):507-18. doi: 10.1121/1.1757454.

Abstract

Information about the acoustic properties of a talker's voice is available in optical displays of speech, and vice versa, as evidenced by perceivers' ability to match faces and voices based on vocal identity. The present investigation used point-light displays (PLDs) of visual speech and sinewave replicas of auditory speech in a cross-modal matching task to assess perceivers' ability to match faces and voices under conditions when only isolated kinematic information about vocal tract articulation was available. These stimuli were also used in a word recognition experiment under auditory-alone and audiovisual conditions. The results showed that isolated kinematic displays provide enough information to match the source of an utterance across sensory modalities. Furthermore, isolated kinematic displays can be integrated to yield better word recognition performance under audiovisual conditions than under auditory-alone conditions. The results are discussed in terms of their implications for describing the nature of speech information and current theories of speech perception and spoken word recognition.

Similar Articles

1
Specification of cross-modal source information in isolated kinematic displays of speech.
J Acoust Soc Am. 2004 Jul;116(1):507-18. doi: 10.1121/1.1757454.
2
Cross-modal source information and spoken word recognition.
J Exp Psychol Hum Percept Perform. 2004 Apr;30(2):378-96. doi: 10.1037/0096-1523.30.2.378.
3
Visual and audiovisual speech perception with color and gray-scale facial images.
Percept Psychophys. 2000 Oct;62(7):1394-404. doi: 10.3758/bf03212141.
4
"Putting the face to the voice": matching identity across modality.
Curr Biol. 2003 Sep 30;13(19):1709-14. doi: 10.1016/j.cub.2003.09.005.
5
Mechanisms of enhancing visual-speech recognition by prior auditory information.
Neuroimage. 2013 Jan 15;65:109-18. doi: 10.1016/j.neuroimage.2012.09.047. Epub 2012 Sep 27.
7
Cross-modal enhancement of speech detection in young and older adults: does signal content matter?
Ear Hear. 2011 Sep-Oct;32(5):650-5. doi: 10.1097/AUD.0b013e31821a4578.
8
The role of audiovisual asynchrony in person recognition.
Q J Exp Psychol (Hove). 2010 Jan;63(1):23-30. doi: 10.1080/17470210903144376. Epub 2009 Aug 10.
9
Mouth and Voice: A Relationship between Visual and Auditory Preference in the Human Superior Temporal Sulcus.
J Neurosci. 2017 Mar 8;37(10):2697-2708. doi: 10.1523/JNEUROSCI.2914-16.2017. Epub 2017 Feb 8.
10
Automatic audiovisual integration in speech perception.
Exp Brain Res. 2005 Nov;167(1):66-75. doi: 10.1007/s00221-005-0008-z. Epub 2005 Oct 29.

Cited By

1
Learning to recognize unfamiliar faces from fine-phonetic detail in visual speech.
Atten Percept Psychophys. 2025 Apr;87(3):936-951. doi: 10.3758/s13414-025-03049-y. Epub 2025 Mar 20.
2
Where on the face do we look during phonemic restoration: An eye-tracking study.
Front Psychol. 2023 May 25;14:1005186. doi: 10.3389/fpsyg.2023.1005186. eCollection 2023.
4
Audiovisual speech perception: A new approach and implications for clinical populations.
Lang Linguist Compass. 2017 Mar;11(3):77-91. doi: 10.1111/lnc3.12237. Epub 2017 Mar 26.
5
Temporal voice areas exist in autism spectrum disorder but are dysfunctional for voice identity recognition.
Soc Cogn Affect Neurosci. 2016 Nov;11(11):1812-1822. doi: 10.1093/scan/nsw089. Epub 2016 Jun 30.
6
Matching novel face and voice identity using static and dynamic facial images.
Atten Percept Psychophys. 2016 Apr;78(3):868-79. doi: 10.3758/s13414-015-1045-8.
7
Seeing to hear? Patterns of gaze to speaking faces in children with autism spectrum disorders.
Front Psychol. 2014 May 8;5:397. doi: 10.3389/fpsyg.2014.00397. eCollection 2014.
8
Experience with a talker can transfer across modalities to facilitate lipreading.
Atten Percept Psychophys. 2013 Oct;75(7):1359-65. doi: 10.3758/s13414-013-0534-x.
9
Crossmodal Source Identification in Speech Perception.
Ecol Psychol. 2004;16(3):159-187. doi: 10.1207/s15326969eco1603_1.
10
Implicit multisensory associations influence voice recognition.
PLoS Biol. 2006 Oct;4(10):e326. doi: 10.1371/journal.pbio.0040326.

References

1
Crossmodal Source Identification in Speech Perception.
Ecol Psychol. 2004;16(3):159-187. doi: 10.1207/s15326969eco1603_1.
2
Multimodal perceptual organization of speech: Evidence from tone analogs of spoken utterances.
Speech Commun. 1998 Oct 1;26(1):65-73. doi: 10.1016/S0167-6393(98)00050-8.
3
Hearing a face: cross-modal speaker matching using isolated visible speech.
Percept Psychophys. 2006 Jan;68(1):84-93. doi: 10.3758/bf03193658.
4
Cross-modal source information and spoken word recognition.
J Exp Psychol Hum Percept Perform. 2004 Apr;30(2):378-96. doi: 10.1037/0096-1523.30.2.378.
5
"Putting the face to the voice": matching identity across modality.
Curr Biol. 2003 Sep 30;13(19):1709-14. doi: 10.1016/j.cub.2003.09.005.
7
Visual speech information for face recognition.
Percept Psychophys. 2002 Feb;64(2):220-9. doi: 10.3758/bf03195788.
8
Chimaeric sounds reveal dichotomies in auditory perception.
Nature. 2002 Mar 7;416(6876):87-90. doi: 10.1038/416087a.
9
The effect of speechreading on masked detection thresholds for filtered speech.
J Acoust Soc Am. 2001 May;109(5 Pt 1):2272-5. doi: 10.1121/1.1362687.
10
Infants' perception of the audible, visible, and bimodal attributes of multimodal syllables.
Child Dev. 2000 Sep-Oct;71(5):1241-57. doi: 10.1111/1467-8624.00226.
