Department of Experimental Psychology, University College London, 26 Bedford Way, London, WC1H 0AP, UK.
Behav Res Methods. 2024 Aug;56(5):4786-4801. doi: 10.3758/s13428-023-02216-z. Epub 2023 Aug 21.
Mouth and facial movements are part and parcel of face-to-face communication. Their role in speech perception has primarily been assessed either by manipulating their presence (e.g., by blurring the area of a speaker's lips) or by examining how informative different mouth patterns are for the corresponding phonemes (or visemes; e.g., /b/ is visually more salient than /g/). However, moving beyond the informativeness of single phonemes is challenging because of coarticulation and language variation, to name just two factors. Here, we present mouth and facial informativeness (MaFI) for words, i.e., how visually informative words are based on their corresponding mouth and facial movements. MaFI was quantified for 2276 English words, varying in length, frequency, and age of acquisition, using the phonological distance between a word and participants' speechreading guesses. The results showed that the MaFI norms capture well the dynamic nature of mouth and facial movements per word: words containing phonemes with roundness and frontness features, as well as visemes characterized by lower lip tuck, lip rounding, and lip closure, are visually more informative. We also showed that the more of these features a word contains, the more informative it is based on mouth and facial movements. Finally, we demonstrated that the MaFI norms generalize across different variants of the English language. The norms are freely accessible via the Open Science Framework (https://osf.io/mna8j/) and can benefit any language researcher using audiovisual stimuli (e.g., to control for the effect of speech-linked mouth and facial movements).
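To make the idea of scoring a word by the phonological distance between it and participants' speechreading guesses more concrete, the following is a minimal Python sketch. It is an illustration only, not the procedure used to build the published norms: it assumes words are already transcribed as lists of phoneme symbols, it substitutes a plain length-normalized edit distance for whatever phonological distance measure the authors applied, and it adopts the hypothetical convention that a score closer to zero means a more informative word. The function names and the example transcriptions are invented for this sketch.

```python
from statistics import mean


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences (lists of symbols)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(target, guess):
    """Edit distance scaled to [0, 1] by the longer of the two sequences."""
    longest = max(len(target), len(guess))
    if longest == 0:
        return 0.0
    return edit_distance(target, guess) / longest


def informativeness_score(target, guesses):
    """Hypothetical word-level score: negative mean distance over all guesses.

    Assumption of this sketch: 0 means every guess matched the target,
    and more negative values mean the word was harder to speechread.
    """
    return -mean(normalized_distance(target, g) for g in guesses)


# Invented example: the target word and three speechreading guesses,
# written in an ARPAbet-like notation chosen only for illustration.
target = ["B", "AO", "L"]
guesses = [["B", "AO", "L"], ["P", "AO", "L"], ["M", "AO", "L"]]
score = informativeness_score(target, guesses)
print(round(score, 3))
```

In this toy case the two incorrect guesses differ from the target only in the initial bilabial consonant, so the score stays close to zero. A feature-weighted distance would treat such visually similar substitutions as even cheaper, which is one reason a plain edit distance should be read as a stand-in here.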