Kakouros Sofoklis, Räsänen Okko
Department of Signal Processing and Acoustics, Aalto University.
Cogn Sci. 2016 Sep;40(7):1739-1774. doi: 10.1111/cogs.12306. Epub 2015 Oct 20.
Numerous studies have examined the acoustic correlates of sentential stress and its underlying linguistic functionality. However, the mechanism that connects stress cues to the listener's attentional processing has remained unclear. Also, the learnability versus innateness of stress perception has not been widely discussed. In this work, we introduce a novel perspective to the study of sentential stress and put forward the hypothesis that perceived sentence stress in speech is related to the unpredictability of prosodic features, thereby capturing the attention of the listener. As predictability is based on the statistical structure of the speech input, the hypothesis also suggests that stress perception is a result of general statistical learning mechanisms. To study this idea, computational simulations are performed where temporal prosodic trajectories are modeled with an n-gram model. Probabilities of the feature trajectories are subsequently evaluated on a set of novel utterances and compared to human perception of stress. The results show that the low-probability regions of F0 and energy trajectories are strongly correlated with stress perception, giving support to the idea that attention and unpredictability of sensory stimulus are mutually connected.
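The modeling step described in the abstract (an n-gram model over prosodic trajectories, with low-probability regions taken as candidates for perceived stress) can be illustrated with a minimal sketch. The abstract does not specify how features are discretized, which n-gram order is used, or how probabilities are smoothed, so everything below is an assumption made for illustration: uniform quantization of a feature such as F0 into a fixed number of bins, add-one smoothing, and the hypothetical names `quantize` and `NGramModel`. It is not the authors' implementation.

```python
import math
from collections import Counter


def quantize(values, lo, hi, n_bins):
    """Map continuous feature values (e.g. F0 in Hz) to integer bin indices.

    Uniform binning is an assumption; values outside [lo, hi) are clipped.
    """
    width = (hi - lo) / n_bins
    bins = []
    for v in values:
        idx = int((v - lo) / width)
        bins.append(min(max(idx, 0), n_bins - 1))
    return bins


class NGramModel:
    """Hypothetical n-gram model over quantized feature symbols."""

    def __init__(self, n, n_symbols):
        self.n = n                      # n-gram order (2 = bigram)
        self.n_symbols = n_symbols      # size of the quantized alphabet
        self.ngram_counts = Counter()   # counts of (context + symbol)
        self.context_counts = Counter() # counts of context alone

    def fit(self, sequences):
        """Accumulate n-gram counts from training trajectories."""
        for seq in sequences:
            for i in range(self.n - 1, len(seq)):
                context = tuple(seq[i - self.n + 1:i])
                self.ngram_counts[context + (seq[i],)] += 1
                self.context_counts[context] += 1

    def prob(self, context, symbol):
        """P(symbol | context) with add-one smoothing (an assumption)."""
        context = tuple(context)
        num = self.ngram_counts[context + (symbol,)] + 1
        den = self.context_counts[context] + self.n_symbols
        return num / den

    def surprisal(self, seq):
        """Return -log2 P(symbol | context) for each position with a full context.

        High values mark low-probability (unpredictable) regions.
        """
        return [
            -math.log2(self.prob(seq[i - self.n + 1:i], seq[i]))
            for i in range(self.n - 1, len(seq))
        ]
```

In use, one would call `fit` on quantized trajectories from training utterances, then call `surprisal` on a novel utterance and compare the high-surprisal regions with listeners' stress judgments. The per-frame surprisal here is only one way to locate "low-probability regions"; the paper may aggregate probabilities differently.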