Laboratoire Vision Action Cognition, Université Paris Descartes, Paris Sorbonne Cité, France.
Behav Res Methods. 2013 Sep;45(3):758-64. doi: 10.3758/s13428-012-0285-y.
This article describes a database of diphone positional frequencies in French. Specifically, we provide frequencies for word-initial, word-internal, and word-final diphones for all words in a 50-million-word subtitle corpus drawn from movie and TV series dialogue. We also provide intra- and intersyllable diphone frequencies, as well as interword diphone frequencies. To our knowledge, no other such tool is available to psycholinguists for the study of French sequential probabilities. This database and its new indicators should help researchers conducting new studies on speech segmentation.
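To make the notion of positional diphone frequency concrete, here is a minimal Python sketch of how word-initial, word-internal, and word-final counts could be tallied from a phonemically transcribed lexicon. It is an illustration under stated assumptions, not the procedure used to build the database: the function name `positional_diphone_counts`, the toy transcriptions, and the toy frequencies are all hypothetical, and the choice to count the single diphone of a two-phoneme word as both initial and final is an assumption of this sketch.

```python
from collections import Counter


def positional_diphone_counts(lexicon):
    """Tally diphones by position, weighting each by its word's frequency.

    lexicon: iterable of (phonemes, frequency) pairs, where phonemes is a
    tuple of phoneme symbols for one word (hypothetical input format).
    """
    counts = {"initial": Counter(), "internal": Counter(), "final": Counter()}
    for phonemes, freq in lexicon:
        # Adjacent phoneme pairs within the word.
        diphones = list(zip(phonemes, phonemes[1:]))
        last = len(diphones) - 1
        for i, diphone in enumerate(diphones):
            if i == 0:
                counts["initial"][diphone] += freq
            if i == last:
                # Assumption: in a two-phoneme word the only diphone
                # is counted as both initial and final.
                counts["final"][diphone] += freq
            if 0 < i < last:
                counts["internal"][diphone] += freq
    return counts


# Toy data: made-up transcriptions and frequencies, for illustration only.
toy_lexicon = [
    (("p", "a", "R", "l", "e"), 3),
    (("p", "a"), 5),
]
counts = positional_diphone_counts(toy_lexicon)
```

With this toy lexicon, the diphone `("p", "a")` accumulates a word-initial count from both words, while `("a", "R")` and `("R", "l")` are counted only as word-internal. Intersyllable and interword frequencies would additionally require syllable boundaries and running text, which this sketch does not model.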