School of Foreign Studies, Minzu University of China, No. 27 Zhongguancun South Avenue, Beijing, 100081, People's Republic of China.
Department of Philosophy, King's College London, Strand, London, WC2R 2LS, UK.
J Psycholinguist Res. 2022 Aug;51(4):917-931. doi: 10.1007/s10936-022-09872-w. Epub 2022 Mar 29.
Distributional information plays an important role in word categorization. In this paper, we present a novel distributional model, the distributional lattice, for discovering syntactic categories in child-directed speech. A distributional lattice is a hierarchy formed by closed sets of words that are distributionally similar. Such a hierarchy is potentially useful for capturing syntactic categories because it clusters words together with the patterns they occur in. To support empirically the suggestion that the distributional lattice is effective at categorizing words, we present a distributional lattice analysis of the Brent corpus of child-directed speech. The results show that distributional lattices are able to yield extremely accurate syntactic categories.
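To make the notion of "closed sets of words" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a word's distribution is the set of (previous word, next word) frames it occurs in, and that a set of words is closed when it contains every word sharing all the frames common to its members (a Galois-style closure, as in formal concept analysis). The toy data, the frame definition, and all function names are hypothetical; the abstract does not specify them.

```python
from itertools import combinations

# Hypothetical toy data: each word maps to the contexts it was observed in.
# Treating a context as a (previous word, next word) frame is an assumption.
CONTEXTS = {
    "dog":  {("the", "is"), ("a", "is")},
    "cat":  {("the", "is"), ("a", "is")},
    "ball": {("the", "is")},
    "run":  {("you", "fast")},
    "jump": {("you", "fast")},
}


def shared_contexts(words):
    """Return the contexts common to every word in `words`."""
    sets = [CONTEXTS[w] for w in words]
    if not sets:
        # By convention, the empty set of words shares all contexts.
        return frozenset().union(*CONTEXTS.values())
    return frozenset(set.intersection(*sets))


def words_with(contexts):
    """Return every word that occurs in all of the given contexts."""
    return frozenset(w for w, cs in CONTEXTS.items() if contexts <= cs)


def closure(words):
    """Smallest closed set containing `words`."""
    return words_with(shared_contexts(words))


def closed_sets():
    """Enumerate all closed sets by brute force (exponential; toy sizes only)."""
    vocab = sorted(CONTEXTS)
    found = set()
    for k in range(len(vocab) + 1):
        for subset in combinations(vocab, k):
            found.add(closure(subset))
    return found


if __name__ == "__main__":
    # Ordered by size, the closed sets form the levels of the hierarchy.
    for node in sorted(closed_sets(), key=lambda s: (len(s), sorted(s))):
        print(sorted(node))
```

On this toy input the closed sets, ordered by inclusion, form a small hierarchy in which noun-like and verb-like words fall into separate nodes. A real analysis would need a corpus-derived context table and a more efficient enumeration than this brute-force loop.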