Rubin Timothy N, Kievit-Kylar Brent, Willits Jon A, Jones Michael N
CogSci. 2014;2014:1329-1334.
Semantic models play an important role in cognitive science. These models use statistical learning to model word meanings from co-occurrences in text corpora. A wide variety of semantic models have been proposed, and the literature has typically emphasized situations in which one model outperforms another. However, because these models often vary with respect to multiple sub-processes (e.g., their normalization or dimensionality-reduction methods), it can be difficult to delineate which of these processes are responsible for observed performance differences. Furthermore, the fact that any two models may vary along multiple dimensions makes it difficult to understand where these models fall within the space of possible psychological theories. In this paper, we propose a general framework for organizing the space of semantic models. We then illustrate how this framework can be used to understand model comparisons in terms of individual manipulations of sub-processes. Using several artificial datasets, we show how both representational structure and dimensionality reduction influence a model's ability to capture different types of word relationships.
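As a concrete illustration of the sub-process decomposition the abstract describes, the sketch below treats a semantic model as a pipeline of co-occurrence counting, a normalization step, and a dimensionality-reduction step. This is not the authors' implementation; the toy corpus, window size, PPMI normalization, and truncated SVD are illustrative assumptions, chosen so that swapping any one function changes the model along exactly one sub-process.

```python
# Minimal sketch (assumptions: toy corpus, window=2, PPMI, truncated SVD)
# of decomposing a distributional semantic model into sub-processes.
import numpy as np

def count_cooccurrences(sentences, window=2):
    """Word-by-word co-occurrence counts within a symmetric window."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for i, w in enumerate(s):
            for j in range(max(0, i - window), min(len(s), i + window + 1)):
                if j != i:
                    counts[index[w], index[s[j]]] += 1
    return vocab, counts

def ppmi(counts):
    """One possible normalization sub-process: positive pointwise mutual information."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log((counts * total) / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0  # zero out log(0) and 0/0 cells
    return np.maximum(pmi, 0.0)

def reduce_svd(matrix, k=2):
    """One possible dimensionality-reduction sub-process: truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

# Toy corpus; replacing ppmi or reduce_svd with an alternative varies the
# model along a single sub-process, the kind of controlled comparison the
# proposed framework is meant to support.
sentences = [["the", "dog", "chased", "the", "cat"],
             ["the", "cat", "chased", "the", "mouse"]]
vocab, counts = count_cooccurrences(sentences)
vectors = reduce_svd(ppmi(counts), k=2)
for w, v in zip(vocab, vectors):
    print(w, np.round(v, 2))
```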