The spatial arrangement method of measuring similarity can capture high-dimensional semantic structures.

Affiliations

University of Pennsylvania, 3720 Walnut St, Philadelphia, PA, 19104, USA.

New Mexico State University, Las Cruces, NM, USA.

Publication information

Behav Res Methods. 2020 Oct;52(5):1906-1928. doi: 10.3758/s13428-020-01362-y.

Abstract

Psychologists collect similarity data to study a variety of phenomena including categorization, generalization and discrimination, and representation itself. However, collecting similarity judgments between all pairs of items in a set is expensive, spurring development of techniques like the Spatial Arrangement Method (SpAM; Goldstone, Behavior Research Methods, Instruments, & Computers, 26, 381-386, 1994), wherein participants place items on a two-dimensional plane such that proximity reflects perceived similarity. While SpAM greatly hastens similarity measurement, and has been successfully used for lower-dimensional, perceptual stimuli, its suitability for higher-dimensional, conceptual stimuli is less understood. In study 1, we evaluated the ability of SpAM to capture the semantic structure of eight different categories composed of 20-30 words each. First, SpAM distances correlated strongly (r = .71) with pairwise similarity judgments, although below SpAM and pairwise judgment split-half reliabilities (r's > .9). Second, a cross-validation exercise with multidimensional scaling fits at increasing latent dimensionalities suggested that aggregated SpAM data favored higher (> 2) dimensional solutions for seven of the eight categories explored here. Third, split-half reliability of SpAM dissimilarities was high (Pearson r = .90), while the average correlation between pairs of participants was low (r = .15), suggesting that when different participants focus on different pairs of stimulus dimensions, reliable high-dimensional aggregate similarity data is recoverable. In study 2, we show that SpAM can recover the Big Five factor space of personality trait adjectives, and that cross-validation favors a four- or five-dimension solution on this dataset. We conclude that SpAM is an accurate and reliable method of measuring similarity for high-dimensional items like words. We publicly release our data for researchers.
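The third finding (high split-half reliability alongside low inter-participant correlation) can be illustrated with a small simulation. The sketch below is not the authors' analysis code and uses no data from the paper: the latent item space, the number of dimensions, the noise level, and all function names are assumptions chosen for illustration. Each simulated participant arranges items using only two randomly chosen latent dimensions, and the script compares the split-half reliability of aggregated distances with the mean correlation between individual participants.

```python
import numpy as np


def arrangement_distances(coords):
    """Return the Euclidean distances between all item pairs in one 2D arrangement,
    as a flat vector over the upper triangle of the distance matrix."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(coords), k=1)
    return dist[upper]


def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    return float(np.corrcoef(x, y)[0, 1])


def split_half_reliability(dist_rows, rng):
    """Randomly split participants into two halves, average each half's
    distance vectors, and correlate the two averages."""
    order = rng.permutation(len(dist_rows))
    half = len(order) // 2
    first = dist_rows[order[:half]].mean(axis=0)
    second = dist_rows[order[half:]].mean(axis=0)
    return pearson(first, second)


def mean_pairwise_correlation(dist_rows):
    """Average Pearson correlation over all pairs of individual participants."""
    corr = np.corrcoef(dist_rows)  # rows are participants
    upper = np.triu_indices(len(dist_rows), k=1)
    return float(corr[upper].mean())


def simulate(n_items=25, n_dims=6, n_participants=40, seed=0):
    """Hypothetical setup: items live in an n_dims latent space, and each
    participant places them on a plane using two randomly chosen dimensions
    plus a little placement noise."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_items, n_dims))
    rows = []
    for _ in range(n_participants):
        dims = rng.choice(n_dims, size=2, replace=False)
        coords = latent[:, dims] + rng.normal(scale=0.1, size=(n_items, 2))
        rows.append(arrangement_distances(coords))
    rows = np.vstack(rows)
    return split_half_reliability(rows, rng), mean_pairwise_correlation(rows)


split_half, mean_pair = simulate()
print(f"split-half r = {split_half:.2f}, mean inter-participant r = {mean_pair:.2f}")
```

In this toy setup, individual arrangements agree only where participants happen to share a dimension, while averaging over many participants pools information across all dimension pairs. That is one way the pattern described in the abstract could arise. The specific values printed depend entirely on the assumed parameters.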
