Hornsby Adam N, Evans Thomas, Riefer Peter S, Prior Rosie, Love Bradley C
1 University College London, London, UK.
2 dunnhumby, 184 Shepherds Bush Road, London, W6 7NL, UK.
Comput Brain Behav. 2020;3(2):162-173. doi: 10.1007/s42113-019-00064-9. Epub 2019 Oct 7.
Computational models using text corpora have proved useful in understanding the nature of language and human concepts. One appeal of this work is that text, such as from newspaper articles, should reflect human behaviour and conceptual organization outside the laboratory. However, texts do not directly reflect human activity, but instead serve a communicative function and are highly curated or edited to suit an audience. Here, we apply methods devised for text to a data source that directly reflects thousands of individuals' activity patterns. Using product co-occurrence data from nearly 1.3 million supermarket shopping baskets, we trained a topic model to learn 25 high-level concepts (or topics). These topics were found to be comprehensible and coherent by both retail experts and consumers. The topics indicated that human concepts are primarily organized around goals and interactions (e.g. tomatoes go well with vegetables in a salad), rather than their intrinsic features (e.g. defining a tomato by the fact that it has seeds and is fleshy). These results are consistent with the notion that human conceptual knowledge is tailored to support action. Individual differences in the topics sampled predicted basic demographic characteristics. Our findings suggest that human activity patterns can reveal conceptual organization and may give rise to it.
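The topic-modelling step described in the abstract can be illustrated with a small sketch in which each basket plays the role of a document and each product the role of a word. The abstract does not say which topic model or inference procedure was used, so everything here is an assumption: latent Dirichlet allocation fitted by collapsed Gibbs sampling, the hypothetical helper names `fit_lda` and `top_products`, the hyperparameters `alpha` and `beta`, and a made-up toy set of six baskets with 2 topics rather than the 25 learned in the study.

```python
import random


def fit_lda(baskets, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (baskets = documents, products = words)."""
    rng = random.Random(seed)
    vocab = sorted({p for b in baskets for p in b})
    index = {p: i for i, p in enumerate(vocab)}
    n_products = len(vocab)
    docs = [[index[p] for p in b] for b in baskets]

    # Count tables updated throughout sampling.
    doc_topic = [[0] * n_topics for _ in docs]               # basket x topic
    topic_word = [[0] * n_products for _ in range(n_topics)]  # topic x product
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every purchased item.
    assign = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the item's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic given all other items.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_products * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return vocab, doc_topic, topic_word


def top_products(vocab, topic_word, n=3):
    """Return the n most frequently assigned products for each topic."""
    result = []
    for counts in topic_word:
        order = sorted(range(len(vocab)), key=lambda w: -counts[w])
        result.append([vocab[w] for w in order[:n]])
    return result


# Hypothetical toy baskets, invented for illustration only.
baskets = [
    ["tomato", "lettuce", "cucumber", "olive oil"],
    ["lettuce", "cucumber", "tomato"],
    ["tomato", "olive oil", "lettuce"],
    ["flour", "butter", "sugar", "eggs"],
    ["sugar", "eggs", "flour"],
    ["butter", "flour", "eggs"],
]

vocab, doc_topic, topic_word = fit_lda(baskets, n_topics=2)
for k, products in enumerate(top_products(vocab, topic_word)):
    print(f"topic {k}: {products}")
```

The rows of `doc_topic` give the number of items in each basket assigned to each topic. Aggregated per shopper, counts of this kind are one plausible form for the individual differences in topics sampled that the abstract relates to demographic characteristics, though the study's actual features are not specified here.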