School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK.
Department of Psychology, New York University, New York, NY, USA.
Behav Res Methods. 2019 Aug;51(4):1706-1716. doi: 10.3758/s13428-019-01201-9.
With the explosion of "big data," digital repositories of texts and images are growing rapidly. These datasets present new opportunities for psychological research, but new methodologies are needed before researchers can use them to gain insight into human cognition. We present a method that lets psychological researchers take advantage of text and image databases: a procedure for measuring human categorical representations over large datasets of items, such as arbitrary words or pictures. We call this method discrete Markov chain Monte Carlo with people (d-MCMCP). We illustrate the method by measuring three kinds of categories: emotions as represented by facial images, moral concepts as represented by relevant words, and seasons as represented by images drawn from large online databases. Three experiments demonstrate that d-MCMCP is powerful and flexible enough to work with complex, naturalistic stimuli drawn from large online databases.
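The abstract names the method but does not spell out its steps, so the following is only a minimal sketch of the general idea behind Markov chain Monte Carlo with people over a discrete item set, not the authors' procedure. It assumes that each trial shows the current item and one proposed item, and that the chain moves to whichever item is chosen. The human participant is replaced by a hypothetical simulated chooser (`simulated_choice`) that follows a ratio choice rule over made-up category weights; the uniform proposal and all names here are likewise assumptions for illustration.

```python
import random
from collections import Counter


def simulated_choice(current, proposal, weights, rng):
    """Hypothetical stand-in for one human trial.

    Chooses between two items with probability proportional to each
    item's (assumed, strictly positive) category weight.
    """
    w_cur, w_prop = weights[current], weights[proposal]
    p_prop = w_prop / (w_cur + w_prop)
    return proposal if rng.random() < p_prop else current


def run_chain(items, weights, n_trials, seed=0):
    """Run one chain over a discrete item set and count visits per item."""
    rng = random.Random(seed)
    state = rng.choice(items)
    visits = Counter()
    for _ in range(n_trials):
        # Assumed symmetric proposal: uniform over all items.
        proposal = rng.choice(items)
        # The chooser's decision plays the role of the acceptance step.
        state = simulated_choice(state, proposal, weights, rng)
        visits[state] += 1
    return visits
```

Under these assumptions the proposal is symmetric and the choice rule satisfies detailed balance, so the share of trials spent on each item approaches that item's normalized weight. For example, `run_chain(["a", "b", "c"], {"a": 1.0, "b": 2.0, "c": 7.0}, 20000)` should visit `"c"` far more often than `"a"`. With real participants the weights are unknown, and the visit counts are what estimate the category representation.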