Department of Psychology, University of Southern California.
Department of Computer Science, Stony Brook University.
J Pers Soc Psychol. 2024 Feb;126(2):312-331. doi: 10.1037/pspp0000480. Epub 2023 Sep 7.
Traditional methods of personality assessment, and survey-based research in general, cannot make inferences about new items that have not been surveyed previously. This limits the amount of information that can be obtained from a given survey. In this article, we tackle this problem by leveraging recent advances in statistical natural language processing. Specifically, we extract "embedding" representations of questionnaire items from deep neural networks, trained on large-scale English language data. These embeddings allow us to construct a high-dimensional space of items, in which linguistically similar items are located near each other. We combine item embeddings with machine learning algorithms to extrapolate participant ratings of personality items to completely new items that have not been rated by any participants. The accuracy of our approach is on par with incentivized human judges given an identical task, indicating that it predicts ratings of new personality items as accurately as people do. Our approach is also capable of identifying psychological constructs associated with questionnaire items and can accurately cluster items into their constructs based only on their language content. Overall, our results show how representations of linguistic personality descriptors obtained from deep language models can be used to model and predict a large variety of traits, scales, and constructs. In doing so, they showcase a new scalable and cost-effective method for psychological measurement. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
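The extrapolation step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the machine learning algorithm or the language model used, so this example swaps in a similarity-weighted nearest-neighbor predictor as an assumption. The three-dimensional vectors, the item wordings, the ratings, and the function names `cosine` and `predict_rating` are all hypothetical. In practice the embeddings would come from a deep language model and would have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def predict_rating(new_vec, rated_vecs, ratings, k=2):
    """Predict one participant's rating of an unrated item as the
    similarity-weighted mean of that participant's ratings of the
    k most similar rated items (an assumed stand-in for the paper's model)."""
    scored = sorted(
        ((cosine(new_vec, v), r) for v, r in zip(rated_vecs, ratings)),
        reverse=True,
    )[:k]
    # Negative similarities are clipped so they cannot flip the sign of a weight.
    weights = [max(sim, 0.0) for sim, _ in scored]
    total = sum(weights)
    if total == 0.0:
        # No similar item found: fall back to the participant's mean rating.
        return sum(ratings) / len(ratings)
    return sum(w * r for w, (_, r) in zip(weights, scored)) / total

# Hypothetical toy embeddings (real ones would come from a language model).
rated_items = {
    "I am the life of the party.": [0.9, 0.1, 0.0],
    "I talk to a lot of different people at parties.": [0.8, 0.2, 0.1],
    "I worry about things.": [0.0, 0.9, 0.3],
}
# Hypothetical ratings from a single participant on a 1-5 scale.
participant_ratings = [4.0, 5.0, 2.0]

# A new item that this participant never rated.
new_item_vec = [0.85, 0.15, 0.05]

prediction = predict_rating(
    new_item_vec, list(rated_items.values()), participant_ratings, k=2
)
print(f"Predicted rating for the new item: {prediction:.2f}")
```

Because the new item lies close to the two sociability items in this toy space, its predicted rating falls between the participant's ratings of those two items and ignores the unrelated worry item. A regression model fit on the full embedding vectors would be a natural alternative to the nearest-neighbor rule used here.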