Business School, University of Mannheim.
GESIS-Leibniz Institute for the Social Sciences.
Perspect Psychol Sci. 2024 Sep;19(5):808-826. doi: 10.1177/17456916231214460. Epub 2024 Jan 2.
We illustrate how standard psychometric inventories originally designed for assessing noncognitive human traits can be repurposed as diagnostic tools to evaluate analogous traits in large language models (LLMs). We start from the assumption that LLMs, inadvertently yet inevitably, acquire psychological traits (metaphorically speaking) from the vast text corpora on which they are trained. Such corpora contain sediments of the personalities, values, beliefs, and biases of the countless human authors of these texts, which LLMs learn through a complex training process. The traits that LLMs acquire in such a way can potentially influence their behavior, that is, their outputs in downstream tasks and applications in which they are employed, which in turn may have real-world consequences for individuals and social groups. By eliciting LLMs' responses to language-based psychometric inventories, we can bring their traits to light. Psychometric profiling enables researchers to study and compare LLMs in terms of noncognitive characteristics, thereby providing a window into the personalities, values, beliefs, and biases these models exhibit (or mimic). We discuss the history of similar ideas and outline possible psychometric approaches for LLMs. We demonstrate one promising approach, zero-shot classification, for several LLMs and psychometric inventories. We conclude by highlighting open challenges and future avenues of research for AI Psychometrics.
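As a rough illustration of what scoring an inventory via zero-shot classification could look like, the sketch below treats each inventory item as text to classify against the labels "agree" and "disagree", flips reverse-keyed items, and averages the scores per trait. This is a minimal sketch under stated assumptions, not the authors' implementation: the `Item` structure, the `profile` function, the two example items, and the `toy_classifier` stand-in are all hypothetical. In practice the stand-in would be replaced by a real zero-shot classifier (for example, one based on a natural language inference model) that returns a probability per label.

```python
from collections import defaultdict
from typing import Callable, Dict, List, NamedTuple

# A classifier takes a text and candidate labels and returns one probability per label.
Classifier = Callable[[str, List[str]], Dict[str, float]]

LABELS = ["agree", "disagree"]


class Item(NamedTuple):
    """One inventory item (hypothetical structure for this sketch)."""
    text: str
    trait: str
    reverse_keyed: bool = False


def toy_classifier(text: str, labels: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a real zero-shot model.

    It applies a fixed keyword rule so the example runs without any
    model download; it says nothing about how a real LLM would respond.
    """
    p_agree = 0.8 if "talkative" in text else 0.3
    return {"agree": p_agree, "disagree": 1.0 - p_agree}


def profile(items: List[Item], classify: Classifier) -> Dict[str, float]:
    """Average the per-item agreement scores for each trait."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for item in items:
        probs = classify(item.text, LABELS)
        score = probs["agree"]
        if item.reverse_keyed:
            # Agreement with a reverse-keyed item counts against the trait.
            score = 1.0 - score
        scores[item.trait].append(score)
    return {trait: sum(vals) / len(vals) for trait, vals in scores.items()}


# Illustrative items only; they are not taken from a specific published inventory.
ITEMS = [
    Item("I am talkative.", "extraversion"),
    Item("I tend to be quiet.", "extraversion", reverse_keyed=True),
]

result = profile(ITEMS, toy_classifier)
print(result)
```

Keeping the classifier behind a plain callable means the same scoring code can be reused across several models, which is the kind of comparison the abstract describes.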