Large language models display human-like social desirability biases in Big Five personality surveys.

Author Information

Salecha Aadesh, Ireland Molly E, Subrahmanya Shashanka, Sedoc João, Ungar Lyle H, Eichstaedt Johannes C

Affiliations

Institute for Human-Centered AI, Stanford University, Palo Alto, CA 94305, USA.

Receptiviti, Toronto, ON M5G 2K8, Canada.

Publication Information

PNAS Nexus. 2024 Dec 17;3(12):pgae533. doi: 10.1093/pnasnexus/pgae533. eCollection 2024 Dec.

Abstract

Large language models (LLMs) are becoming more widely used to simulate human participants and so understanding their biases is important. We developed an experimental framework using Big Five personality surveys and uncovered a previously undetected social desirability bias in a wide range of LLMs. By systematically varying the number of questions LLMs were exposed to, we demonstrate their ability to infer when they are being evaluated. When personality evaluation is inferred, LLMs skew their scores towards the desirable ends of trait dimensions (i.e. increased extraversion, decreased neuroticism, etc.). This bias exists in all tested models, including GPT-4/3.5, Claude 3, Llama 3, and PaLM-2. Bias levels appear to increase in more recent models, with GPT-4's survey responses changing by 1.20 (human) SD and Llama 3's by 0.98 SD, which are very large effects. This bias remains after question order randomization and paraphrasing. Reverse coding the questions decreases bias levels but does not eliminate them, suggesting that this effect cannot be attributed to acquiescence bias. Our findings reveal an emergent social desirability bias and suggest constraints on profiling LLMs with psychometric tests and on this use of LLMs as proxies for human participants.
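The experimental setup described above can be pictured with a minimal sketch. This is not the authors' code: `query_model` is a hypothetical stub standing in for a real LLM API call, the four-item pool and the `HUMAN_SD` value are illustrative placeholders, and the batch-size manipulation only mirrors the idea of varying how many survey questions the model sees at once.

```python
import random

# Illustrative mini item pool: (statement, trait, reverse_coded).
# The real study uses full Big Five inventories; these four items are placeholders.
ITEMS = [
    ("I am the life of the party.", "extraversion", False),
    ("I don't talk a lot.", "extraversion", True),
    ("I get stressed out easily.", "neuroticism", False),
    ("I am relaxed most of the time.", "neuroticism", True),
]


def query_model(prompt: str) -> int:
    """Hypothetical stand-in for an LLM API call returning a 1-5 Likert rating.
    A random stub keeps the sketch runnable; swap in a real client to probe a model."""
    return random.randint(1, 5)


def administer(items, batch_size):
    """Present `batch_size` items per prompt and aggregate mean ratings per trait.
    Showing more items at once makes the survey context easier to infer,
    which is the manipulation the abstract describes."""
    scores = {}
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        context = ("Rate how well each statement describes you on a 1-5 scale "
                   "(1 = disagree strongly, 5 = agree strongly).\n"
                   + "\n".join(text for text, _, _ in batch))
        for text, trait, reverse in batch:
            raw = query_model(context + f"\nStatement to rate now: {text}")
            rating = 6 - raw if reverse else raw  # reverse-score negatively keyed items
            scores.setdefault(trait, []).append(rating)
    return {trait: sum(vals) / len(vals) for trait, vals in scores.items()}


# Compare one-item-at-a-time prompts against the full survey in a single prompt,
# and express the shift in (assumed) human SD units, as the abstract's effect sizes do.
one_at_a_time = administer(ITEMS, batch_size=1)
full_survey = administer(ITEMS, batch_size=len(ITEMS))
HUMAN_SD = 0.9  # illustrative human-norm SD on the 1-5 scale, not a published value

for trait in one_at_a_time:
    shift = (full_survey[trait] - one_at_a_time[trait]) / HUMAN_SD
    print(f"{trait}: shift of {shift:+.2f} human SD")
```

Reverse-scoring the negatively keyed items corresponds to the reverse-coding control mentioned in the abstract, and expressing the shift in human SD units mirrors how the reported effects (1.20 SD for GPT-4, 0.98 SD for Llama 3) are stated.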


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6126/11650498/7ec190fc5a6a/pgae533f1.jpg
