Cheung Vanessa, Maier Maximilian, Lieder Falk
Department of Experimental Psychology, University College London, London WC1H 0AP, United Kingdom.
Department of Psychology, University of California, Los Angeles, CA 90095.
Proc Natl Acad Sci U S A. 2025 Jun 24;122(25):e2412015122. doi: 10.1073/pnas.2412015122. Epub 2025 Jun 20.
As large language models (LLMs) become more widely used, people increasingly rely on them to make or advise on moral decisions. Some researchers even propose using LLMs as participants in psychology experiments. It is, therefore, important to understand how well LLMs make moral decisions and how they compare to humans. We investigated these questions by asking a range of LLMs to emulate or advise on people's decisions in realistic moral dilemmas. In Study 1, we compared LLM responses to those of a representative U.S. sample (n = 285) for 22 dilemmas, including both collective action problems that pitted self-interest against the greater good, and moral dilemmas that pitted utilitarian cost-benefit reasoning against deontological rules. In collective action problems, LLMs were more altruistic than participants. In moral dilemmas, LLMs exhibited stronger omission bias than participants: They usually endorsed inaction over action. In Study 2 (n = 474, preregistered), we replicated this omission bias and documented an additional bias: Unlike humans, most LLMs were biased toward answering "no" in moral dilemmas, thus flipping their decision/advice depending on how the question is worded. In Study 3 (n = 491, preregistered), we replicated these biases in LLMs using everyday moral dilemmas adapted from forum posts on Reddit. In Study 4, we investigated the sources of these biases by comparing models with and without fine-tuning, showing that they likely arise from fine-tuning models for chatbot applications. Our findings suggest that uncritical reliance on LLMs' moral decisions and advice could amplify human biases and introduce potentially problematic biases.