Macmillan-Scott Olivia, Musolesi Mirco
Department of Computer Science, University College London, London, UK.
Department of Computer Science and Engineering, University of Bologna, Bologna, Italy.
R Soc Open Sci. 2024 Jun 5;11(6):240255. doi: 10.1098/rsos.240255. eCollection 2024 Jun.
Do large language models (LLMs) display rational reasoning? LLMs have been shown to contain human biases due to the data on which they are trained; whether this carries over into rational reasoning remains less clear. In this paper, we answer this question by evaluating seven language models on tasks from the cognitive psychology literature. We find that, like humans, LLMs display irrationality in these tasks. However, this irrationality is not displayed in the same way as in humans: when LLMs give incorrect answers to these tasks, the answers are often wrong in ways that differ from human-like biases. On top of this, the LLMs reveal an additional layer of irrationality in the significant inconsistency of their responses. Beyond the experimental results, this paper seeks to make a methodological contribution by showing how we can assess and compare different capabilities of these types of models, in this case with respect to rational reasoning.