Take caution in using LLMs as human surrogates.

Authors

Gao Yuan, Lee Dokyun, Burtch Gordon, Fazelpour Sina

Affiliations

Department of Information Systems, Questrom School of Business, Boston University, Boston, MA 02215.

Faculty of Computing and Data Sciences, Boston University, Boston, MA 02215.

Publication

Proc Natl Acad Sci U S A. 2025 Jun 17;122(24):e2501660122. doi: 10.1073/pnas.2501660122. Epub 2025 Jun 13.

Abstract

Recent studies suggest large language models (LLMs) can generate human-like responses, aligning with human behavior in economic experiments, surveys, and political discourse. This has led many to propose that LLMs can be used as surrogates or simulations for humans in social science research. However, LLMs differ fundamentally from humans, relying on probabilistic patterns, absent the embodied experiences or survival objectives that shape human cognition. We assess the reasoning depth of LLMs using the 11-20 money request game. Nearly all advanced approaches fail to replicate human behavior distributions across many models. The causes of failure are diverse and unpredictable, relating to input language, roles, safeguarding, and more. These results warrant caution in using LLMs as surrogates or for simulating human behavior in research.
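
For readers unfamiliar with the task, the 11-20 money request game (Arad and Rubinstein, 2012) has a simple payoff rule that makes depth of strategic reasoning observable from a single choice. The sketch below illustrates that standard rule only; it is an assumed reconstruction of the task setup and is not the authors' experimental code, prompts, or analysis.

```python
# Minimal sketch of the standard 11-20 money request game payoff rule
# (Arad & Rubinstein, 2012). Illustration only -- not the paper's
# experimental setup, prompts, or analysis.

def payoff(my_request: int, opponent_request: int) -> int:
    """Payoff for a player who requests `my_request`.

    Each player requests an integer between 11 and 20 and receives that
    amount; a player who requests exactly one less than the opponent
    earns an additional bonus of 20.
    """
    if not (11 <= my_request <= 20 and 11 <= opponent_request <= 20):
        raise ValueError("requests must be integers between 11 and 20")
    bonus = 20 if my_request == opponent_request - 1 else 0
    return my_request + bonus

# Requesting 20 is the naive (level-0) choice; requesting 19 undercuts it
# and signals one step of strategic reasoning.
print(payoff(20, 20))  # 20
print(payoff(19, 20))  # 39
```

Because each request from 11 to 20 maps to a distinct depth of iterated reasoning, comparing the distribution of LLM requests with the well-documented human distribution is what allows the authors to test whether the models reproduce human behavior.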

Figure 1 (from the article): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ed10/12184514/6a7fdea0bffa/pnas.2501660122fig01.jpg
