Department of Psychology, Georgia State University, Atlanta, GA, USA.
Department of Philosophy, Georgia State University, Atlanta, GA, USA.
Sci Rep. 2024 Apr 30;14(1):8458. doi: 10.1038/s41598-024-58087-7.
Advances in artificial intelligence (AI) raise important questions about whether people view moral evaluations by AI systems similarly to human-generated moral evaluations. We conducted a modified Moral Turing Test (m-MTT), inspired by Allen et al.'s (Exp Theor Artif Intell 352:24-28, 2004) proposal, by asking people to distinguish real human moral evaluations from those made by a popular advanced AI language model: GPT-4. A representative sample of 299 U.S. adults first rated the quality of moral evaluations while blinded to their source. Remarkably, they rated the AI's moral reasoning as superior in quality to humans' along almost all dimensions, including virtuousness, intelligence, and trustworthiness, consistent with passing what Allen and colleagues call the comparative MTT. Next, when tasked with identifying the source of each evaluation (human or computer), people performed significantly above chance levels. Although the AI did not pass this test, this was not because of its inferior moral reasoning but, potentially, because of its perceived superiority, among other possible explanations. The emergence of language models capable of producing moral responses perceived as superior in quality to humans' raises concerns that people may uncritically accept potentially harmful moral guidance from AI. This possibility highlights the need for safeguards around generative language models in matters of morality.