

AI deception: A survey of examples, risks, and potential solutions.

Author information

Park Peter S, Goldstein Simon, O'Gara Aidan, Chen Michael, Hendrycks Dan

Affiliations

Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Dianoia Institute of Philosophy, Australian Catholic University, East Melbourne, VIC 3002, Australia.

Publication information

Patterns (N Y). 2024 May 10;5(5):100988. doi: 10.1016/j.patter.2024.100988.

Abstract

This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta's CICERO) and general-purpose AI systems (including large language models). Next, we detail several risks from AI deception, such as fraud, election tampering, and losing control of AI. Finally, we outline several potential solutions: first, regulatory frameworks should subject AI systems that are capable of deception to robust risk-assessment requirements; second, policymakers should implement bot-or-not laws; and finally, policymakers should prioritize the funding of relevant research, including tools to detect AI deception and to make AI systems less deceptive. Policymakers, researchers, and the broader public should work proactively to prevent AI deception from destabilizing the shared foundations of our society.


Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dca5/11117051/e86c658071f1/gr1.jpg
