Large Language Models and the Reverse Turing Test.

Affiliations

Salk Institute for Biological Studies, La Jolla, CA 92037, U.S.A.

Division of Biological Sciences, University of California, San Diego, La Jolla, CA 92093, U.S.A.

Publication Information

Neural Comput. 2023 Feb 17;35(3):309-342. doi: 10.1162/neco_a_01563.

Abstract

Large language models (LLMs) have been transformative. They are pretrained foundational models that are self-supervised and can be adapted with fine-tuning to a wide range of natural language tasks, each of which previously would have required a separate network model. This is one step closer to the extraordinary versatility of human language. GPT-3 and, more recently, LaMDA, both of them LLMs, can carry on dialogs with humans on many topics after minimal priming with a few examples. However, there has been a wide range of reactions and debate on whether these LLMs understand what they are saying or exhibit signs of intelligence. This high variance is exhibited in three interviews with LLMs reaching wildly different conclusions. A new possibility was uncovered that could explain this divergence. What appears to be intelligence in LLMs may in fact be a mirror that reflects the intelligence of the interviewer, a remarkable twist that could be considered a reverse Turing test. If so, then by studying interviews, we may be learning more about the intelligence and beliefs of the interviewer than the intelligence of the LLMs. As LLMs become more capable, they may transform the way we interact with machines and how they interact with each other. Increasingly, LLMs are being coupled with sensorimotor devices. LLMs can talk the talk, but can they walk the walk? A road map for achieving artificial general autonomy is outlined with seven major improvements inspired by brain systems and how LLMs could in turn be used to uncover new insights into brain function.
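
The "minimal priming with a few examples" described above is what the LLM literature calls few-shot prompting: the model is shown a handful of example exchanges before the real query, with no weight updates. A minimal sketch of how such a prompt is assembled (no model is called here; the "Human:"/"AI:" dialog format and the example pairs are illustrative assumptions, not the format used by GPT-3 or LaMDA specifically):

```python
def build_few_shot_prompt(examples, query):
    """Concatenate example dialog turns, then append the new query.

    examples: list of (user_turn, model_turn) pairs used as priming.
    query: the new question the model should answer in the same style.
    """
    lines = []
    for user_turn, model_turn in examples:
        lines.append(f"Human: {user_turn}")
        lines.append(f"AI: {model_turn}")
    # The prompt ends mid-dialog so the model continues as "AI:".
    lines.append(f"Human: {query}")
    lines.append("AI:")
    return "\n".join(lines)


examples = [
    ("What is the capital of France?", "Paris."),
    ("What is 2 + 2?", "4."),
]
prompt = build_few_shot_prompt(examples, "Who wrote Hamlet?")
print(prompt)
```

The point of the construction is that the priming examples establish both the task and the response style in-context; the pretrained model then completes the final "AI:" turn, which is why a single foundation model can replace many task-specific networks.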

Similar Articles

1. Comparative Evaluation of LLMs in Clinical Oncology.
   NEJM AI. 2024 May;1(5). doi: 10.1056/aioa2300151. Epub 2024 Apr 16.
2. Deep learning-based natural language processing for detecting medical symptoms and histories in emergency patient triage.
   Am J Emerg Med. 2024 Mar;77:29-38. doi: 10.1016/j.ajem.2023.11.063. Epub 2023 Dec 10.
3. Understanding natural language: Potential application of large language models to ophthalmology.
   Asia Pac J Ophthalmol (Phila). 2024 Jul-Aug;13(4):100085. doi: 10.1016/j.apjo.2024.100085. Epub 2024 Jul 25.
4. Artificial Intelligence for Anesthesiology Board-Style Examination Questions: Role of Large Language Models.
   J Cardiothorac Vasc Anesth. 2024 May;38(5):1251-1259. doi: 10.1053/j.jvca.2024.01.032. Epub 2024 Feb 1.

Cited By

1. The role of trustworthy and reliable AI for multiple sclerosis.
   Front Digit Health. 2025 Mar 24;7:1507159. doi: 10.3389/fdgth.2025.1507159. eCollection 2025.
2. Augmenting Community Nursing Practice With Generative AI: A Formative Study of Diagnostic Synergies Using Simulation-Based Clinical Cases.
   J Prim Care Community Health. 2025 Jan-Dec;16:21501319251326663. doi: 10.1177/21501319251326663. Epub 2025 Mar 25.
3. Transformers and cortical waves: encoders for pulling in context across time.
   Trends Neurosci. 2024 Oct;47(10):788-802. doi: 10.1016/j.tins.2024.08.006. Epub 2024 Sep 27.
4. Embodiment and agency in a digital world.
   Front Psychol. 2024 Sep 6;15:1392949. doi: 10.3389/fpsyg.2024.1392949. eCollection 2024.
5. Active inference goes to school: the importance of active learning in the age of large language models.
   Philos Trans R Soc Lond B Biol Sci. 2024 Oct 7;379(1911):20230148. doi: 10.1098/rstb.2023.0148. Epub 2024 Aug 19.
6. Eight challenges in developing theory of intelligence.
   Front Comput Neurosci. 2024 Jul 24;18:1388166. doi: 10.3389/fncom.2024.1388166. eCollection 2024.
7. The new frontier: utilizing ChatGPT to expand craniofacial research.
   Arch Craniofac Surg. 2024 Jun;25(3):116-122. doi: 10.7181/acfs.2024.00115. Epub 2024 Jun 20.

References

1. Language models, like humans, show content effects on reasoning tasks.
   PNAS Nexus. 2024 Jul 16;3(7):pgae233. doi: 10.1093/pnasnexus/pgae233. eCollection 2024 Jul.
2. Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models.
   IEEE Trans Vis Comput Graph. 2023 Jan;29(1):1146-1156. doi: 10.1109/TVCG.2022.3209479. Epub 2022 Dec 16.
3. From motor control to team play in simulated humanoid football.
   Sci Robot. 2022 Aug 31;7(69):eabo0235. doi: 10.1126/scirobotics.abo0235.
4. Evolutionary loss of complexity in human vocal anatomy as an adaptation for speech.
   Science. 2022 Aug 12;377(6607):760-763. doi: 10.1126/science.abm1574. Epub 2022 Aug 11.
5. The application of artificial intelligence to biology and neuroscience.
   Cell. 2022 Jul 21;185(15):2640-2643. doi: 10.1016/j.cell.2022.06.047.
6. Intuitive physics learning in a deep-learning model inspired by developmental psychology.
   Nat Hum Behav. 2022 Sep;6(9):1257-1267. doi: 10.1038/s41562-022-01394-8. Epub 2022 Jul 11.
7. Theory of the Multiregional Neocortex: Large-Scale Neural Dynamics and Distributed Cognition.
   Annu Rev Neurosci. 2022 Jul 8;45:533-560. doi: 10.1146/annurev-neuro-110920-035434.
8. Brain-inspired computing needs a master plan.
   Nature. 2022 Apr;604(7905):255-260. doi: 10.1038/s41586-021-04362-w. Epub 2022 Apr 13.
9. BRAIN 2.0: Transforming neuroscience.
   Cell. 2022 Jan 6;185(1):4-8. doi: 10.1016/j.cell.2021.11.037.
