Large Language Models and the Reverse Turing Test.

Affiliations

Salk Institute for Biological Studies, La Jolla, CA 92037, U.S.A.

Division of Biological Sciences, University of California, San Diego, La Jolla, CA 92093, U.S.A.

Publication Information

Neural Comput. 2023 Feb 17;35(3):309-342. doi: 10.1162/neco_a_01563.

Abstract

Large language models (LLMs) have been transformative. They are pretrained foundational models that are self-supervised and can be adapted with fine-tuning to a wide range of natural language tasks, each of which previously would have required a separate network model. This is one step closer to the extraordinary versatility of human language. GPT-3 and, more recently, LaMDA, both of them LLMs, can carry on dialogs with humans on many topics after minimal priming with a few examples. However, there has been a wide range of reactions and debate on whether these LLMs understand what they are saying or exhibit signs of intelligence. This high variance is exhibited in three interviews with LLMs reaching wildly different conclusions. A new possibility was uncovered that could explain this divergence. What appears to be intelligence in LLMs may in fact be a mirror that reflects the intelligence of the interviewer, a remarkable twist that could be considered a reverse Turing test. If so, then by studying interviews, we may be learning more about the intelligence and beliefs of the interviewer than the intelligence of the LLMs. As LLMs become more capable, they may transform the way we interact with machines and how they interact with each other. Increasingly, LLMs are being coupled with sensorimotor devices. LLMs can talk the talk, but can they walk the walk? A road map for achieving artificial general autonomy is outlined with seven major improvements inspired by brain systems and how LLMs could in turn be used to uncover new insights into brain function.
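
The "minimal priming with a few examples" described above is what the LLM literature calls few-shot prompting: the model is shown a handful of example exchanges before the real query, with no weight updates. A minimal sketch of how such a prompt is assembled (no model is called here; the "Human:"/"AI:" dialog format and the example pairs are illustrative assumptions, not the format used by GPT-3 or LaMDA specifically):

```python
def build_few_shot_prompt(examples, query):
    """Concatenate example dialog turns, then append the new query.

    examples: list of (user_turn, model_turn) pairs used as priming.
    query: the new question the model should answer in the same style.
    """
    lines = []
    for user_turn, model_turn in examples:
        lines.append(f"Human: {user_turn}")
        lines.append(f"AI: {model_turn}")
    # The prompt ends mid-dialog so the model continues as "AI:".
    lines.append(f"Human: {query}")
    lines.append("AI:")
    return "\n".join(lines)


examples = [
    ("What is the capital of France?", "Paris."),
    ("What is 2 + 2?", "4."),
]
prompt = build_few_shot_prompt(examples, "Who wrote Hamlet?")
print(prompt)
```

The point of the construction is that the priming examples establish both the task and the response style in-context; the pretrained model then completes the final "AI:" turn, which is why a single foundation model can replace many task-specific networks.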

Similar Articles

1. Comparative Evaluation of LLMs in Clinical Oncology.
   NEJM AI. 2024 May;1(5). doi: 10.1056/aioa2300151. Epub 2024 Apr 16.
2. Deep learning-based natural language processing for detecting medical symptoms and histories in emergency patient triage.
   Am J Emerg Med. 2024 Mar;77:29-38. doi: 10.1016/j.ajem.2023.11.063. Epub 2023 Dec 10.
3. Understanding natural language: Potential application of large language models to ophthalmology.
   Asia Pac J Ophthalmol (Phila). 2024 Jul-Aug;13(4):100085. doi: 10.1016/j.apjo.2024.100085. Epub 2024 Jul 25.
4. Artificial Intelligence for Anesthesiology Board-Style Examination Questions: Role of Large Language Models.
   J Cardiothorac Vasc Anesth. 2024 May;38(5):1251-1259. doi: 10.1053/j.jvca.2024.01.032. Epub 2024 Feb 1.

Cited By

1. The role of trustworthy and reliable AI for multiple sclerosis.
   Front Digit Health. 2025 Mar 24;7:1507159. doi: 10.3389/fdgth.2025.1507159. eCollection 2025.
2. Augmenting Community Nursing Practice With Generative AI: A Formative Study of Diagnostic Synergies Using Simulation-Based Clinical Cases.
   J Prim Care Community Health. 2025 Jan-Dec;16:21501319251326663. doi: 10.1177/21501319251326663. Epub 2025 Mar 25.
3. Transformers and cortical waves: encoders for pulling in context across time.
   Trends Neurosci. 2024 Oct;47(10):788-802. doi: 10.1016/j.tins.2024.08.006. Epub 2024 Sep 27.
4. Embodiment and agency in a digital world.
   Front Psychol. 2024 Sep 6;15:1392949. doi: 10.3389/fpsyg.2024.1392949. eCollection 2024.
5. Active inference goes to school: the importance of active learning in the age of large language models.
   Philos Trans R Soc Lond B Biol Sci. 2024 Oct 7;379(1911):20230148. doi: 10.1098/rstb.2023.0148. Epub 2024 Aug 19.
6. Eight challenges in developing theory of intelligence.
   Front Comput Neurosci. 2024 Jul 24;18:1388166. doi: 10.3389/fncom.2024.1388166. eCollection 2024.
7. The new frontier: utilizing ChatGPT to expand craniofacial research.
   Arch Craniofac Surg. 2024 Jun;25(3):116-122. doi: 10.7181/acfs.2024.00115. Epub 2024 Jun 20.

References

1. Language models, like humans, show content effects on reasoning tasks.
   PNAS Nexus. 2024 Jul 16;3(7):pgae233. doi: 10.1093/pnasnexus/pgae233. eCollection 2024 Jul.
2. Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models.
   IEEE Trans Vis Comput Graph. 2023 Jan;29(1):1146-1156. doi: 10.1109/TVCG.2022.3209479. Epub 2022 Dec 16.
3. From motor control to team play in simulated humanoid football.
   Sci Robot. 2022 Aug 31;7(69):eabo0235. doi: 10.1126/scirobotics.abo0235.
4. Evolutionary loss of complexity in human vocal anatomy as an adaptation for speech.
   Science. 2022 Aug 12;377(6607):760-763. doi: 10.1126/science.abm1574. Epub 2022 Aug 11.
5. The application of artificial intelligence to biology and neuroscience.
   Cell. 2022 Jul 21;185(15):2640-2643. doi: 10.1016/j.cell.2022.06.047.
6. Intuitive physics learning in a deep-learning model inspired by developmental psychology.
   Nat Hum Behav. 2022 Sep;6(9):1257-1267. doi: 10.1038/s41562-022-01394-8. Epub 2022 Jul 11.
7. Theory of the Multiregional Neocortex: Large-Scale Neural Dynamics and Distributed Cognition.
   Annu Rev Neurosci. 2022 Jul 8;45:533-560. doi: 10.1146/annurev-neuro-110920-035434.
8. Brain-inspired computing needs a master plan.
   Nature. 2022 Apr;604(7905):255-260. doi: 10.1038/s41586-021-04362-w. Epub 2022 Apr 13.
9. BRAIN 2.0: Transforming neuroscience.
   Cell. 2022 Jan 6;185(1):4-8. doi: 10.1016/j.cell.2021.11.037.
