Liu Wei, Xiang Ming, Ding Nai
Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou, China.
Department of Linguistics, The University of Chicago, Chicago, IL, USA.
Nat Hum Behav. 2025 Sep 10. doi: 10.1038/s41562-025-02297-0.
Understanding how sentences are represented in the human brain, as well as in large language models (LLMs), poses a substantial challenge for cognitive science. Here we develop a one-shot learning task to investigate whether humans and LLMs encode tree-structured constituents within sentences. Participants (N = 372 in total; native speakers of Chinese or English, including Chinese–English bilinguals) and LLMs (for example, ChatGPT) were asked to infer which words should be deleted from a sentence. Both groups tended to delete constituents rather than non-constituent word strings, following rules specific to Chinese and English, respectively. These results cannot be explained by models that rely only on word properties and word positions. Crucially, the underlying constituency tree structure can be successfully reconstructed from the word strings deleted by either humans or LLMs. Altogether, these results demonstrate that latent tree-structured sentence representations emerge in both humans and LLMs.
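The reconstruction claim in the abstract rests on a simple structural fact: each deleted word string defines a span of word positions, and a collection of spans determines a unique tree exactly when every pair of spans is either nested or disjoint (a laminar family). The sketch below illustrates this idea only; the function name, dictionary-based node representation, and failure behavior are our own assumptions, not the paper's actual reconstruction procedure.

```python
def build_tree(n, spans):
    """Reconstruct a constituency tree from a set of word spans.

    Spans are half-open (start, end) word-index intervals over a sentence
    of n words. If every pair of spans is nested or disjoint, the spans
    define a unique tree rooted at the whole sentence; if any two spans
    cross, reconstruction fails with a ValueError.
    """
    # Sort by start ascending, then end descending, so each span is
    # visited after every span that contains it.
    ordered = sorted(set(spans) | {(0, n)}, key=lambda s: (s[0], -s[1]))
    root = {"span": ordered[0], "children": []}
    stack = [root]  # current chain of open (ancestor) constituents
    for s in ordered[1:]:
        # Close ancestors that end before this span starts.
        while s[0] >= stack[-1]["span"][1]:
            stack.pop()
        parent = stack[-1]
        if s[1] > parent["span"][1]:
            raise ValueError(f"spans {s} and {parent['span']} cross: "
                             "not tree-structured")
        node = {"span": s, "children": []}
        parent["children"].append(node)
        stack.append(node)
    return root
```

For the six-word sentence "the cat sat on the mat", deletions of "the cat" (0, 2), "on the mat" (3, 6), and "the mat" (4, 6) nest cleanly and yield a tree, whereas crossing deletions such as (0, 3) and (2, 5) are rejected, mirroring the abstract's contrast between constituent and non-constituent word strings.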