
Neural Networks as Cognitive Models of the Processing of Syntactic Constraints.

Authors

Arehalli Suhas, Linzen Tal

Affiliations

Department of Mathematics, Statistics, and Computer Science, Macalester College, Saint Paul, MN, USA.

Department of Linguistics and Center for Data Science, New York University, New York, NY, USA.

Publication

Open Mind (Camb). 2024 May 6;8:558-614. doi: 10.1162/opmi_a_00137. eCollection 2024.

Abstract

Languages are governed by syntactic constraints: structural rules that determine which sentences are grammatical in the language. In English, one such constraint is subject-verb agreement, which dictates that the number of a verb must match the number of its corresponding subject: "the dog runs", but "the dogs run". While this constraint appears to be simple, in practice speakers make agreement errors, particularly when a noun phrase near the verb differs in number from the subject (for example, a speaker might produce the ungrammatical sentence "the key to the cabinets are rusty"). This phenomenon, referred to as agreement attraction, is sensitive to a wide range of properties of the sentence; no single existing model is able to generate predictions for the wide variety of materials studied in the human experimental literature. We explore the viability of neural network language models (broad-coverage systems trained to predict the next word in a corpus) as a framework for addressing this limitation. We analyze the agreement errors made by Long Short-Term Memory (LSTM) networks and compare them to those of humans. The models successfully simulated certain results, such as the so-called number asymmetry and the difference in attraction strength between grammatical and ungrammatical sentences, but failed to simulate others, such as the effect of syntactic distance or notional (conceptual) number. We further evaluate networks trained with explicit syntactic supervision, and find that this form of supervision does not always lead to more human-like syntactic behavior. Finally, we show that the corpus used to train a network significantly affects the pattern of agreement errors produced by the network, and we discuss the strengths and limitations of neural networks as a tool for understanding human syntactic processing.
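The evaluation paradigm the abstract describes (checking whether a language model prefers the correctly agreeing verb form after a preamble such as "the key to the cabinets") can be sketched in a few lines. This is an illustrative toy, not the paper's code: the probability table below is a hand-set stand-in for a trained LSTM's next-word distribution, and the function name is invented for this example.

```python
# Illustrative sketch of how agreement-attraction error rates are
# typically measured with a next-word language model. A real study
# would query a trained LSTM; here a hand-set probability table
# (hypothetical numbers) plays the role of the model's next-word
# distribution after each preamble.
TOY_LM = {
    "the key to the cabinet":  {"is": 0.90, "are": 0.10},  # no attractor
    "the key to the cabinets": {"is": 0.60, "are": 0.40},  # plural attractor
}

def agreement_error_rate(preamble: str, correct: str, incorrect: str) -> float:
    """Probability of the incorrectly agreeing verb, normalized over
    the two candidate verb forms. A higher value means the model is
    more likely to produce an agreement error after this preamble."""
    probs = TOY_LM[preamble]
    return probs[incorrect] / (probs[correct] + probs[incorrect])

no_attractor = agreement_error_rate("the key to the cabinet", "is", "are")
attractor = agreement_error_rate("the key to the cabinets", "is", "are")
# Agreement attraction: more errors when a nearby noun mismatches the subject.
assert attractor > no_attractor
print(f"error rate without attractor: {no_attractor:.2f}")
print(f"error rate with attractor:    {attractor:.2f}")
```

In the human experimental literature the analogous measure is the rate at which speakers produce the wrong verb form when completing such preambles, which is what makes model and human error patterns directly comparable.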


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e644/11093404/f17afc230526/opmi-08-558-g001.jpg
