Department of Language Science, University of California, Irvine, CA 92617.
Proc Natl Acad Sci U S A. 2023 Sep 26;120(39):e2220593120. doi: 10.1073/pnas.2220593120. Epub 2023 Sep 19.
I apply a recently emerging perspective on the complexity of action selection, the rate-distortion theory of control, to provide a computational-level model of errors and difficulties in human language production, which is grounded in information theory and control theory. Language production is cast as the sequential selection of actions to achieve a communicative goal subject to a capacity constraint on cognitive control. In a series of calculations, simulations, corpus analyses, and comparisons to experimental data, I show that the model directly predicts some of the major known qualitative and quantitative phenomena in language production, including semantic interference and predictability effects in word choice; accessibility-based ("easy-first") production preferences in word order alternations; and the existence and distribution of disfluencies including filled pauses, corrections, and false starts. I connect the rate-distortion view to existing models of human language production, to probabilistic models of semantics and pragmatics, and to proposals for controlled language generation in the machine learning and reinforcement learning literature.
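The core idea of selecting actions under a capacity constraint can be illustrated with a minimal sketch. It uses the standard rate-distortion formulation, in which the optimal policy satisfies pi(a|s) ∝ q(a) exp(beta * U(s, a)), and q(a) is the marginal action distribution. This is not the paper's actual model or code: the function names, the toy utility table, and the parameter `beta` are assumptions made for illustration only.

```python
import math


def rate_distortion_policy(p_state, utility, beta, iters=200):
    """Blahut-Arimoto-style iteration for a capacity-limited policy.

    p_state: prior probabilities over communicative goals (states).
    utility: utility[s][a], payoff of action (e.g. word) a under goal s.
    beta:    trade-off parameter (assumed); larger values allow more control.
    Returns (policy, marginal), where policy[s][a] = pi(a | s).
    """
    n_states = len(p_state)
    n_actions = len(utility[0])
    marginal = [1.0 / n_actions] * n_actions  # start from a uniform default
    policy = []
    for _ in range(iters):
        policy = []
        for s in range(n_states):
            m = max(utility[s])  # subtract the max for numerical stability
            # pi(a|s) is proportional to q(a) * exp(beta * U(s, a))
            weights = [
                marginal[a] * math.exp(beta * (utility[s][a] - m))
                for a in range(n_actions)
            ]
            z = sum(weights)
            policy.append([w / z for w in weights])
        # q(a) = sum_s p(s) * pi(a|s)
        marginal = [
            sum(p_state[s] * policy[s][a] for s in range(n_states))
            for a in range(n_actions)
        ]
    return policy, marginal


def policy_rate_bits(p_state, policy, marginal):
    """Mutual information I(S; A) in bits: the information cost of the policy."""
    total = 0.0
    for s, ps in enumerate(p_state):
        for a, pa in enumerate(policy[s]):
            if ps > 0 and pa > 0:
                total += ps * pa * math.log2(pa / marginal[a])
    return total


if __name__ == "__main__":
    # Toy setup (hypothetical): two equally likely goals, two candidate words,
    # and each word is only useful for its matching goal.
    p_state = [0.5, 0.5]
    utility = [[1.0, 0.0], [0.0, 1.0]]
    for beta in (0.0, 1.0, 10.0):
        policy, marginal = rate_distortion_policy(p_state, utility, beta)
        rate = policy_rate_bits(p_state, policy, marginal)
        print(f"beta={beta}: pi(word0|goal0)={policy[0][0]:.3f}, rate={rate:.3f} bits")
```

In this toy setting, a small `beta` yields a policy that ignores the goal and falls back on the default action distribution, which is one way to picture production errors under limited control. A large `beta` yields nearly deterministic, goal-appropriate choices at a higher information cost.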