Understanding the Phonetic Characteristics of Speech Under Uncertainty: Implications of the Representation of Linguistic Knowledge in Learning and Processing.

Authors

Tomaschek Fabian, Ramscar Michael

Affiliation

Quantitative Linguistics Lab, Department of General Linguistics, University of Tübingen, Tübingen, Germany.

Publication

Front Psychol. 2022 Apr 25;13:754395. doi: 10.3389/fpsyg.2022.754395. eCollection 2022.

Abstract

The uncertainty associated with paradigmatic families has been shown to correlate with their phonetic characteristics in speech, suggesting that representations of complex sublexical relations between words are part of speaker knowledge. To better understand this, recent studies have used two-layer neural network models to examine the way paradigmatic uncertainty emerges in learning. However, to date this work has largely ignored the way choices about the representation of inflectional and grammatical functions (IFS) in models strongly influence what they subsequently learn. To explore the consequences of this, we investigate how representations of IFS in the input-output structures of learning models affect the capacity of uncertainty estimates derived from them to account for phonetic variability in speech. Specifically, we examine whether IFS are best represented as outputs of neural networks (as in previous studies) or as inputs, by building models that embody both choices and examining their capacity to account for uncertainty effects in the formant trajectories of word-final [ɐ], which in German discriminates around sixty different IFS. Overall, we find that formants are enhanced as the uncertainty associated with IFS decreases. This result dovetails with a growing number of studies of morphological and inflectional families that have shown that enhancement is associated with lower uncertainty in context. Importantly, we also find that models where IFS serve as inputs, as our theoretical analysis suggests they ought to, yield uncertainty measures that provide better fits to the empirical variance observed in [ɐ] formants than those from models where IFS serve as outputs. This supports our suggestion that IFS serve as cognitive cues during speech production, and should be treated as such in modeling. It is also consistent with the idea that when IFS serve as inputs to a learning network, this maintains the distinction between those parts of the network that represent message and those that represent signal. We conclude by describing how maintaining a "signal-message-uncertainty distinction" can allow us to reconcile a range of apparently contradictory findings about the relationship between articulation and uncertainty in context.
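
The contrast between the two input-output structures can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify the training data, the learning rule, or the uncertainty measures, so the delta-rule (Widrow-Hoff) update, the learning rate, the number of epochs, and the toy German-like events and labels below are all assumptions made for illustration only.

```python
from collections import defaultdict

def train(events, rate=0.1, epochs=100):
    """Train a two-layer cue-to-outcome network with the delta rule.

    events: list of (cues, outcomes) pairs, each a set of string labels.
    Returns a dict mapping (cue, outcome) to an association weight.
    """
    all_outcomes = sorted({o for _, outs in events for o in outs})
    weights = defaultdict(float)
    for _ in range(epochs):
        for cues, outs in events:
            for outcome in all_outcomes:
                predicted = sum(weights[(c, outcome)] for c in cues)
                target = 1.0 if outcome in outs else 0.0
                error = target - predicted
                # Every cue present in the event shares the prediction error.
                for c in cues:
                    weights[(c, outcome)] += rate * error
    return weights

def activation(weights, cues, outcome):
    """Summed support that a set of cues lends to one outcome."""
    return sum(weights.get((c, outcome), 0.0) for c in cues)

# Hypothetical toy events (labels invented for this sketch).
# Structure A: forms are inputs, IFS labels are outputs.
ifs_as_output = [
    ({"form:kinder"}, {"ifs:plural"}),
    ({"form:schneller"}, {"ifs:comparative"}),
    ({"form:kind"}, {"ifs:singular"}),
]
# Structure B: IFS labels are inputs (cues) alongside lexemes; forms are outputs.
ifs_as_input = [
    ({"lexeme:kind", "ifs:plural"}, {"form:kinder"}),
    ({"lexeme:schnell", "ifs:comparative"}, {"form:schneller"}),
    ({"lexeme:kind", "ifs:singular"}, {"form:kind"}),
]

w_out = train(ifs_as_output)
w_in = train(ifs_as_input)

print(activation(w_out, {"form:kinder"}, "ifs:plural"))
print(activation(w_in, {"lexeme:kind", "ifs:plural"}, "form:kinder"))
print(activation(w_in, {"lexeme:kind", "ifs:singular"}, "form:kinder"))
```

In structure B the IFS labels sit on the cue side together with the lexeme, so message-related units and signal-related units stay in separate layers, which is the distinction the abstract argues for. Any uncertainty measure computed from such weights would be a further assumption and is deliberately left out here.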

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4acb/9083257/6bce7608155a/fpsyg-13-754395-g0001.jpg
