Sources of Microtemporal Clustering in Sociolinguistic Sequences.

Author information

Tamminga Meredith

Affiliation

Department of Linguistics, University of Pennsylvania, Philadelphia, PA, United States.

Publication information

Front Artif Intell. 2019 Jun 20;2:10. doi: 10.3389/frai.2019.00010. eCollection 2019.

Abstract

Persistence is the tendency of speakers to repeat the choice of sociolinguistic variant they have recently made in conversational speech. A longstanding debate is whether this tendency toward repetitiveness reflects the direct influence of one outcome on the next instance of the variable, which I call sequential dependence, or the shared influence of shifting contextual factors on proximal instances of the variable, which I call baseline deflection. I propose that these distinct types of clustering make different predictions for sequences of variable observations that are longer than the prime-target pairs typical of corpus persistence studies. In corpus ING data from conversational speech, I show that there are two effects to be accounted for: an effect of how many times the /ing/ variant occurs in the 2, 3, or 4-token sequence prior to the target (regardless of order), and an effect of whether the immediately prior (1-back) token was /ing/. I then build a series of simulations involving Bernoulli trials at sequences of different probabilities that incorporate either a sequential dependence mechanism, a baseline deflection mechanism, or both. I argue that the model incorporating both baseline deflection and sequential dependence is best able to produce simulated data that share the relevant properties of the corpus data, which is an encouraging outcome because we have independent reasons to expect both baseline deflection and sequential dependence to exist. I conclude that this exploratory analysis of longer sociolinguistic sequences reflects a promising direction for future research on the mechanisms involved in the production of sociolinguistic variation.
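The two mechanisms named in the abstract can be illustrated with a small simulation. The sketch below is not the author's code, and every name and parameter value in it (`simulate_sequence`, `boost`, `drift_sd`, the clamping bounds, the random-walk form of the drift) is an assumption made for illustration. It draws Bernoulli trials whose success probability is shifted by the previous outcome (sequential dependence), by a slowly wandering baseline (baseline deflection), or by both, and then tabulates the target rate by how many /ing/-like outcomes occurred in the preceding window and by the 1-back outcome.

```python
import random


def simulate_sequence(n_tokens, base_p, boost=0.0, drift_sd=0.0, seed=None):
    """Simulate a 0/1 sequence (1 standing in for the /ing/ variant).

    boost    -- hypothetical sequential-dependence strength: the probability
                of 1 is raised by `boost` after a 1 and lowered after a 0.
    drift_sd -- hypothetical baseline-deflection strength: the baseline
                probability takes a Gaussian random-walk step before each
                token (one assumed way to model shifting context).
    """
    rng = random.Random(seed)
    baseline = base_p
    previous = None
    outcomes = []
    for _ in range(n_tokens):
        if drift_sd > 0:
            # Baseline deflection: let the baseline wander, kept in [0.05, 0.95].
            baseline = min(0.95, max(0.05, baseline + rng.gauss(0.0, drift_sd)))
        p = baseline
        if previous is not None:
            # Sequential dependence: the last outcome nudges this trial.
            p += boost if previous == 1 else -boost
        p = min(1.0, max(0.0, p))
        outcome = 1 if rng.random() < p else 0
        outcomes.append(outcome)
        previous = outcome
    return outcomes


def rate_by_prior_count(outcomes, window=2):
    """Rate of 1 at the target, keyed by (count of 1s in the prior window,
    1-back outcome)."""
    tallies = {}
    for i in range(window, len(outcomes)):
        key = (sum(outcomes[i - window:i]), outcomes[i - 1])
        hits, total = tallies.get(key, (0, 0))
        tallies[key] = (hits + outcomes[i], total + 1)
    return {key: hits / total for key, (hits, total) in sorted(tallies.items())}


if __name__ == "__main__":
    both = simulate_sequence(20000, 0.5, boost=0.1, drift_sd=0.02, seed=1)
    for key, rate in rate_by_prior_count(both, window=3).items():
        print(key, round(rate, 3))
```

Setting `boost=0` or `drift_sd=0` switches off one mechanism, so the same tabulation can be compared across the three model variants the abstract describes.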

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efda/7861327/5a5257e67857/frai-02-00010-g0001.jpg