Linguistics, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany.
Philos Trans R Soc Lond B Biol Sci. 2010 Dec 12;365(1559):3829-43. doi: 10.1098/rstb.2010.0099.
Linguists have traditionally represented patterns of divergence within a language family using either a 'splits' model, which corresponds to a branching family tree, or the wave model, which yields a (dialect) continuum. Recent phylogenetic analyses, however, have tended to treat the former as a viable idealization of the latter as well. But the contrast matters, because it typically reflects different real-world processes: speaker populations either separated by migration or expanding across continuous territory. Since history often leaves a complex of both patterns within the same language family, ideally we need a single model that captures both and teases apart their respective contributions. The 'network' type of phylogenetic method offers this, so we review recent applications to language data. Most have used lexical data, encoded as binary or multi-state characters. We look instead at continuous distance measures of divergence in phonetics. Our output networks combine branch- and continuum-like signals in ways that correspond well to known histories (illustrated for Germanic, and particularly English). We thus challenge the traditional insistence on shared innovations, setting out a new, principled explanation for why complex language histories can emerge correctly from distance measures, despite shared retentions and parallel innovations.
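The difference between branch-like and continuum-like signal in a distance matrix can be illustrated with a quartet-based tree-likeness score. The sketch below is not the authors' method: it uses the four-point condition (a "delta score"), which the abstract does not mention, and the two distance matrices are invented for illustration, not measured phonetic data. The names `quartet_delta`, `mean_delta`, `TREE_LIKE` and `CONTINUUM_LIKE` are hypothetical.

```python
from itertools import combinations


def quartet_delta(d, i, j, k, l):
    """Delta score for one quartet: 0 means perfectly tree-like,
    values near 1 mean strongly conflicting (network-like) signal."""
    # The three ways of pairing four items into two pairs.
    sums = sorted(
        [d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]],
        reverse=True,
    )
    spread = sums[0] - sums[2]
    if spread == 0:
        # All three pairings are equal (a star): no conflicting signal.
        return 0.0
    # On a tree the two largest sums are equal, so the numerator is 0.
    return (sums[0] - sums[1]) / spread


def mean_delta(d):
    """Average delta score over all quartets of a symmetric distance matrix."""
    quartets = list(combinations(range(len(d)), 4))
    if not quartets:
        raise ValueError("need at least four varieties")
    return sum(quartet_delta(d, *q) for q in quartets) / len(quartets)


# Invented distances that fit a tree ((A,B),(C,D)) exactly:
# pendant branches of length 1 and an internal branch of length 2.
TREE_LIKE = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]

# Invented distances for four varieties at the corners of a square,
# each close to two neighbours: no single tree fits without conflict.
CONTINUUM_LIKE = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]

if __name__ == "__main__":
    print("tree-like matrix:     ", mean_delta(TREE_LIKE))
    print("continuum-like matrix:", mean_delta(CONTINUUM_LIKE))
```

Under these assumptions, the first matrix satisfies the four-point condition exactly, while the second cannot be drawn as a tree without distortion. That conflicting signal is what network methods display as reticulation rather than forcing it into a single branching order.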