Centre for Language Studies, Radboud University, Nijmegen, the Netherlands.
PLoS Biol. 2009 Nov;7(11):e1000241. doi: 10.1371/journal.pbio.1000241. Epub 2009 Nov 17.
The region of the ancient Sahul continent (present-day Australia and New Guinea, and surrounding islands) is home to extreme linguistic diversity. Even apart from the huge Austronesian language family, which spread into the area after the breakup of the Sahul continent in the Holocene, there are hundreds of languages from many apparently unrelated families. On each of the subcontinents, the generally accepted classification recognizes one large, widespread family and a number of unrelatable smaller families. If these language families are related to each other, it is at a depth which is inaccessible to standard linguistic methods. We have inferred the history of structural characteristics of these languages under an admixture model, using a Bayesian algorithm originally developed to discover populations on the basis of recombining genetic markers. This analysis identifies 10 ancestral language populations, some of which can be identified with clearly defined phylogenetic groups. The results also show traces of early dispersals, including hints at ancient connections between Australian languages and some Papuan groups (long hypothesized, never before demonstrated). Systematic language contact effects between members of big phylogenetic groups are also detected, which can in some cases be identified with a diffusional or substrate signal. Most interestingly, however, there remains striking evidence of a phylogenetic signal, with many languages showing negligible amounts of admixture.
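To make the idea of an admixture model concrete, the following is a minimal sketch, not the Bayesian algorithm used in the study. It assumes that each language draws each structural feature from one of K ancestral populations, chosen with language-specific proportions q, and that the per-population feature-value frequencies are already known. Under those assumptions it estimates q by a simple maximum-likelihood EM iteration, which is a swapped-in simplification of the full Bayesian inference. All names (`pop_freqs`, `estimate_admixture`) and all numbers are hypothetical and chosen only for illustration.

```python
import math


def log_likelihood(q, pop_freqs, features):
    """Log-probability of one language's feature values under the mixture.

    q[k]               -- assumed share of ancestral population k in this language
    pop_freqs[j][k][v] -- assumed probability that feature j takes value v
                          in ancestral population k (must be strictly positive)
    features[j]        -- observed value (an integer index) of feature j
    """
    total = 0.0
    for j, v in enumerate(features):
        p = sum(q[k] * pop_freqs[j][k][v] for k in range(len(q)))
        total += math.log(p)
    return total


def estimate_admixture(pop_freqs, features, k, iters=200):
    """EM estimate of admixture proportions with known population frequencies."""
    q = [1.0 / k] * k  # start from equal shares
    for _ in range(iters):
        totals = [0.0] * k
        for j, v in enumerate(features):
            # E-step: posterior probability that feature j came from each population
            weights = [q[i] * pop_freqs[j][i][v] for i in range(k)]
            norm = sum(weights)
            for i in range(k):
                totals[i] += weights[i] / norm
        # M-step: new proportions are the average responsibilities
        q = [t / len(features) for t in totals]
    return q


# Hypothetical toy data: 2 ancestral populations, 4 binary features.
# Population 0 strongly favours value 1, population 1 strongly favours value 0.
toy_freqs = [[[0.1, 0.9], [0.9, 0.1]] for _ in range(4)]

unmixed = estimate_admixture(toy_freqs, [1, 1, 1, 1], k=2)
mixed = estimate_admixture(toy_freqs, [1, 1, 0, 0], k=2)
```

In this toy setting, a language whose features all match population 0 is assigned almost entirely to that population, which corresponds to the "negligible admixture" case in the abstract, while a language with evenly split features receives roughly equal shares. The published analysis additionally infers the population frequencies and the number of populations, which this sketch takes as given.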