Centre for Society and Genetics, University of California, Los Angeles, 90095-722, USA.
BMC Evol Biol. 2010 Mar 31;10:92. doi: 10.1186/1471-2148-10-92.
The Cross River region in Nigeria is linguistically an extremely diverse area, with over 60 distinct languages still spoken today. It is also a region of great historical importance, being a) adjacent to the likely homeland from which Bantu-speaking people migrated across most of sub-Saharan Africa 3000-5000 years ago and b) the location of Calabar, one of the largest centres during the Atlantic slave trade. Over 1000 DNA samples from 24 clans representing speakers of the six most prominent languages in the region were collected and typed for Y-chromosome (SNPs and microsatellites) and mtDNA markers (Hypervariable Segment 1) in order to examine whether there has been substantial gene flow between groups speaking different languages in the region. In addition, the Cross River region was analysed at a larger geographical scale by comparison with bordering Igbo-speaking groups, neighbouring Cameroon populations and more distant Ghanaian communities.
The Cross River region was shown to be extremely homogeneous for both Y-chromosome and mtDNA markers, with language spoken having no noticeable effect on the genetic structure of the region, consistent with estimates of inter-language gene flow of 10% per generation based on sociological data. However, the groups in the region could clearly be differentiated from others in Cameroon and Ghana (and, to a lesser extent, from Igbo populations). Significant correlations between genetic distance and both geographic and linguistic distance were observed at this larger scale.
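To illustrate why gene flow of around 10% per generation is compatible with genetic homogeneity, the following is a minimal sketch of a deterministic two-population model with symmetric migration. It is not the method used in the study: the two-population setup, the function name `frequency_gap_after` and the starting frequencies are assumptions made for illustration, and drift, mutation and selection are ignored. Under these assumptions the allele-frequency gap between the two groups shrinks by a factor of (1 - 2m) each generation.

```python
def frequency_gap_after(generations, migration_rate, initial_gap=1.0):
    """Allele-frequency gap between two populations that exchange a
    fraction `migration_rate` of their gene pool every generation.

    Hypothetical illustration only: deterministic, symmetric migration,
    no drift, mutation or selection.
    """
    # Start the two populations symmetrically around a frequency of 0.5.
    p1 = 0.5 + initial_gap / 2
    p2 = 0.5 - initial_gap / 2
    for _ in range(generations):
        # Each population keeps (1 - m) of its own alleles and receives m
        # from the other; both updates use the previous generation's values.
        p1, p2 = (
            (1 - migration_rate) * p1 + migration_rate * p2,
            (1 - migration_rate) * p2 + migration_rate * p1,
        )
    return p1 - p2


if __name__ == "__main__":
    # 10% per generation is the sociological estimate quoted in the abstract.
    for t in (1, 5, 10, 20):
        print(t, round(frequency_gap_after(t, 0.10), 4))
```

Even starting from complete differentiation (a gap of 1.0), the gap in this toy model falls to about 0.11 after 10 generations, which suggests how quickly sustained exchange at this rate could erase language-associated genetic structure.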
Previous studies have found significant correlations between genetic variation and language in Africa over large geographic distances, often across language families. However, the broad sampling strategies of these datasets have limited their utility for understanding the relationship within language families. This is the first study to show that, at very fine geographic and linguistic scales, language differences can be maintained in the presence of substantial gene flow over an extended period of time. It also demonstrates the value of dense sampling strategies and of DNA with known and detailed provenance, a practice that is generally rare when investigating sub-Saharan African demographic processes using genetic data.