Al-Khudhair Ahmed, Qiu Shuhao, Wyse Meghan, Chowdhury Shilpi, Cheng Xi, Bekbolsynov Dulat, Saha-Mandal Arnab, Dutta Rajib, Fedorova Larisa, Fedorov Alexei
Program in Bioinformatics and Proteomics/Genomics, University of Toledo.
Program in Biomedical Sciences, University of Toledo Department of Medicine, University of Toledo.
Genome Biol Evol. 2015 Jan 7;7(2):481-92. doi: 10.1093/gbe/evv003.
Nucleotide sequence differences on the whole-genome scale have been computed for 1,092 people from 14 populations whose genomes were made publicly available by the 1000 Genomes Project. The total number of differences in genetic variants was calculated for 96,464 human pairs. The distributions of these differences for individuals of European, Asian, or African origin were characterized by narrow unimodal peaks with mean values of 3.8, 3.5, and 5.1 million, respectively, and standard deviations ranging from 0.03 to 0.1 million. The total numbers of genomic differences between pairs of all known relatives were found to be significantly lower than their respective population means and in reverse proportion to the distance of their consanguinity. By counting the total number of genomic differences, it is possible to infer familial relations for people who share as little as 6% of common loci identical-by-descent. Detection of familial relations can be radically improved when only very rare genetic variants are taken into account. Counting the total number of shared very rare single nucleotide polymorphisms (SNPs) from whole-genome sequences allows distant familial relations to be established for persons with eighth and ninth degrees of relationship. Using this analysis, we predicted 271 distant familial pairwise relations among the 1,092 individuals that have not been declared by the 1000 Genomes Project. In particular, among 89 British and 97 Chinese individuals we found three British-Chinese pairs with distant genetic relationships. Individuals from these pairs share identical-by-descent DNA fragments that represent 0.001%, 0.004%, and 0.01% of their genomes. With affordable whole-genome sequencing techniques, very rare SNPs should become important genetic markers for familial relationships and population stratification.
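The two counting approaches described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the genotype encoding (0, 1, or 2 copies of the alternative allele per site), the definition of a "difference" as any site where two genotypes are unequal, the frequency cutoff `max_alt_freq`, and the function names `pairwise_differences` and `shared_rare_variants` are all assumptions made for illustration, and the data are a toy example rather than 1000 Genomes calls.

```python
from itertools import combinations


def pairwise_differences(genotypes):
    """Count, for every pair of individuals, the sites whose genotypes differ.

    Assumption: a "difference" is any site where the two genotype codes
    are unequal; the paper may define it differently.
    """
    diffs = {}
    for i, j in combinations(range(len(genotypes)), 2):
        diffs[(i, j)] = sum(a != b for a, b in zip(genotypes[i], genotypes[j]))
    return diffs


def shared_rare_variants(genotypes, max_alt_freq):
    """Count, for every pair, the rare variants carried by both individuals.

    A site is treated as rare when its alternative allele is present in the
    sample at a frequency no greater than max_alt_freq (a hypothetical
    cutoff chosen by the caller).
    """
    n_chromosomes = 2 * len(genotypes)
    rare_sites = []
    for site in range(len(genotypes[0])):
        alt_count = sum(row[site] for row in genotypes)
        if alt_count > 0 and alt_count / n_chromosomes <= max_alt_freq:
            rare_sites.append(site)

    shared = {}
    for i, j in combinations(range(len(genotypes)), 2):
        shared[(i, j)] = sum(
            1 for site in rare_sites
            if genotypes[i][site] > 0 and genotypes[j][site] > 0
        )
    return shared


# Toy data: 4 individuals (rows) x 6 variant sites (columns).
toy_genotypes = [
    [0, 1, 2, 0, 1, 0],
    [0, 1, 2, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],
]

differences = pairwise_differences(toy_genotypes)
rare_sharing = shared_rare_variants(toy_genotypes, max_alt_freq=0.25)
```

In this toy data, individuals 0 and 1 have the fewest differences and also share a rare variant, which is the pattern the abstract associates with relatedness. On real whole-genome data the same counts would be compared against the population distribution rather than read off directly.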