Clines, clusters, and the effect of study design on the inference of human population structure.

Authors

Rosenberg Noah A, Mahajan Saurabh, Ramachandran Sohini, Zhao Chengfeng, Pritchard Jonathan K, Feldman Marcus W

Affiliations

Department of Human Genetics, Bioinformatics Program, and the Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, USA.

Publication information

PLoS Genet. 2005 Dec;1(6):e70. doi: 10.1371/journal.pgen.0010070. Epub 2005 Dec 9.

Abstract

Previously, we observed that without using prior information about individual sampling locations, a clustering algorithm applied to multilocus genotypes from worldwide human populations produced genetic clusters largely coincident with major geographic regions. It has been argued, however, that the degree of clustering is diminished by use of samples with greater uniformity in geographic distribution, and that the clusters we identified were a consequence of uneven sampling along genetic clines. Expanding our earlier dataset from 377 to 993 markers, we systematically examine the influence of several study design variables--sample size, number of loci, number of clusters, assumptions about correlations in allele frequencies across populations, and the geographic dispersion of the sample--on the "clusteredness" of individuals. With all other variables held constant, geographic dispersion is seen to have comparatively little effect on the degree of clustering. Examination of the relationship between genetic and geographic distance supports a view in which the clusters arise not as an artifact of the sampling scheme, but from small discontinuous jumps in genetic distance for most population pairs on opposite sides of geographic barriers, in comparison with genetic distance for pairs on the same side. Thus, analysis of the 993-locus dataset corroborates our earlier results: if enough markers are used with a sufficiently large worldwide sample, individuals can be partitioned into genetic clusters that match major geographic subdivisions of the globe, with some individuals from intermediate geographic locations having mixed membership in the clusters that correspond to neighboring regions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a8c2/1342627/58bb2eff5fad/pgen.0010070.g001.jpg
