Department of Human Genetics, University of Utah.
Department of Molecular Biology and Genetics, Cornell University.
Genome Biol Evol. 2020 Jun 1;12(6):779-794. doi: 10.1093/gbe/evaa086.
Ongoing retrotransposition of Alu, LINE-1, and SINE-VNTR-Alu elements generates diversity and variation among human populations. Previous analyses investigating the population genetics of mobile element insertions (MEIs) have been limited by population ascertainment bias or by relatively small numbers of populations and low sequencing coverage. Here, we use 296 individuals representing 142 global populations from the Simons Genome Diversity Project (SGDP) to discover and characterize MEI diversity from deeply sequenced whole-genome data. We report 5,742 MEIs not originally reported by the 1000 Genomes Project and show that high sampling diversity leads to a 4- to 7-fold increase in MEI discovery rates over the original 1000 Genomes Project data. As a result of negative selection, nonreference polymorphic MEIs are underrepresented within genes, and MEIs within genes are often found in the transcriptional orientation opposite that of the gene. Globally, 80% of Alu subfamilies predate the expansion of modern humans from Africa. Polymorphic MEIs show heterozygosity gradients that decrease from Africa to Eurasia to the Americas, and the number of MEIs found uniquely in a single individual is also distributed in this general pattern. The maximum fraction of MEI diversity partitioned among the seven major SGDP population groups (FST) is 7.4%, similar to, but slightly lower than, previous estimates and likely attributable to the diverse sampling strategy of the SGDP. Finally, we utilize these MEIs to trace the primary Native American shared ancestry component back to Asia and provide new evidence from genome-wide identical-by-descent genetic markers that adds further support for a southeastern Siberian origin for most Native Americans.
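The abstract describes FST as the fraction of MEI diversity partitioned among population groups. As a rough illustration only, the sketch below computes a simple Nei-style FST from per-group insertion allele frequencies, using the ratio of averages over loci, FST = (H_T - H_S) / H_T. This is not necessarily the estimator the authors used; the function names, the equal weighting of groups, and the example frequencies are all assumptions made for the illustration and are not taken from the paper.

```python
from typing import Sequence


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity at a biallelic locus with insertion frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs_by_locus: Sequence[Sequence[float]]) -> float:
    """Nei-style F_ST across loci, computed as a ratio of averages.

    freqs_by_locus[i][j] is the insertion allele frequency at locus i in
    population group j. Groups are weighted equally (an assumption).
    """
    h_total = 0.0   # sum over loci of heterozygosity in the pooled population
    h_within = 0.0  # sum over loci of mean within-group heterozygosity
    for group_freqs in freqs_by_locus:
        n_groups = len(group_freqs)
        mean_p = sum(group_freqs) / n_groups
        h_total += expected_heterozygosity(mean_p)
        h_within += sum(expected_heterozygosity(p) for p in group_freqs) / n_groups
    if h_total == 0.0:
        raise ValueError("no polymorphic loci: F_ST is undefined")
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical frequencies for three MEI loci in three groups
    # (illustrative values only, not data from the study).
    example = [
        [0.10, 0.15, 0.20],
        [0.50, 0.45, 0.60],
        [0.05, 0.00, 0.10],
    ]
    print(f"F_ST = {fst(example):.4f}")
```

Under this definition, a value of 0 means the groups share identical insertion frequencies, and a value of 1 means every locus is fixed for different states in different groups. A reported value such as 7.4% would correspond to most of the diversity lying within groups rather than among them.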