Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA.
PLoS Genet. 2013;9(1):e1003242. doi: 10.1371/journal.pgen.1003242. Epub 2013 Jan 24.
The era of whole-genome sequencing has revealed that gene copy-number changes caused by duplication and deletion events have important evolutionary, functional, and phenotypic consequences. Recent studies have therefore focused on revealing the extent of variation in copy-number within natural populations of humans and other species. These studies have found a large number of copy-number variants (CNVs) in humans, many of which have been shown to have clinical or evolutionary importance. For the most part, these studies have failed to detect an important class of gene copy-number polymorphism: gene duplications caused by retrotransposition, which result in a new intron-less copy of the parental gene being inserted into a random location in the genome. Here we describe a computational approach leveraging next-generation sequence data to detect gene copy-number variants caused by retrotransposition (retroCNVs), and we report the first genome-wide analysis of these variants in humans. We find that retroCNVs account for a substantial fraction of gene copy-number differences between any two individuals. Moreover, we show that these variants may often result in expressed chimeric transcripts, underscoring their potential for the evolution of novel gene functions. By locating the insertion sites of these duplicates, we are able to show that retroCNVs have had an important role in recent human adaptation, and we also uncover evidence that positive selection may currently be driving multiple retroCNVs toward fixation. Together these findings imply that retroCNVs are an especially important class of polymorphism, and that future studies of copy-number variation should search for these variants in order to illuminate their potential evolutionary and functional relevance.
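The abstract does not describe the detection method in detail. One signal that follows from its description of retrocopies as intron-less is that genomic DNA reads from a retrocopy would align to the parental gene with gaps that exactly match annotated introns. The Python sketch below illustrates that idea only and is not the authors' pipeline. It assumes each read is already given as a list of aligned blocks on one chromosome, and every name in it (`spliced_out_introns`, `candidate_introns`, `min_reads`) is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Half-open interval [start, end) on a single chromosome (assumed convention).
Interval = Tuple[int, int]


def spliced_out_introns(read_blocks: List[Interval],
                        introns: Iterable[Interval]) -> List[Interval]:
    """Return the annotated introns that a single read skips exactly.

    A gap between two consecutive aligned blocks counts only if its
    boundaries coincide with an annotated intron.
    """
    intron_set = set(introns)
    gaps = [(left[1], right[0])
            for left, right in zip(read_blocks, read_blocks[1:])]
    return [gap for gap in gaps if gap in intron_set]


def candidate_introns(reads: Iterable[List[Interval]],
                      introns: Iterable[Interval],
                      min_reads: int = 2) -> Dict[Interval, int]:
    """Count reads supporting the absence of each intron.

    Keeps introns supported by at least `min_reads` reads; the threshold
    is an illustrative assumption, not a value from the paper.
    """
    intron_list = list(introns)
    counts: Counter = Counter()
    for blocks in reads:
        counts.update(spliced_out_introns(blocks, intron_list))
    return {intron: n for intron, n in counts.items() if n >= min_reads}


# Toy data: two annotated introns and four genomic DNA reads.
introns = [(200, 500), (600, 900)]
reads = [
    [(150, 200), (500, 550)],  # skips the first intron exactly
    [(180, 200), (500, 570)],  # skips the first intron exactly
    [(100, 180)],              # ungapped read, no evidence
    [(560, 600), (900, 940)],  # skips the second intron exactly
]
print(candidate_introns(reads, introns))
```

With the default threshold only the first intron is reported, because the second has a single supporting read. A real analysis would also need to locate the insertion site and exclude contamination from spliced RNA, which this sketch does not attempt.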