Department of Biological Statistics and Computational Biology, Cornell Center for Comparative and Population Genomics, Cornell University, Ithaca, New York 14853, USA.
Genome Res. 2009 Nov;19(11):1929-41. doi: 10.1101/gr.084228.108. Epub 2009 Oct 3.
Genome assemblies are now available for nine primate species, and large-scale sequencing projects are underway or approved for six others. An explicitly evolutionary and phylogenetic approach to comparative genomics, called phylogenomics, will be essential in unlocking the valuable information about evolutionary history and genomic function that is contained within these genomes. However, most phylogenomic analyses so far have ignored the effects of variation in ancestral populations on patterns of sequence divergence. These effects can be pronounced in the primates, owing to large ancestral effective population sizes relative to the intervals between speciation events. In particular, local genealogies can vary considerably across loci, which can produce biases and diminished power in many phylogenomic analyses of interest, including phylogeny reconstruction, the identification of functional elements, and the detection of natural selection. At the same time, this variation in genealogies can be exploited to gain insight into the nature of ancestral populations. In this Perspective, I explore this area of intersection between phylogenetics and population genetics, and its implications for primate phylogenomics. I begin by "lifting the hood" on the conventional tree-like representation of the phylogenetic relationships between species, to expose the population-genetic processes that operate along its branches. Next, I briefly review an emerging literature that makes use of the complex relationships among coalescence, recombination, and speciation to produce inferences about evolutionary histories, ancestral populations, and natural selection. Finally, I discuss remaining challenges and future prospects at this nexus of phylogenetics, population genetics, and genomics.
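The point about large ancestral effective population sizes relative to the intervals between speciation events can be made concrete with a standard coalescent result. This is a minimal sketch and is not taken from the article. For three species, with one lineage sampled from each, the probability that a local genealogy disagrees with the species tree is (2/3)·exp(−t/(2Nₑ)), where t is the interval between the two speciation events in generations and Nₑ is the diploid effective size of the ancestral population during that interval. The function name and the parameter values below are hypothetical and chosen only for illustration. The sketch assumes neutrality, a constant ancestral population size, and no gene flow after speciation.

```python
import math


def discordance_probability(interval_generations, ancestral_ne):
    """Probability that a three-species gene tree disagrees with the species tree.

    Assumes the standard neutral coalescent: the two lineages fail to
    coalesce within the internal branch with probability exp(-t / (2 * Ne)),
    and two of the three equally likely resolutions in the deeper ancestor
    are discordant with the species tree.
    """
    if interval_generations < 0 or ancestral_ne <= 0:
        raise ValueError("interval must be >= 0 and Ne must be > 0")
    # Internal branch length in coalescent units (2 * Ne generations each).
    branch_length = interval_generations / (2.0 * ancestral_ne)
    return (2.0 / 3.0) * math.exp(-branch_length)


if __name__ == "__main__":
    # Hypothetical illustrative values, not estimates from the article.
    interval = 100_000  # generations between the two speciation events
    for ne in (10_000, 50_000, 100_000):
        p = discordance_probability(interval, ne)
        print(f"Ne = {ne:>7}: P(discordant genealogy) = {p:.3f}")
```

Holding the interval fixed, a larger ancestral Nₑ pushes the discordance probability toward its upper limit of 2/3, which is the regime described above for the primates.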