Sethupathy Praveen, Giang Hoa, Plotkin Joshua B, Hannenhalli Sridhar
Department of Genetics, School of Medicine, School of Engineering and Applied Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
PLoS One. 2008 Sep 10;3(9):e3137. doi: 10.1371/journal.pone.0003137.
It has been speculated that polymorphisms in the non-coding portion of the human genome underlie much of the phenotypic variability among humans and between humans and other primates. If so, these genomic regions may be undergoing rapid evolutionary change, due in part to natural selection. However, the non-coding region is a heterogeneous mix of functional and non-functional regions. Furthermore, the functional regions are composed of a variety of element types, each potentially under a different selection regime.
Using HapMap and Perlegen polymorphism data that map to a stringent set of putative binding sites in human proximal promoters, we apply the Derived Allele Frequency (DAF) distribution test of neutrality to provide evidence that many human-specific and primate-specific binding sites are likely evolving under positive selection. We also discuss inherent limitations of publicly available human SNP datasets that complicate the inference of selection pressures. Finally, we show that genes whose proximal binding sites contain high-frequency derived alleles are enriched for positive regulation of protein metabolism and for developmental processes. Thus, our genome-scale investigation provides evidence for positive selection on putative transcription factor binding sites in human proximal promoters.
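As a rough illustration of the idea behind a derived allele frequency (DAF) comparison, the sketch below computes the DAF of each SNP, bins the frequencies into a normalized histogram, and reports the fraction of SNPs whose derived allele is at high frequency. It is not the authors' pipeline: the function names, the 0.8 threshold, the bin count, and the toy counts are all assumptions made for illustration, and a real analysis would also need ancestral-state inference, a matched neutral background, and a significance test.

```python
from typing import List, Sequence, Tuple

# Each SNP is (derived_allele_count, chromosomes_sampled). Hypothetical input format.
Snp = Tuple[int, int]


def derived_allele_frequencies(snps: Sequence[Snp]) -> List[float]:
    """Return the DAF of every SNP that is polymorphic in the sample."""
    freqs = []
    for derived, total in snps:
        if total <= 0 or not 0 < derived < total:
            continue  # skip fixed or malformed sites
        freqs.append(derived / total)
    return freqs


def daf_histogram(freqs: Sequence[float], n_bins: int = 10) -> List[float]:
    """Bin DAF values into n_bins equal-width bins and normalize to proportions."""
    counts = [0] * n_bins
    for f in freqs:
        counts[min(int(f * n_bins), n_bins - 1)] += 1
    total = len(freqs)
    return [c / total for c in counts] if total else [0.0] * n_bins


def high_frequency_fraction(freqs: Sequence[float], threshold: float = 0.8) -> float:
    """Fraction of SNPs whose derived allele frequency is at or above threshold."""
    if not freqs:
        return 0.0
    return sum(f >= threshold for f in freqs) / len(freqs)


# Toy counts only, invented for this example (not data from the study).
binding_site_snps = [(110, 120), (100, 120), (12, 120), (60, 120)]
background_snps = [(6, 120), (12, 120), (30, 120), (108, 120)]

binding_daf = derived_allele_frequencies(binding_site_snps)
background_daf = derived_allele_frequencies(background_snps)

print("binding-site DAF histogram:", daf_histogram(binding_daf))
print("binding-site high-frequency fraction:", high_frequency_fraction(binding_daf))
print("background high-frequency fraction:", high_frequency_fraction(background_daf))
```

An excess of high-frequency derived alleles in the binding-site class relative to the background is the kind of skew that a DAF-based test looks for, although ancestral-allele misassignment and SNP ascertainment bias can produce a similar signal.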