Adaptive evolution and the birth of CTCF binding sites in the Drosophila genome.
Affiliation
Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL, USA.
Publication information
PLoS Biol. 2012;10(11):e1001420. doi: 10.1371/journal.pbio.1001420. Epub 2012 Nov 6.
Changes in the physical interaction between cis-regulatory DNA sequences and proteins drive the evolution of gene expression. However, it has proven difficult to accurately quantify the evolutionary rates of such binding changes or to estimate the relative effects of selection and drift in shaping binding evolution. Here we examine the genome-wide binding of CTCF in four species of Drosophila separated by between ∼2.5 and 25 million years. CTCF is a highly conserved protein known to be associated with insulator sequences in the genomes of human and Drosophila. Although the binding preference of CTCF is highly conserved, we find that CTCF binding itself is highly evolutionarily dynamic and has evolved adaptively. Between species, binding divergence increased linearly with evolutionary distance, and CTCF binding profiles are diverging rapidly, at a rate of 2.22% per million years (Myr). At least 89 new CTCF binding sites have originated in the Drosophila melanogaster genome since the most recent common ancestor with Drosophila simulans. Comparing these data to genome sequence data from 37 different strains of Drosophila melanogaster, we detected signatures of selection in both newly gained and evolutionarily conserved binding sites. Newly evolved CTCF binding sites show a significantly stronger signature of positive selection than older sites. Comparative gene expression profiling revealed that expression divergence of genes adjacent to CTCF binding sites is significantly associated with the gain and loss of CTCF binding. Further, the birth of new genes is associated with the birth of new CTCF binding sites. Our data indicate that binding of Drosophila CTCF protein has evolved under natural selection, and that CTCF binding evolution has shaped both the evolution of gene expression and genome evolution during the birth of new genes.