Institute for Theoretical Physics, University of Cologne, Köln, Germany.
PLoS Comput Biol. 2011 Oct;7(10):e1002167. doi: 10.1371/journal.pcbi.1002167. Epub 2011 Oct 6.
Turnover of regulatory sequence and function is an important part of molecular evolution. But what are the modes of sequence evolution leading to rapid formation and loss of regulatory sites? Here we show that a large fraction of neighboring transcription factor binding sites in the fly genome have formed from a common sequence origin by local duplications. This mode of evolution is found to produce regulatory information: duplications can seed new sites in the neighborhood of existing sites. Duplicate seeds evolve subsequently by point mutations, often towards binding a different factor than their ancestral neighbor sites. These results are based on a statistical analysis of 346 cis-regulatory modules in the Drosophila melanogaster genome, and a comparison set of intergenic regulatory sequence in Saccharomyces cerevisiae. In fly regulatory modules, pairs of binding sites show significantly enhanced sequence similarity up to distances of about 50 bp. We analyze these data in terms of an evolutionary model with two distinct modes of site formation: (i) evolution from independent sequence origin and (ii) divergent evolution following duplication of a common ancestor sequence. Our results suggest that pervasive formation of binding sites by local sequence duplications distinguishes the complex regulatory architecture of higher eukaryotes from the simpler architecture of unicellular organisms.
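The distance dependence of binding-site similarity described above can be illustrated with a minimal Python sketch. It is not the statistical model used in the study. It assumes a deliberately naive similarity measure (the fraction of matching nucleotides after truncating both sites to the shorter length) and groups site pairs into distance bins. The function names, the `bin_width` parameter, and the toy sequences are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def match_fraction(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_by_distance(sites, bin_width=10):
    """Mean pairwise similarity of sites, grouped by distance bin.

    sites: list of (start_position, sequence) tuples from one module.
    Returns a dict mapping the lower edge of each distance bin (in bp)
    to the mean match fraction of all site pairs falling in that bin.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (p1, s1), (p2, s2) in combinations(sites, 2):
        # Assumption: compare only the first n positions of each site.
        n = min(len(s1), len(s2))
        distance = abs(p2 - p1)
        bin_start = (distance // bin_width) * bin_width
        sums[bin_start] += match_fraction(s1[:n], s2[:n])
        counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}


# Toy data (hypothetical, not taken from the study): two nearby sites
# that differ at a single position, plus one distant, dissimilar site.
sites = [(0, "ACGTACGT"), (12, "ACGTACGA"), (95, "TTGCAAGC")]
result = similarity_by_distance(sites)
print(result)
```

In this toy input the nearby pair falls in the 10 bp bin with a higher match fraction than the two distant pairs. That is the qualitative pattern one would look for in real data, where the comparison would also need a background expectation for the similarity of unrelated sites.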