Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California, United States of America.
PLoS One. 2012;7(8):e43128. doi: 10.1371/journal.pone.0043128. Epub 2012 Aug 27.
Recent research supports the view that changes in gene regulation, as opposed to changes in the genes themselves, play a significant role in morphological evolution. Gene regulation is largely dependent on transcription factor binding sites. Researchers are now able to use the available 29 mammalian genomes to measure selective constraint at the level of binding sites. This detailed map of constraint suggests that mammalian genomes co-opt fragments of mobile elements to act as gene regulatory sequence on a large scale. In the human genome we detect over 280,000 putative regulatory elements, totaling approximately 7 Mb of sequence, that originated as mobile element insertions. These putative regulatory regions are conserved non-exonic elements (CNEEs), which show considerable cross-species constraint and signatures of continued negative selection in humans, yet do not appear in a known mature transcript. These putative regulatory elements were co-opted from SINE, LINE, LTR and DNA transposon insertions. We demonstrate that at least 11%, and an estimated 20%, of gene regulatory sequence in the human genome showing cross-species conservation was co-opted from mobile elements. The location in the genome of CNEEs co-opted from mobile elements closely resembles that of CNEEs in general, except in the centers of the largest gene deserts where recognizable co-option events are relatively rare. We find that regions of certain mobile element insertions are more likely to be held under purifying selection than others. In particular, we show 6 examples where paralogous instances of an often co-opted mobile element region define a sequence motif that closely matches a transcription factor's binding profile.
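As a rough illustration of the quantities in the abstract, the sketch below measures how much conserved non-exonic sequence overlaps annotated mobile elements, and works out the mean element length implied by the reported totals (about 280,000 elements over about 7 Mb). It is not the authors' pipeline, which the abstract does not describe. The function names, the toy coordinates, and the assumption that mobile element annotations on a chromosome do not overlap one another are all hypothetical.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed convention.
Interval = Tuple[str, int, int]


def overlap_bases(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals (0 if on different chromosomes)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def coopted_bases(cnees: List[Interval], mobile: List[Interval]) -> int:
    """Total CNEE bases covered by mobile element annotations.

    Assumes the mobile element intervals do not overlap each other,
    so no base is counted twice.
    """
    return sum(overlap_bases(c, m) for c in cnees for m in mobile)


def coopted_fraction(cnees: List[Interval], mobile: List[Interval]) -> float:
    """Fraction of all CNEE bases that lie inside a mobile element annotation."""
    total = sum(end - start for _, start, end in cnees)
    return coopted_bases(cnees, mobile) / total if total else 0.0


# Toy, made-up coordinates for illustration only.
toy_cnees = [("chr1", 100, 130), ("chr1", 200, 220), ("chr2", 50, 75)]
toy_mobile = [("chr1", 90, 120), ("chr2", 60, 300)]

toy_bases = coopted_bases(toy_cnees, toy_mobile)
toy_fraction = coopted_fraction(toy_cnees, toy_mobile)

# Mean element length implied by the abstract's approximate totals.
mean_length_bp = 7_000_000 / 280_000

print(toy_bases, round(toy_fraction, 3), mean_length_bp)
```

The implied mean length of roughly 25 bp per element is consistent with the abstract's emphasis on constraint measured at the scale of individual binding sites. A quadratic scan over all interval pairs is fine for a toy example, but a genome-scale comparison would need sorted intervals or an interval tree.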