Peterson Brant K, Hare Emily E, Iyer Venky N, Storage Steven, Conner Laura, Papaj Daniel R, Kurashima Rick, Jang Eric, Eisen Michael B
Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America.
PLoS One. 2009;4(3):e4688. doi: 10.1371/journal.pone.0004688. Epub 2009 Mar 4.
The identification of regulatory sequences in animal genomes remains a significant challenge. Comparative genomic methods that use patterns of evolutionary conservation to identify non-coding sequences with regulatory function have yielded many new vertebrate enhancers. However, these methods have not contributed significantly to the identification of regulatory sequences in sequenced invertebrate taxa. We demonstrate here that this differential success, which is often attributed to fundamental differences in the nature of vertebrate and invertebrate regulatory sequences, is instead primarily a product of the relatively small size of sequenced invertebrate genomes. We sequenced and compared loci involved in early embryonic patterning from four species of true fruit flies (family Tephritidae) that have genomes four to six times larger than that of Drosophila melanogaster. Unlike in Drosophila, where virtually all non-coding DNA is highly conserved, blocks of conserved non-coding sequence in tephritids are flanked by large stretches of poorly conserved sequence, similar to what is observed in vertebrate genomes. We tested the activities of nine conserved non-coding sequences flanking the even-skipped gene of the tephritid Ceratitis capitata in transgenic D. melanogaster embryos; six of them drove expression patterns that recapitulate those of known D. melanogaster enhancers. In contrast, none of the three non-conserved tephritid non-coding sequences that we tested drove expression in D. melanogaster embryos. Based on the landscape of non-coding conservation in tephritids, and our initial success in using conservation in tephritids to identify D. melanogaster regulatory sequences, we suggest that comparison of tephritid genomes may provide a systematic means to annotate the non-coding portion of the D. melanogaster genome.
We also propose that large genomes be given more consideration in the selection of species for comparative genomics projects, to provide increased power to detect functional non-coding DNAs and to provide a less biased view of the evolution and function of animal genomes.