Kawahara Yoshihiro, Matsuo Takashi, Nozawa Masafumi, Shin-I Tadasu, Kohara Yuji, Aigaki Toshiro
Department of Biological Sciences, Tokyo Metropolitan University, 1-1 Minami-osawa, Hachioji-shi, Tokyo 192-0397, Japan.
Genes Genet Syst. 2004 Dec;79(6):351-9. doi: 10.1266/ggs.79.351.
Comparative sequence analysis among closely related species is essential for investigating the evolution of non-coding sequences, which evolve more rapidly than protein-coding sequences. We sequenced the gene-dense region at cytogenetic map position 56F10-16 in D. simulans and D. sechellia, two species closely related to D. melanogaster. About 57 kb of genomic sequence containing 19 genes was annotated in each species by reference to the corresponding region of the D. melanogaster genome. The order and orientation of the genes were perfectly conserved among the three species, and no transposable elements were found. The rate of nucleotide substitution in the non-coding sequences was lower than that at fourfold-degenerate sites, implying functional constraints in the non-coding regions. Sequence information from three closely related species allowed us to estimate the insertions and deletions that may have occurred in the lineages of D. simulans and D. sechellia, using the D. melanogaster sequence as an outgroup. In the introns of D. simulans, deletions were twice as numerous as insertions. More remarkably, in the intergenic sequences of D. sechellia, deletions outnumbered insertions 7.5-fold. These results suggest that the non-coding sequences have been shortened by deletion bias. However, the deletion bias was weaker than that previously estimated for pseudogenes, suggesting that the non-coding sequences are already rich in functional elements, possibly involved in the regulation of gene expression, including transcription and pre-mRNA processing. These features of non-coding sequences may be shared by other gene-dense regions and may contribute to the compactness of the Drosophila genome.
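The outgroup-based polarization of indels described above can be illustrated with a minimal sketch. It assumes a simple parsimony rule applied to a three-way alignment, which is not necessarily the procedure the authors used: a gap found only in one ingroup species is counted as a deletion in that lineage, and a stretch present only in one ingroup species is counted as an insertion in that lineage. The function names and the toy alignment are hypothetical and serve only as illustration.

```python
from itertools import groupby


def classify_column(mel, sim, sec):
    """Polarize one alignment column, using mel as the outgroup.

    Returns (lineage, kind), or None when the column has no gap or the
    gap pattern cannot be assigned to a single ingroup lineage.
    """
    gaps = (mel == "-", sim == "-", sec == "-")
    if gaps == (False, True, False):
        return ("sim", "deletion")    # missing only in D. simulans
    if gaps == (False, False, True):
        return ("sec", "deletion")    # missing only in D. sechellia
    if gaps == (True, False, True):
        return ("sim", "insertion")   # present only in D. simulans
    if gaps == (True, True, False):
        return ("sec", "insertion")   # present only in D. sechellia
    return None                       # ungapped, or not polarizable


def count_indels(mel, sim, sec):
    """Count indel events per (lineage, kind) in an aligned triplet.

    Consecutive columns with the same call are merged into one event
    (an assumption: adjacent independent events are not separated).
    """
    if not (len(mel) == len(sim) == len(sec)):
        raise ValueError("aligned sequences must have equal length")
    calls = [classify_column(*col) for col in zip(mel, sim, sec)]
    counts = {}
    for call, _run in groupby(calls):
        if call is not None:
            counts[call] = counts.get(call, 0) + 1
    return counts


# Toy alignment, invented for illustration only (not data from the study).
mel = "ACGTACGT--AC"
sim = "AC--ACGTTTAC"
sec = "ACGTAC-T--AC"
counts = count_indels(mel, sim, sec)
```

The deletion-to-insertion ratio for a lineage is then the deletion count divided by the insertion count for that lineage. Gaps found only in the outgroup are left unclassified, because they could reflect either a deletion in D. melanogaster or an insertion in the common ancestor of the two ingroup species.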