Goto Naohisa, Kurokawa Ken, Yasunaga Teruo
Department of Genome Informatics, Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Japan.
Gene. 2007 Oct 15;401(1-2):172-80. doi: 10.1016/j.gene.2007.07.017. Epub 2007 Aug 1.
To date, the complete genome sequences of more than 250 organisms have been determined. This information can now be used to determine whether there exist any invariant sequences that are conserved among all organisms, from bacteria to plants, animals, and humans. The existence of invariant sequences would strongly suggest that these sequences have been inherited unchanged from the last common ancestor of all life, and that they have essential functions. We have developed a new software program to identify invariant sequences conserved among the currently sequenced genomes and applied this analysis to the complete genome sequences of 266 organisms. We have identified 3 invariant DNA sequences longer than or equal to 11 bp and 6 invariant amino acid sequences longer than or equal to 6 aa. The longest invariant DNA sequence, AAGTCGTAACAAGGT (15 bp), was found in the 16S/18S rRNA gene. Two 8 aa sequences, GHVDHGKT in IF2 and EF-Tu and DTPGHVDF in EF-G, were the longest invariant amino acid sequences detected. These sequences could be essential elements from the genome of the last common ancestor and may have remained unchanged throughout evolution.
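The core idea of searching for invariant sequences can be illustrated with a minimal sketch: collect every substring of length k from each sequence and keep only those present in all of them. This is not the authors' program, whose algorithm is not described in the abstract. The function names (`kmers`, `invariant_kmers`) and the three toy protein fragments are hypothetical, constructed only so that they share the reported 8 aa motif GHVDHGKT.

```python
from typing import Iterable, Optional, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def invariant_kmers(sequences: Iterable[str], k: int) -> Set[str]:
    """Return the k-mers that occur in every input sequence."""
    common: Optional[Set[str]] = None
    for seq in sequences:
        current = kmers(seq.upper(), k)
        # Intersect with what has survived so far.
        common = current if common is None else common & current
        if not common:
            break  # nothing can be invariant any more
    return common or set()


# Hypothetical toy fragments (NOT real protein sequences); each embeds
# the motif GHVDHGKT between different flanking residues.
toy_proteins = [
    "MAKGHVDHGKTLTA",
    "MSRGHVDHGKTTLV",
    "MTEGHVDHGKTSLL",
]

shared_8 = invariant_kmers(toy_proteins, 8)
shared_9 = invariant_kmers(toy_proteins, 9)
print(shared_8)  # the only 8-mer common to all three fragments
print(shared_9)  # empty: no 9-mer is shared
```

Holding full k-mer sets in memory is workable for short examples like this one; an analysis across hundreds of complete genomes would presumably need a more memory-efficient index, and for DNA it would also have to consider both strands, which this sketch does not.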