Chen Shuo, Lesnik Elena A, Hall Thomas A, Sampath Rangarajan, Griffey Richard H, Ecker Dave J, Blyn Lawrence B
Ibis Therapeutics, Isis Pharmaceuticals, Inc, 2292 Faraday Ave, Carlsbad, CA 92008, USA.
Biosystems. 2002 Mar-May;65(2-3):157-77. doi: 10.1016/s0303-2647(02)00013-8.
The recent explosion in available bacterial genome sequences has created a need to annotate important sequence and structural elements quickly, efficiently and accurately. Small non-coding RNAs (sRNAs) in particular have been difficult to predict. sRNAs play a number of important structural, catalytic and regulatory roles in the cell. Although a few groups have recently published methods for predicting and annotating sRNAs in bacterial genomes, much remains to be done in this field. Toward the goal of developing an efficient method for predicting unknown sRNA genes in the completed Escherichia coli genome, we adopted a bioinformatics approach that searches for DNA regions containing a sigma70 promoter within a short distance of a rho-independent terminator. Of the 227 candidate sRNA genes initially identified, 32 were previously described sRNAs, orphan tRNAs, or partial tRNA and rRNA operons. Fifty-one are mRNA genes encoding annotated, extremely small open reading frames (ORFs) that follow an acceptable ribosome binding site. The remaining 144 are potentially novel, non-translated sRNA genes. Using total RNA isolated from E. coli MG1655 cells grown under four different conditions, we verified transcripts of some of these genes by Northern hybridization. Here we summarize our data and discuss the rules, advantages and disadvantages of using this approach to annotate sRNA genes in bacterial genomes.
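The core search step, pairing a sigma70 promoter with a rho-independent terminator a short distance downstream, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes promoter and terminator positions on one strand have already been predicted by separate tools. The function name and the distance window (`min_len`, `max_len`) are hypothetical, since the abstract says only "a short distance".

```python
from bisect import bisect_left
from typing import List, Tuple


def pair_promoters_with_terminators(
    promoter_ends: List[int],
    terminator_starts: List[int],
    min_len: int = 45,   # assumed lower bound on transcript length (hypothetical)
    max_len: int = 350,  # assumed upper bound on transcript length (hypothetical)
) -> List[Tuple[int, int]]:
    """Pair each promoter with every terminator lying min_len..max_len
    bases downstream on the same strand. Coordinates are plain integers."""
    terminators = sorted(terminator_starts)
    pairs = []
    for p in sorted(promoter_ends):
        # First terminator at least min_len bases downstream of the promoter.
        i = bisect_left(terminators, p + min_len)
        # Collect terminators until the distance exceeds max_len.
        while i < len(terminators) and terminators[i] - p <= max_len:
            pairs.append((p, terminators[i]))
            i += 1
    return pairs


# Toy coordinates, invented for illustration only.
candidates = pair_promoters_with_terminators(
    promoter_ends=[100, 1000, 5000],
    terminator_starts=[250, 1500, 5040, 5200],
)
print(candidates)  # [(100, 250), (5000, 5200)]
```

In a real pipeline, each candidate pair would then be sorted into the categories the abstract describes (known RNAs, small annotated ORFs with a ribosome binding site, or potentially novel sRNAs). That step depends on genome annotation and is not shown here.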