Rangannan Vetriselvi, Bansal Manju
Molecular Biophysics Unit, Indian Institute of Science, Bangalore-560 012, India.
BMC Res Notes. 2011 Jul 22;4:257. doi: 10.1186/1756-0500-4-257.
As more genomes are sequenced, summarizing their genomic features and annotating the functional elements that control the expression of each gene or transcription unit remains a fundamental challenge in genomics and bioinformatics.
The relative stability of DNA sequence has been used to predict promoter regions in 913 microbial genome sequences with GC content ranging from 16.6% to 74.9%. The relative-stability-based promoter prediction method has previously been shown to be robust in terms of recall and precision, irrespective of genome GC content. The predicted promoter regions for the 913 microbial genomes have been collected in a database called PromBase, which can be searched either by gene name or by genomic position. Each predicted promoter region is assigned to one of five reliability classes (low, medium, high, very high and highest) based on the difference between its average free energy and that of the downstream region. The recall and precision values for each class are shown graphically in PromBase. In addition, PromBase provides detailed information about base composition, CDS and CG/TA skews for each genome, along with various DNA sequence-dependent structural properties (average free energy, curvature and bendability) in the vicinity of all annotated translation start sites (TLS).
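The reliability classes described above rest on a free-energy difference between a predicted promoter region and the region downstream of it. The following is a minimal sketch of that idea, not the PromBase implementation. The dinucleotide free-energy table holds illustrative nearest-neighbor values in kcal/mol (the abstract does not say which parameter set is used), and the function names, the per-step averaging, and the class cutoffs are all hypothetical choices made for the example.

```python
from bisect import bisect_right

# Illustrative nearest-neighbor stacking free energies (kcal/mol) for all 16
# dinucleotide steps. ASSUMPTION: these values are placeholders for whatever
# parameter set PromBase actually uses, which the abstract does not specify.
NN_DG = {
    "AA": -1.00, "TT": -1.00,
    "AT": -0.88, "TA": -0.58,
    "CA": -1.45, "TG": -1.45,
    "GT": -1.44, "AC": -1.44,
    "CT": -1.28, "AG": -1.28,
    "GA": -1.30, "TC": -1.30,
    "CG": -2.17, "GC": -2.24,
    "GG": -1.84, "CC": -1.84,
}

# The five class names come from the abstract; the cutoffs are HYPOTHETICAL.
CLASSES = ["low", "medium", "high", "very high", "highest"]
HYPOTHETICAL_CUTOFFS = [0.2, 0.4, 0.6, 0.8]  # kcal/mol per dinucleotide step


def average_free_energy(seq: str) -> float:
    """Mean stacking free energy over all overlapping dinucleotide steps."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("need at least two bases")
    steps = [NN_DG[seq[i:i + 2]] for i in range(len(seq) - 1)]
    return sum(steps) / len(steps)


def stability_difference(promoter_region: str, downstream_region: str) -> float:
    """Average free energy of the promoter region minus that of the
    downstream region. A positive value means the promoter region is
    less stable (less negative free energy) than the downstream region."""
    return average_free_energy(promoter_region) - average_free_energy(downstream_region)


def reliability_class(difference: float, cutoffs=HYPOTHETICAL_CUTOFFS) -> str:
    """Map a free-energy difference onto one of the five class names."""
    return CLASSES[bisect_right(cutoffs, difference)]


if __name__ == "__main__":
    # Toy sequences: an AT-rich stretch versus a GC-rich stretch.
    upstream = "TATAATTTTAAATATTTAAT"
    downstream = "GCGGCCGCTGGCGCAGGCCG"
    d = stability_difference(upstream, downstream)
    print(f"difference = {d:.2f} kcal/mol, class = {reliability_class(d)}")
```

Averaging per dinucleotide step keeps regions of different lengths comparable; a windowed sum would serve equally well as long as the cutoffs are rescaled to match.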
PromBase is a database of predicted promoter regions and detailed analyses of various genomic features for 913 microbial genomes. It can serve as a valuable resource for comparative genomics studies and can help experimentalists rapidly access detailed information on genomic features and putative promoter regions in any given genome. The database is freely accessible to academic and non-academic users via the World Wide Web at http://nucleix.mbu.iisc.ernet.in/prombase/.