Gromiha M Michael, Yabuki Yukimitsu, Kundu Srinesh, Suharnan Sivasundaram, Suwa Makiko
Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST) AIST Tokyo Waterfront Bio-IT Research Building, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan.
Nucleic Acids Res. 2007 Jan;35(Database issue):D314-6. doi: 10.1093/nar/gkl805. Epub 2006 Nov 6.
We have developed the database, TMBETA-GENOME, for annotated beta-barrel membrane proteins in genomic sequences using statistical methods and machine learning algorithms. The statistical methods are based on amino acid composition, reside pair preference and motifs. In machine learning techniques, the combination of amino acid and dipeptide compositions has been used as main attributes. In addition, annotations have been made using the criterion based on the identification of beta-barrel membrane proteins and exclusion of globular and transmembrane helical proteins. A web interface has been developed for identifying the annotated beta-barrel membrane proteins in all known genomes. The users have the feasibility of selecting the genome from the three kingdoms of life, archaea, bacteria and eukaryote, and five different methods. Further, the statistics for all genomes have been provided along with the links to different algorithms and related databases. It is freely available at http://tmbeta-genome.cbrc.jp/annotation/.
我们利用统计方法和机器学习算法开发了数据库TMBETA - GENOME,用于注释基因组序列中的β - 桶状膜蛋白。统计方法基于氨基酸组成、残基对偏好和基序。在机器学习技术中,氨基酸和二肽组成的组合被用作主要属性。此外,已使用基于β - 桶状膜蛋白鉴定以及球状和跨膜螺旋蛋白排除的标准进行注释。已开发出一个网络界面,用于在所有已知基因组中鉴定注释的β - 桶状膜蛋白。用户可以从生命的三个界,即古细菌、细菌和真核生物中选择基因组,并采用五种不同方法。此外,还提供了所有基因组的统计信息以及到不同算法和相关数据库的链接。该数据库可在http://tmbeta - genome.cbrc.jp/annotation/免费获取。