Bruccoleri R E, Dougherty T J, Davison D B
Bristol-Myers Squibb, Pharmaceutical Research Institute, PO Box 4000, Princeton, NJ 08543-4000, USA.
Nucleic Acids Res. 1998 Oct 1;26(19):4482-6. doi: 10.1093/nar/26.19.4482.
The set of proteins which are conserved across families of microbes contains important targets of new anti-microbial agents. We have developed a simple and efficient computational tool which determines concordances of putative gene products, that is, sets of proteins conserved across one set of user-specified genomes and not present in another set of user-specified genomes. The thresholds and the homology scoring criterion are selectable to allow the user to decide the stringency of the homologies. The system uses a relational database to store protein coding regions from different genomes, and to store the results of a complete comparison of all sequences against all sequences using the FASTA program. Using Web technology, the display of all the related proteins for a given sequence and calculation of multiple sequence alignments (using CLUSTALW) can be performed with the click of a button. The current database holds 97,365 sequences from 19 complete or partial genomes and 8,798,905 FASTA comparison results. An example concordance is presented which demonstrates that the target of the quinolone antibiotics could have been identified using this tool.
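The concordance idea described above can be illustrated with a minimal in-memory sketch. It is not the authors' implementation, which stores FASTA results in a relational database. The function name `concordance`, the tuple layout of `hits`, the single numeric `threshold` (higher score taken to mean stronger homology), and the protein and genome labels are all assumptions made for illustration.

```python
from collections import defaultdict


def concordance(hits, include, exclude, threshold):
    """Return proteins conserved across `include` genomes and absent from `exclude`.

    hits: iterable of (query_id, query_genome, target_genome, score) tuples,
          a hypothetical flattening of pairwise comparison results.
    A protein qualifies if it belongs to an included genome, has a hit scoring
    at least `threshold` in every other included genome, and has no such hit
    in any excluded genome. Proteins with no recorded hits are not considered.
    """
    include, exclude = set(include), set(exclude)
    best = defaultdict(dict)  # query_id -> {target_genome: best score seen}
    genome_of = {}            # query_id -> genome the query protein comes from
    for query, q_genome, t_genome, score in hits:
        genome_of[query] = q_genome
        if score > best[query].get(t_genome, float("-inf")):
            best[query][t_genome] = score

    selected = []
    for query, per_genome in best.items():
        home = genome_of[query]
        if home not in include:
            continue
        matched = {g for g, s in per_genome.items() if s >= threshold}
        required = include - {home}
        if required <= matched and not (matched & exclude):
            selected.append(query)
    return sorted(selected)


# Hypothetical data: genomes A and B are the "conserved in" set, C is excluded.
hits = [
    ("protA1", "A", "B", 900.0),
    ("protA1", "A", "C", 40.0),   # weak hit in C, below the threshold
    ("protA2", "A", "B", 500.0),
    ("protA2", "A", "C", 300.0),  # strong hit in C, so protA2 is rejected
    ("protB1", "B", "A", 900.0),
]
result = concordance(hits, include={"A", "B"}, exclude={"C"}, threshold=100.0)
print(result)  # ['protA1', 'protB1']
```

Raising or lowering `threshold` plays the role of the user-selectable stringency: a stricter value shrinks the set of matched genomes, which makes the "conserved in" condition harder to meet and the "not present in" condition easier.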