Bansal A K
Department of Mathematics and Computer Science, Kent State University, OH 44242, USA.
Bioinformatics. 1999 Nov;15(11):900-8. doi: 10.1093/bioinformatics/15.11.900.
As sequenced genomes grow larger and sequencing becomes faster, accurate automated genome comparison techniques and databases are needed to facilitate the derivation of genome functionality; the identification of enzymes, putative operons, and metabolic pathways; and the phylogenetic classification of microbes.
This paper extends an automated pair-wise genome comparison technique (Bansal et al., Math. Model. Sci. Comput., 9, 1-23, 1998; Bansal and Bork, in First International Workshop of Declarative Languages, Springer, pp. 275-289, 1999), used to identify orthologs and gene groups, to derive orthologous genes across a group of genomes and to identify genes with conserved functionality. Seventeen microbial genomes archived at ftp://ncbi.nlm.nih.gov/genbank/genomes were compared using the automated technique. Data on orthologs, gene groups, gene duplication, gene fusion, orthologs with conserved functionality, and genes specifically orthologous to Escherichia coli and pathogens are presented and analyzed.
A prototype database is available at ftp://www.mcs.kent.edu/arvind/intellibio/orthos.html. The software is free for academic research under an academic license. The detailed database for every microbial genome in NCBI is commercially available through intellibio software and consultancy corporation (Web site: http://www.mcs.kent.edu/~arvind/intellibio.html).