Brief Bioinform. 2019 Jul 19;20(4):1063-1070. doi: 10.1093/bib/bbx117.
For the past 20 years, the Clusters of Orthologous Genes (COG) database has been a popular tool for microbial genome annotation and comparative genomics. Initially created for the evolutionary classification of protein families, the COGs have been used, apart from straightforward functional annotation of sequenced genomes, for such tasks as (i) unification of genome annotation in groups of related organisms; (ii) identification of missing and/or undetected genes in complete microbial genomes; (iii) analysis of genomic neighborhoods, in many cases allowing prediction of novel functional systems; (iv) analysis of metabolic pathways and prediction of alternative forms of enzymes; (v) comparison of organisms by COG functional categories; and (vi) prioritization of targets for structural and functional characterization. Here we review the principles of the COG approach and discuss its key advantages and drawbacks in microbial genome analysis.