Improvement of domain-level ortholog clustering by optimizing domain-specific sum-of-pairs score.

Affiliation

National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki 444-8585, Japan.

Publication information

BMC Bioinformatics. 2014 May 18;15:148. doi: 10.1186/1471-2105-15-148.

Abstract

BACKGROUND

Identification of ortholog groups is a crucial step in comparative analysis of multiple genomes. Although several computational methods have been developed to create ortholog groups, most of them do not evaluate orthology at the sub-gene level. Our method for domain-level ortholog clustering, DomClust, splits proteins into domains on the basis of alignment boundaries identified by all-against-all pairwise comparison, but it often fails to determine appropriate boundaries.

RESULTS

We developed a method to improve domain-level ortholog classification using multiple alignment information. This method is based on a scoring scheme, the domain-specific sum-of-pairs (DSP) score, which evaluates ortholog clustering results at the domain level as the sum total of domain-level alignment scores. We developed a refinement pipeline, DomRefine, that improves domain-level clustering by optimizing the DSP score. We applied DomRefine to domain-level ortholog groups created by DomClust using a dataset obtained from the Microbial Genome Database for Comparative Analysis (MBGD), and evaluated the results using COG clusters and TIGRFAMs models as the reference data. We observed that the agreement between the resulting classification and the classifications in the reference databases improved at almost every step in the refinement pipeline. Moreover, the refined classification showed better agreement than the classifications in the eggNOG databases when TIGRFAMs was used as the reference database.
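The abstract does not give the exact definition of the DSP score, so the following is only a minimal sketch of the general idea: score a domain-level split of a multiple alignment as the sum of sum-of-pairs scores computed within each domain, then pick the boundary that maximizes that sum. The scoring constants, the single-boundary model, and the function names (`pair_score`, `sum_of_pairs`, `dsp_score`, `best_boundary`) are illustrative assumptions, not DomRefine's implementation.

```python
from itertools import combinations

# Toy scoring parameters (assumptions; not the values used in the paper).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a, b):
    """Score one aligned column for one pair of sequences."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def sum_of_pairs(rows):
    """Sum of pairwise column scores over equal-length aligned rows."""
    return sum(
        pair_score(a, b)
        for r1, r2 in combinations(rows, 2)
        for a, b in zip(r1, r2)
    )


def dsp_score(alignment, boundary):
    """Split every row at column `boundary` and add the sum-of-pairs
    scores of the two resulting domain groups. A row that is all gaps
    within a domain is treated as lacking that domain and is excluded."""
    total = 0
    for start, end in ((0, boundary), (boundary, None)):
        segments = [row[start:end] for row in alignment]
        members = [seg for seg in segments if seg.strip("-")]
        total += sum_of_pairs(members)
    return total


def best_boundary(alignment):
    """Return the split column with the highest domain-level score."""
    width = len(alignment[0])
    return max(range(1, width), key=lambda b: dsp_score(alignment, b))
```

In this toy model, a boundary placed between two domains avoids the gap penalties that arise when a protein lacking one domain is aligned against proteins that have it, which is why optimizing a score of this kind can move a misplaced boundary toward the true one.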

CONCLUSIONS

DomRefine is a useful tool for improving the quality of domain-level ortholog classification among microbial genomes. Combined with a rapid domain-level ortholog clustering method, such as DomClust, it can be used to create a high-quality ortholog database that can serve as a solid basis for various comparative genome analyses.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c3b0/4035852/1718d68b3ca8/1471-2105-15-148-1.jpg
