Quantitative assessment of relationship between sequence similarity and function similarity.

Author information

Joshi Trupti, Xu Dong

Affiliation

Digital Biology Laboratory, Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA.

Publication information

BMC Genomics. 2007 Jul 9;8:222. doi: 10.1186/1471-2164-8-222.

DOI:10.1186/1471-2164-8-222

PMID:17620139

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1949826/

Abstract

BACKGROUND

Comparative sequence analysis is considered the first step towards annotating new proteins in genome annotation. However, sequence comparison may create and propagate function assignment errors. Thus, it is important to analyze the quality of sequence-based function assignment thoroughly and systematically using large-scale data.

RESULTS

We present an analysis of the relationship between sequence similarity and function similarity for the proteins in four model organisms, i.e., Arabidopsis thaliana, Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster. Using a measure of functional similarity based on the three categories of Gene Ontology (GO) classifications (biological process, molecular function, and cellular component), we quantified the correlation between functional similarity and sequence similarity, with sequence similarity measured by sequence identity or by the statistical significance of the alignment, and compared this correlation against that of randomly chosen protein pairs.
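As a rough illustration of the kind of measurement described above, the following Python sketch scores functional similarity between two proteins and averages it within sequence-identity bins. The abstract does not give the scoring details, so everything here is an assumption made for illustration: the hyphen-separated hierarchical index format (e.g. "1-4-3-20"), the shared-prefix scoring rule, the 10% bin width, and all function names are hypothetical and are not taken from the paper.

```python
from statistics import mean


def shared_depth(index_a: str, index_b: str) -> int:
    """Count the leading levels two hierarchical indices have in common.

    Assumes (hypothetically) that an annotation is written as a
    hyphen-separated path from the ontology root, e.g. "1-4-3-20".
    """
    depth = 0
    for a, b in zip(index_a.split("-"), index_b.split("-")):
        if a != b:
            break
        depth += 1
    return depth


def functional_similarity(indices_a, indices_b) -> int:
    """Best shared depth over all annotation pairs of two proteins."""
    return max(
        (shared_depth(a, b) for a in indices_a for b in indices_b),
        default=0,
    )


def mean_similarity_by_identity(pairs, bin_width=10):
    """Average functional similarity within sequence-identity bins.

    `pairs` is an iterable of (percent_identity, similarity) tuples.
    Returns {bin_lower_bound: mean_similarity}, ordered by bin.
    """
    bins = {}
    for identity, sim in pairs:
        # Clamp so that 100% identity falls into the top bin.
        lower = min(int(identity // bin_width) * bin_width, 100 - bin_width)
        bins.setdefault(lower, []).append(sim)
    return {lower: mean(vals) for lower, vals in sorted(bins.items())}
```

Running the same binning on randomly chosen protein pairs would give a baseline against which the aligned pairs can be compared.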

CONCLUSION

Various sequence-function relationships were identified from BLAST versus PSI-BLAST, sequence identity versus Expectation Value, GO indices versus semantic similarity approaches, and within genome versus between genome comparisons, for the three GO categories. Our study provides a benchmark to estimate the confidence in assignment of functions purely based on sequence similarity.
