Lopez Daniel, Pazos Florencio
National Centre for Biotechnology, Madrid, Spain.
Proteins. 2009 Aug 15;76(3):598-607. doi: 10.1002/prot.22373.
Most proteins are organized in domains, which can be seen as independent modular units in terms of molecular function (MF). Nevertheless, current functional annotations are made on a "whole-chain" basis, without associating specific functions with individual domains. We present here an automatic method for discerning which particular structural domain within a protein is responsible for a given MF originally attributed to the whole protein. By annotating the SCOP structural domains with Gene Ontology terms using this method, we obtained the first large-scale functional annotation at the domain level. We performed a large-scale comparison of these annotations with those implicit in the functional annotations of InterPro signatures, showing that this method performs better overall. We also discuss some particular examples in detail. Generated automatically and available online, this resource could serve as the basis for future manually curated annotations.