Pal Debnath, Eisenberg David
UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA 90095, USA.
Structure. 2005 Jan;13(1):121-30. doi: 10.1016/j.str.2004.10.015.
Structural genomics has brought us three-dimensional structures of proteins with unknown functions. To shed light on such structures, we have developed ProKnow (http://www.doe-mbi.ucla.edu/Services/ProKnow/), which annotates proteins with Gene Ontology functional terms. The method extracts features from the protein such as 3D fold, sequence, motif, and functional linkages and relates them to function via the ProKnow knowledgebase of features, which links features to annotated functions via annotation profiles. Bayes' theorem is used to compute weights of the functions assigned, using likelihoods based on the extracted features. The description level of the assigned function is quantified by the ontology depth (from 1 = general to 9 = specific). Jackknife tests show approximately 89% correct assignments at ontology depth 1 and 40% at depth 9, with 93% coverage of 1507 distinct folded proteins. Overall, about 70% of the assignments were inferred correctly. This level of performance suggests that ProKnow is a useful resource in functional assessments of novel proteins.
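The Bayesian weighting step described in the abstract can be illustrated with a minimal sketch. This is not ProKnow's implementation. It assumes that features are conditionally independent given a function (a naive Bayes simplification the abstract does not confirm), and every number, feature label, and function name below (`posterior_weights`, `fold:TIM_barrel`, `motif:hypothetical_1`) is a hypothetical placeholder chosen for illustration. The two Gene Ontology identifiers are real terms (catalytic activity and binding), used here only as example keys.

```python
from math import prod

def posterior_weights(priors, likelihoods, observed, floor=1e-6):
    """Combine per-feature likelihoods into posterior weights via Bayes' theorem.

    priors:      {go_term: P(term)}
    likelihoods: {go_term: {feature: P(feature | term)}}
    observed:    features extracted from the query protein
    floor:       assumed small likelihood for features never seen with a term
    """
    unnormalized = {
        term: prior * prod(likelihoods[term].get(f, floor) for f in observed)
        for term, prior in priors.items()
    }
    total = sum(unnormalized.values())
    # Normalizing by the evidence term makes the weights sum to 1.
    return {term: value / total for term, value in unnormalized.items()}

# Hypothetical toy knowledgebase: two candidate GO terms, two features.
priors = {"GO:0003824": 0.5, "GO:0005488": 0.5}
likelihoods = {
    "GO:0003824": {"fold:TIM_barrel": 0.6, "motif:hypothetical_1": 0.3},
    "GO:0005488": {"fold:TIM_barrel": 0.2, "motif:hypothetical_1": 0.1},
}
observed = ["fold:TIM_barrel", "motif:hypothetical_1"]

weights = posterior_weights(priors, likelihoods, observed)
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.2f}")
```

In this toy case the term whose features are more likely under the observed evidence receives the larger weight. A real system would also need to account for the ontology depth of each term, which this sketch leaves out.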