Vasudevan Sona, Vinayaka C R, Natale Darren A, Huang Hongzhan, Kahsay Robel Y, Wu Cathy H
Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, DC, USA.
Methods Mol Biol. 2011;694:91-105. doi: 10.1007/978-1-60761-977-2_7.
The rapid growth of protein sequence databases has necessitated the development of methods to computationally derive annotation for uncharacterized entries. Most such methods focus on "global" annotation, such as molecular function or biological process. Methods that supply high-accuracy "local" annotation of functional sites, based on structural information at the level of individual amino acids, are relatively rare. In this chapter we describe a method we have developed for annotating functional residues within experimentally uncharacterized proteins. The method relies on position-specific site annotation rules (PIR Site Rules) derived from structural and experimental information. These PIR Site Rules are manually defined to allow for conditional propagation of annotation. Each rule specifies a tripartite set of conditions: a candidate for annotation must pass a whole-protein classification test (that is, have an end-to-end match to a whole-protein-based HMM), match a site-specific profile HMM, and, finally, match the functionally and structurally characterized residues of a template. Positive matches trigger the appropriate annotation for active site residues, binding site residues, modified residues, or other functionally important amino acids. The strict criteria used in this process yield high-confidence annotation suitable for UniProtKB/Swiss-Prot features.
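The tripartite condition described in the abstract can be illustrated with a minimal Python sketch. It is not the PIR implementation: the class names, field names, and the `apply_rule` function are hypothetical, the two HMM tests are assumed to have been run elsewhere and are represented only as boolean flags, and the all-or-nothing treatment of template residues is an assumption made for the sake of the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SiteRule:
    """Hypothetical stand-in for one site rule."""
    rule_id: str
    feature: str  # e.g. "active site" or "binding site"
    # template position -> residues accepted at that position
    template_residues: dict[int, set[str]] = field(default_factory=dict)


@dataclass
class Candidate:
    """Precomputed evidence for one uncharacterized protein (hypothetical)."""
    accession: str
    passes_family_hmm: bool  # end-to-end match to the whole-protein HMM
    passes_site_hmm: bool    # match to the site-specific profile HMM
    # template position -> (position in candidate, residue in candidate)
    aligned: dict[int, tuple[int, str]] = field(default_factory=dict)


def apply_rule(rule: SiteRule, cand: Candidate) -> list[tuple[int, str, str]]:
    """Return (position, residue, feature) annotations, or [] if any test fails."""
    # Condition 1: whole-protein classification test.
    if not cand.passes_family_hmm:
        return []
    # Condition 2: site-specific profile HMM.
    if not cand.passes_site_hmm:
        return []
    # Condition 3: every template residue must be matched in the candidate
    # (assumed here to be all-or-nothing).
    annotations = []
    for template_pos, allowed in rule.template_residues.items():
        hit = cand.aligned.get(template_pos)
        if hit is None or hit[1] not in allowed:
            return []
        annotations.append((hit[0], hit[1], rule.feature))
    return sorted(annotations)
```

Ordering the checks from the cheapest, broadest test to the residue-level comparison mirrors the order given in the abstract, and returning an empty list on any failure keeps annotation propagation strictly conditional.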