Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics and Initiative in Systems Biology, Columbia University, New York, New York 10032, USA.
Protein Sci. 2013 Apr;22(4):359-66. doi: 10.1002/pro.2225. Epub 2013 Feb 21.
We outline a set of strategies to infer protein function from structure. The overall approach depends on extensive use of homology modeling, the exploitation of a wide range of global and local geometric relationships between protein structures, and the use of machine learning techniques. The combination of modeling with broad searches of protein structure space defines a "structural BLAST" approach to infer function with high genomic coverage. We describe applications to the prediction of protein-protein and protein-ligand interactions. In the context of protein-protein interactions, our structure-based prediction algorithm, PrePPI, has accuracy comparable to that of high-throughput experiments. An essential feature of PrePPI is the use of Bayesian methods to combine structure-derived information with non-structural evidence (e.g. co-expression) to assign a likelihood to each predicted interaction. This, combined with a structural BLAST approach, significantly expands the range of applications of protein structure in the annotation of protein function, including systems-level biological applications where structure has previously played little role.
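The Bayesian combination of evidence mentioned in the abstract can be illustrated with a minimal sketch. It is not the PrePPI implementation: it assumes a naive Bayes scheme in which each evidence source contributes a likelihood ratio and the sources are treated as conditionally independent, which the abstract does not state. The function name, the evidence labels, and all numeric values are hypothetical and chosen only for illustration.

```python
from math import prod


def combine_evidence(likelihood_ratios, prior_odds):
    """Combine independent evidence sources into a posterior probability.

    Assumption (not taken from the abstract): a naive Bayes scheme, so
    posterior odds = prior odds * product of per-source likelihood ratios.
    """
    if prior_odds <= 0:
        raise ValueError("prior_odds must be positive")
    if any(lr <= 0 for lr in likelihood_ratios.values()):
        raise ValueError("likelihood ratios must be positive")

    # Multiply the prior odds by every likelihood ratio.
    posterior_odds = prior_odds * prod(likelihood_ratios.values())

    # Convert odds to a probability in (0, 1).
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical values: one structure-derived and one non-structural source.
evidence = {"structural_model": 50.0, "co_expression": 4.0}
probability = combine_evidence(evidence, prior_odds=1.0 / 1000.0)
print(f"posterior probability of interaction: {probability:.3f}")
```

With these made-up numbers the posterior odds are 0.001 x 50 x 4 = 0.2, which corresponds to a probability of about 0.167. Neither source alone would raise the probability that far, which illustrates why combining structural and non-structural evidence can be useful.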