Malisi Christoph, Kohlbacher Oliver, Höcker Birte
Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany.
Proteins. 2009 Oct;77(1):74-83. doi: 10.1002/prot.22418.
A major goal of computational protein design is the construction of novel functions on existing protein scaffolds. The first question is which scaffold is suitable for a specific reaction. Given a set of catalytic residues and their spatial arrangement, one wants to identify a protein scaffold that can host this active site. Here, we present an algorithm called ScaffoldSelection that is able to rapidly search large sets of protein structures for potential attachment sites of an enzymatic motif. The method consists of two steps: it first identifies pairs of backbone positions in pocket-like regions, and then combines these pairs into complete attachment sites using a graph-theoretical approach. Identified matches are assessed for their ability to accommodate the substrate or transition state. A representative set of structures from the Protein Data Bank (approximately 3500) was searched for backbone geometries that support the catalytic residues for 12 chemical reactions. Recapitulation of native active site geometries is used as a benchmark for the performance of the program. The native motif is identified in all 12 test cases and ranks in the top percentile in 5 of them. The algorithm is fast and efficient, although its running time depends on the complexity of the motif. Comparisons to other methods show that ScaffoldSelection performs equally well in terms of accuracy and far better in terms of speed. Thus, ScaffoldSelection will aid future computational protein design experiments by preselecting protein scaffolds that are suitable for a specific reaction type and for the introduction of a predefined amino acid motif.
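The two-step idea described in the abstract (find compatible pairs of backbone positions, then combine them into complete sites) can be illustrated with a small sketch. This is not the ScaffoldSelection implementation. It assumes, purely for illustration, that a motif is reduced to target distances between residue pairs, that a scaffold is a list of 3D backbone coordinates, and that pairs are combined by a clique-style backtracking search. The names `compatible_pairs`, `attachment_sites`, and the tolerance `tol` are hypothetical, and pocket detection and the substrate-accommodation check are omitted.

```python
from itertools import combinations
from math import dist


def compatible_pairs(motif_dist, coords, tol=0.5):
    """Step 1 (illustrative): for each motif residue pair (i, j), collect the
    scaffold position pairs (p, q) whose distance matches the motif distance
    within tol. Both orientations are stored so either position can host i."""
    pairs = {}
    for (i, j), target in motif_dist.items():
        hits = set()
        for p, q in combinations(range(len(coords)), 2):
            if abs(dist(coords[p], coords[q]) - target) <= tol:
                hits.add((p, q))
                hits.add((q, p))
        pairs[(i, j)] = hits
    return pairs


def attachment_sites(n_residues, n_positions, pairs):
    """Step 2 (illustrative): grow assignments one motif residue at a time,
    keeping a position only if it is pairwise compatible with every residue
    already placed. Complete assignments correspond to cliques in the graph
    whose edges are the compatible pairs from step 1."""
    sites = []

    def extend(assigned):
        k = len(assigned)
        if k == n_residues:
            sites.append(tuple(assigned))
            return
        for p in range(n_positions):
            if p in assigned:
                continue
            if all((assigned[i], p) in pairs[(i, k)] for i in range(k)):
                extend(assigned + [p])

    extend([])
    return sites


# Toy scaffold: four backbone positions (hypothetical coordinates).
coords = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (10, 10, 10)]
# Toy motif: three residues with target pairwise distances, keyed by (i, j), i < j.
motif_dist = {(0, 1): 3.0, (0, 2): 4.0, (1, 2): 5.0}

pairs = compatible_pairs(motif_dist, coords)
sites = attachment_sites(3, len(coords), pairs)
print(sites)  # each tuple maps motif residue index -> scaffold position
```

Filtering on pairs first keeps the combination step small, since only positions that already satisfy every pairwise constraint are ever extended. That is one plausible reason a pair-then-combine design can be fast, though the abstract does not give the details of the actual search.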