Yang Qingyi, Sharp Kim A
Johnson Research Foundation and Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.
Proteins. 2009 Feb 15;74(3):682-700. doi: 10.1002/prot.22184.
We describe a method for efficiently generating ensembles of alternate, all-atom protein structures that (a) differ significantly from the starting structure, (b) have good stereochemistry (bonded geometry), and (c) have good steric properties (absence of atomic overlap). The method reconstructs all-atom structures from a series of backbone framework structures, which are obtained from a modified elastic network model (ENM) by perturbation along low-frequency normal modes. To ensure good-quality backbone frameworks, the single-force-parameter ENM is modified by introducing two more force parameters that characterize the interactions between consecutive Cα atoms and between Cα atoms within the same secondary structure domain. The relative stiffness of the three parameters is parameterized to reproduce B-factors while maintaining good bonded geometry. After parameterization, violations of experimental Cα-Cα distances and Cα-Cα-Cα pseudo angles along the backbone are reduced to less than 1%. Simultaneously, the average B-factor correlation coefficient improves to R = 0.77. Two applications illustrate the potential of the approach. (1) A total of 102,051 protein backbones spanning a conformational space of 15 Å root mean square deviation were generated from 148 nonredundant proteins in the PDB, and all-atom models with minimal bonded and nonbonded violations were produced from this ensemble of backbone structures using the SCWRL side chain building program. (2) Improved backbone templates were generated for homology modeling. Fifteen query sequences were each modeled on two targets. For each of the 30 target frameworks, dozens of improved templates could be produced. In all cases, improved full-atom homology models resulted, of which 50% could be identified blind using the D-Fire statistical potential.
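The backbone-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an anisotropic-network-style Hessian over Cα coordinates, a 10 Å contact cutoff, and placeholder spring constants (`k_bond`, `k_ss`, `k_other`), since the abstract does not give the fitted values. The function names, the per-residue secondary-structure labels, and the `toy_helix` test structure are all hypothetical.

```python
import numpy as np

def toy_helix(n_res=20):
    """Hypothetical test structure: ideal alpha-helix-like C-alpha trace."""
    t = np.deg2rad(100.0) * np.arange(n_res)
    return np.column_stack([2.3 * np.cos(t), 2.3 * np.sin(t), 1.5 * np.arange(n_res)])

def build_hessian(coords, ss_labels, cutoff=10.0,
                  k_bond=10.0, k_ss=3.0, k_other=1.0):
    """ENM Hessian with three spring constants (all values are placeholders).

    k_bond  : consecutive C-alpha pairs (i, i+1)
    k_ss    : pairs sharing a secondary-structure label ("-" means none)
    k_other : every other pair within the cutoff
    """
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue
            if j == i + 1:
                k = k_bond
            elif ss_labels[i] == ss_labels[j] and ss_labels[i] != "-":
                k = k_ss
            else:
                k = k_other
            block = -k * np.outer(d, d) / r2          # off-diagonal 3x3 block
            hess[3*i:3*i+3, 3*j:3*j+3] = block
            hess[3*j:3*j+3, 3*i:3*i+3] = block
            hess[3*i:3*i+3, 3*i:3*i+3] -= block       # diagonal blocks accumulate
            hess[3*j:3*j+3, 3*j:3*j+3] -= block
    return hess

def low_frequency_modes(hess, n_modes=5):
    """Return the n_modes lowest nonzero modes, skipping the six rigid-body modes."""
    vals, vecs = np.linalg.eigh(hess)                 # eigenvalues in ascending order
    return vals[6:6 + n_modes], vecs[:, 6:6 + n_modes]

def perturb(coords, modes, amplitudes):
    """Displace the backbone along a linear combination of normal modes."""
    disp = modes @ np.asarray(amplitudes, dtype=float)
    return coords + disp.reshape(-1, 3)

if __name__ == "__main__":
    ca = toy_helix()
    labels = ["H1"] * len(ca)                         # one helix element
    vals, modes = low_frequency_modes(build_hessian(ca, labels))
    new_ca = perturb(ca, modes, [2.0, 0.0, 0.0, 0.0, 0.0])
    print(np.sqrt(((new_ca - ca) ** 2).sum(axis=1).mean()))  # RMSD of the move
```

Each perturbed Cα trace would then be passed to a side-chain builder such as SCWRL to obtain an all-atom model. In this sketch a stiff `k_bond` is what keeps consecutive Cα-Cα distances close to their starting values for small displacements; large amplitudes along a linear mode will still distort bonded geometry, so the amplitudes here are illustrative only.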