Opron Kristopher, Xia Kelin, Wei Guo-Wei
Department of Biochemistry and Molecular Biology, Michigan State University, Michigan 48824, USA.
Department of Mathematics, Michigan State University, Michigan 48824, USA.
J Chem Phys. 2014 Jun 21;140(23):234105. doi: 10.1063/1.4882258.
Protein structural fluctuation, typically measured by Debye-Waller factors, or B-factors, is a manifestation of protein flexibility, which strongly correlates with protein function. The flexibility-rigidity index (FRI) is a newly proposed method for the construction of atomic rigidity functions required in the theory of continuum elasticity with atomic rigidity, which is a new multiscale formalism for describing excessively large biomolecular systems. The FRI method analyzes protein rigidity and flexibility and is capable of predicting protein B-factors without resorting to matrix diagonalization. A fundamental assumption used in the FRI is that protein structures are uniquely determined by various internal and external interactions, while the protein functions, such as stability and flexibility, are solely determined by the structure. As such, one can predict protein flexibility without resorting to the protein interaction Hamiltonian. Because it bypasses matrix diagonalization, the original FRI has a computational complexity of O(N^2). This work introduces a fast FRI (fFRI) algorithm for the flexibility analysis of large macromolecules. The proposed fFRI further reduces the computational complexity to O(N). Additionally, we propose anisotropic FRI (aFRI) algorithms for the analysis of protein collective dynamics. The aFRI algorithms permit adaptive Hessian matrices, from a completely global 3N × 3N matrix to completely local 3 × 3 matrices. These 3 × 3 matrices, despite being calculated locally, also contain non-local correlation information. Eigenvectors obtained from the proposed aFRI algorithms are able to demonstrate collective motions. Moreover, we investigate the performance of FRI by employing four families of radial basis correlation functions. Both parameter-optimized and parameter-free FRI methods are explored.
Furthermore, we compare the accuracy and efficiency of FRI with some established approaches to flexibility analysis, namely, normal mode analysis and the Gaussian network model (GNM). The accuracy of the FRI method is tested using four sets of proteins: three sets of relatively small-, medium-, and large-sized structures, and an extended set of 365 proteins. A fifth set of proteins is used to compare the efficiency of the FRI, fFRI, aFRI, and GNM methods. Intensive validation and comparison indicate that the FRI, particularly the fFRI, is orders of magnitude more efficient and about 10% more accurate overall than some of the most popular methods in the field. The proposed fFRI is able to predict B-factors for α-carbons of the HIV capsid (313 236 residues) in less than 30 seconds on a single processor using only one core. Finally, we demonstrate the application of FRI and aFRI to protein domain analysis.
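The abstract does not give formulas, so the following Python sketch only illustrates the scaling argument: a rigidity index computed as a sum of a radial correlation function over all atom pairs costs O(N^2), while truncating the function at a cutoff and binning atoms into a cell list brings the cost to roughly O(N) at bounded atomic density. The Lorentz-type kernel, its parameters `eta` and `upsilon`, the 12 Å cutoff, the unit atomic weights, the linear fit to experimental B-factors, and every function name here are assumptions made for illustration, not the authors' implementation.

```python
import itertools
import numpy as np

def lorentz_kernel(r, eta=3.0, upsilon=3.0):
    # Assumed Lorentz-type radial correlation function; parameters are illustrative.
    return 1.0 / (1.0 + (r / eta) ** upsilon)

def pair_distances(a, b):
    # Euclidean distances between every point in a and every point in b.
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def rigidity_direct(coords, cutoff=None, kernel=lorentz_kernel):
    # O(N^2): sum the kernel over all pairs, skipping self-pairs.
    r = pair_distances(coords, coords)
    keep = ~np.eye(len(coords), dtype=bool)
    if cutoff is not None:
        keep &= r <= cutoff
    return (kernel(r) * keep).sum(axis=1)

def rigidity_cell_list(coords, cutoff=12.0, kernel=lorentz_kernel):
    # Roughly O(N) at bounded density: each atom only visits its own
    # cubic cell (edge length = cutoff) and the 26 adjacent cells.
    cells = {}
    keys = np.floor(coords / cutoff).astype(int).tolist()
    for idx, key in enumerate(keys):
        cells.setdefault(tuple(key), []).append(idx)
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    mu = np.zeros(len(coords))
    for (cx, cy, cz), members in cells.items():
        nearby = []
        for dx, dy, dz in offsets:
            nearby.extend(cells.get((cx + dx, cy + dy, cz + dz), []))
        mem = np.array(members)
        nb = np.array(nearby)
        r = pair_distances(coords[mem], coords[nb])
        # Drop self-pairs by index and pairs beyond the cutoff.
        keep = (mem[:, None] != nb[None, :]) & (r <= cutoff)
        mu[mem] = (kernel(r) * keep).sum(axis=1)
    return mu

def flexibility_index(mu):
    # Flexibility taken as the reciprocal of rigidity (requires mu > 0).
    return 1.0 / mu

def fit_bfactors(flex, b_exp):
    # Least-squares line mapping flexibility indices to experimental B-factors.
    slope, intercept = np.polyfit(flex, b_exp, 1)
    return slope * flex + intercept
```

Given an `(N, 3)` float array of α-carbon coordinates, `rigidity_cell_list(coords)` returns the same values as `rigidity_direct(coords, cutoff=12.0)`. The truncated sum only approximates the full sum, and the quality of that approximation depends on how quickly the chosen kernel decays.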