INSERM UMR-S 665, Dynamique des Structures et Interactions des Macromolécules Biologiques (DSIMB), University Paris-Diderot, Institut National de Transfusion Sanguine, INTS, 6, rue Alexandre Cabanel, 75739 Paris cedex 15, France.
Proteins. 2011 Mar;79(3):839-52. doi: 10.1002/prot.22922. Epub 2010 Dec 6.
Protein structures are valuable tools for understanding protein function. However, protein dynamics is also considered a key element in protein function, so fully understanding function at the molecular level requires accounting for flexibility in addition to structural analysis. Experimental techniques that provide both types of information simultaneously remain limited, and prediction approaches are useful alternatives for obtaining otherwise unavailable data. It has been shown that protein structure can be described by a limited set of recurring local structures. In this context, we previously established a library of 120 overlapping long structural prototypes (LSPs), each representing a fragment 11 residues in length, that together cover all known local protein structures. On the basis of the close sequence-structure relationship observed in LSPs, we developed a novel prediction method that proposes structural candidates in terms of LSPs along a given sequence. The prediction accuracy was high given the number of structural classes. In this study, we use this methodology to predict protein flexibility. We first examine flexibility according to two different descriptors: the B-factor and the root mean square fluctuations (RMSF) from molecular dynamics simulations. We then show the relevance of using both descriptors together. We define three flexibility classes and propose a method, based on the LSP prediction method, for predicting flexibility along the sequence. The prediction rate reaches 49.6%. This method is competitive with the most recent state-of-the-art methods, which learn directly from true flexibility data using sophisticated algorithms. Accordingly, flexibility information should be taken into account in structural prediction assessments.
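As an illustration of how two flexibility descriptors might be combined into three classes, the following minimal Python sketch standardises per-residue B-factors and RMSF values, averages them, and bins the result. The abstract does not give the actual normalisation or class boundaries, so the function names, the z-score averaging, and the thresholds `low` and `high` are all assumptions made for this example and should not be read as the authors' procedure.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardise a list of numbers to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: no residue deviates from the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def flexibility_classes(b_factors, rmsf, low=-0.5, high=0.5):
    """Label each residue 'rigid', 'intermediate' or 'flexible'.

    b_factors and rmsf are per-residue lists of equal length.
    The thresholds low and high are hypothetical, chosen only
    for illustration; they are not taken from the paper.
    """
    if len(b_factors) != len(rmsf):
        raise ValueError("descriptor lists must have the same length")

    # Average the two standardised descriptors residue by residue.
    combined = [
        (b + r) / 2
        for b, r in zip(zscores(b_factors), zscores(rmsf))
    ]

    labels = []
    for score in combined:
        if score < low:
            labels.append("rigid")
        elif score > high:
            labels.append("flexible")
        else:
            labels.append("intermediate")
    return labels


if __name__ == "__main__":
    # Toy values, not experimental data.
    toy_b = [10, 10, 20, 30, 30]
    toy_rmsf = [0.5, 0.5, 1.0, 1.5, 1.5]
    print(flexibility_classes(toy_b, toy_rmsf))
```

Standardising each descriptor before averaging keeps either one from dominating purely because of its units (Å² for B-factors, Å for RMSF); other combination rules would be equally plausible.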