Garbuzynskiy Sergiy O, Lobanov Michail Yu, Galzitskaya Oxana V
Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia.
Protein Sci. 2004 Nov;13(11):2871-7. doi: 10.1110/ps.04881304.
The lack of ordered structure in "natively unfolded" proteins raises a general question: Are there intrinsic properties of amino acid residues that are responsible for the absence of fixed structure under physiological conditions? In this article, we demonstrate that whether a protein is folded or unfolded may be determined by the capacity of its amino acid residues to form a sufficient number of contacts in a globular state. The expected average number of contacts per residue, calculated from the amino acid sequence alone (using the average number of contacts for each of the 20 amino acid residues in globular proteins), can be used as one of the simple indicators of natively unfolded proteins. The prediction accuracy for the sets of 80 folded and 90 natively unfolded proteins reaches 89% if the expected average number of contacts is used as a parameter and 83% if hydrophobicity is used instead. An optimal set of artificial parameters for the 20 amino acid residues, obtained by a Monte Carlo algorithm to maximally separate the sets of 90 natively unfolded and 80 folded proteins, demonstrates the upper limit for prediction accuracy, which is 95%.
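The sequence-based indicator described in the abstract can be illustrated with a minimal Python sketch. It assumes that the expected average number of contacts is the plain mean of per-residue scale values over the sequence, and that a protein is called natively unfolded when this mean falls below a cutoff. The function names, `TOY_SCALE`, and `TOY_THRESHOLD` are hypothetical; the numbers are placeholders chosen for easy arithmetic and are not the contact values or the threshold from the article.

```python
from typing import Mapping


def expected_contacts(sequence: str, scale: Mapping[str, float]) -> float:
    """Return the mean per-residue scale value over the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    # A residue missing from the scale raises KeyError rather than being guessed.
    return sum(scale[residue] for residue in sequence.upper()) / len(sequence)


def is_predicted_unfolded(
    sequence: str, scale: Mapping[str, float], threshold: float
) -> bool:
    """Assumed rule: fewer expected contacts than the cutoff means 'unfolded'."""
    return expected_contacts(sequence, scale) < threshold


# Placeholder values for illustration only; NOT real contact numbers.
TOY_SCALE = {"L": 3.0, "A": 2.0, "K": 1.0}
# Placeholder cutoff; a real cutoff would be fitted on labelled protein sets.
TOY_THRESHOLD = 2.0

if __name__ == "__main__":
    for seq in ("LLAK", "KKAK"):
        mean = expected_contacts(seq, TOY_SCALE)
        label = (
            "unfolded"
            if is_predicted_unfolded(seq, TOY_SCALE, TOY_THRESHOLD)
            else "folded"
        )
        print(seq, mean, label)
```

To apply the sketch to real sequences, `TOY_SCALE` would need to be replaced with a complete 20-residue table of average contact numbers measured in globular proteins, and the cutoff chosen on a labelled set such as the 80 folded and 90 natively unfolded proteins mentioned above.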