Berger B
Mathematics Department, Massachusetts Institute of Technology, Cambridge 02139, USA.
J Comput Biol. 1995 Spring;2(1):125-38. doi: 10.1089/cmb.1995.2.125.
The identification of protein sequences that fold into certain known three-dimensional (3D) structures, or motifs, is evaluated through a probabilistic analysis of their one-dimensional (1D) sequences. We present a correlation method that runs in linear time and incorporates pairwise dependencies between amino acid residues at multiple distances to assess the conditional probability that a given residue is part of a given 3D structure. This method is generalized to multiple motifs, where a dynamic programming approach leads to an efficient algorithm that runs in linear time for practical problems. By this approach, we were able to distinguish (2-stranded) coiled-coil from non-coiled-coil domains and globins from nonglobins. When tested on the Brookhaven X-ray crystal structure database, the method does not produce any false-positive or false-negative predictions of coiled coils.
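As an illustration only, the idea of scoring each residue from pairwise dependencies at several distances can be sketched as follows. This is not the published algorithm: the distance set `DISTANCES`, the pseudocount smoothing, the log-odds form of the score, and the function names `pair_counts`, `build_table`, and `residue_scores` are all assumptions made for this sketch, and the toy sequences in the usage note are invented. The sketch does not cover the multiple-motif dynamic programming step. Because the number of distances is fixed, scoring takes time linear in the sequence length.

```python
import math
from collections import Counter

AMINO = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
DISTANCES = (1, 2, 4)           # hypothetical choice of pairwise distances


def pair_counts(seqs, distances=DISTANCES):
    """Count residue pairs (a, b) observed d positions apart, for each d."""
    counts = {d: Counter() for d in distances}
    for s in seqs:
        for d in distances:
            for i in range(len(s) - d):
                counts[d][(s[i], s[i + d])] += 1
    return counts


def build_table(motif_seqs, background_seqs, distances=DISTANCES, pseudo=1.0):
    """Smoothed log-odds of each (distance, a, b) pair: motif vs. background."""
    pos = pair_counts(motif_seqs, distances)
    bg = pair_counts(background_seqs, distances)
    n_pairs = len(AMINO) ** 2
    table = {}
    for d in distances:
        pos_total = sum(pos[d].values()) + pseudo * n_pairs
        bg_total = sum(bg[d].values()) + pseudo * n_pairs
        for a in AMINO:
            for b in AMINO:
                p = (pos[d][(a, b)] + pseudo) / pos_total
                q = (bg[d][(a, b)] + pseudo) / bg_total
                table[(d, a, b)] = math.log(p / q)
    return table


def residue_scores(seq, table, distances=DISTANCES):
    """Per-residue score: sum of pair log-odds over all pairs it takes part in.

    Runs in O(len(seq) * len(distances)), i.e. linear in the sequence length.
    """
    scores = [0.0] * len(seq)
    for i in range(len(seq)):
        for d in distances:
            j = i + d
            if j < len(seq):
                s = table.get((d, seq[i], seq[j]), 0.0)  # unknown residues score 0
                scores[i] += s
                scores[j] += s
    return scores


# Toy usage with invented sequences (not real training data).
motif_like = ["LKELEALKELEALKELEA"]
background_like = ["GPSGGPSGPGSSGPGSPG"]
table = build_table(motif_like, background_like)
motif_scores = residue_scores("LKELEALKELEA", table)
background_scores = residue_scores("GPSGGPSGPGSS", table)
```

In this toy setting, residues of the motif-like query receive positive scores and residues of the background-like query receive negative ones. A real application would need training sets drawn from known structures and a calibrated threshold, neither of which is specified here.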