Gunma University, Kiryu, Japan.
Adv Exp Med Biol. 2018;1105:123-152. doi: 10.1007/978-981-13-2200-6_9.
Prediction of three-dimensional residue-residue contacts from evolutionary information in protein sequences was first attempted in the early 1990s. However, the contact prediction accuracies of methods evaluated in CASP experiments before CASP11 remained quite low, typically with <20% true positives. Recently, contact prediction has improved to the point that an accurate three-dimensional model of a large protein can be generated on the basis of predicted contacts. This improvement was attained by disentangling direct from indirect correlations in amino acid covariations or cosubstitutions between sites in protein evolution. Here, we review statistical methods for extracting causative correlations and various approaches to describing protein structures, complexes, and flexibility based on predicted contacts.
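The idea of separating direct from indirect correlations can be illustrated with a toy example. The sketch below is not one of the reviewed methods and does not use sequence data. It assumes three continuous Gaussian variables coupled in a chain (x1 to x2 to x3), so that x1 and x3 are correlated only through x2. The function name `partial_correlations` and the coupling strength 0.8 are illustrative choices. Inverting the covariance matrix gives partial correlations, which stay large for the directly coupled pairs and fall close to zero for the indirectly coupled pair.

```python
import numpy as np

def partial_correlations(samples):
    """Return (correlation, partial correlation) matrices for samples of shape (n, d)."""
    cov = np.cov(samples, rowvar=False)
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)

    # The inverse covariance (precision) matrix encodes pairwise dependence
    # after conditioning on all the other variables.
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return corr, partial

# Toy chain (an assumption for illustration): x1 influences x2, x2 influences x3.
rng = np.random.default_rng(0)
n = 20000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
x3 = 0.8 * x2 + rng.normal(size=n)
samples = np.column_stack([x1, x2, x3])

corr, partial = partial_correlations(samples)
print("correlation x1-x3:        ", round(corr[0, 2], 3))     # clearly nonzero (indirect)
print("partial correlation x1-x3:", round(partial[0, 2], 3))  # close to zero
print("partial correlation x1-x2:", round(partial[0, 1], 3))  # stays large (direct)
```

Real alignments consist of categorical amino acid states with phylogenetic and sampling biases, so methods for actual contact prediction need more than this Gaussian picture. The sketch only shows why a plain pairwise correlation can suggest a link between sites that are not directly coupled.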