Mosimann S, Meleshko R, James M N
Medical Research Council of Canada, Department of Biochemistry, University of Alberta, Edmonton, Canada.
Proteins. 1995 Nov;23(3):301-17. doi: 10.1002/prot.340230305.
Despite the tremendous increase in the rate at which protein structures are being determined, there is still an enormous gap between the numbers of known DNA-derived sequences and the numbers of three-dimensional structures. To shed light on the biological functions of the molecules, researchers often resort to comparative molecular modeling. Earlier work has shown that when the sequence alignment is in error, the comparative model is guaranteed to be wrong. In addition, loops, the sites of insertions and deletions in families of homologous proteins, are exceedingly difficult to model. Thus, many of the current problems in comparative molecular modeling are minor versions of the global protein folding problem. To assess objectively the current state of comparative molecular modeling, 13 groups submitted blind predictions of seven different proteins of undisclosed tertiary structure. This assessment shows that where sequence identity between the target and the template structure is high (>70%), comparative molecular modeling is highly successful. On the other hand, automated modeling techniques and sophisticated energy minimization methods fail to improve upon the starting structures when the sequence identity is low (approximately 30%). Based on these results, it appears that insertions and deletions are still major problems. Successfully deducing the correct sequence alignment when the local similarity is low is still difficult. We suggest some minimal testing of submitted coordinates that should be required of authors before papers on comparative molecular modeling are accepted for publication in journals.
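The abstract separates its results by the sequence identity between target and template (>70% versus approximately 30%) and names insertions and deletions as the main remaining difficulty. As an illustration only, the sketch below shows one common way to compute both quantities from a pairwise alignment that has already been made. The function names, the `-` gap character, and the choice to divide by the number of columns in which both sequences have a residue are assumptions of this sketch; the abstract does not state which definition of identity the assessment used, and the example sequences are made up.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over columns where both sequences have a residue.

    Assumption: gap columns ('-') are excluded from the denominator.
    Other conventions (e.g. dividing by the shorter sequence length)
    give different numbers for the same alignment.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == "-" or b == "-":
            continue  # insertion or deletion: not a residue-residue pair
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


def count_gap_columns(aligned_target: str, aligned_template: str) -> int:
    """Number of alignment columns that contain a gap in either sequence."""
    return sum(
        1
        for a, b in zip(aligned_target, aligned_template)
        if a == "-" or b == "-"
    )


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from the assessed proteins.
    target = "MKT-AYIAKQR"
    template = "MKTLAYLAK-R"
    print(f"identity: {percent_identity(target, template):.1f}%")
    print(f"gap columns: {count_gap_columns(target, template)}")
```

A high identity computed this way says nothing about whether the alignment itself is correct; as the abstract notes, an alignment error makes the resulting model wrong regardless of the score.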