Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA.
Proteins. 2012 Jul;80(7):1715-35. doi: 10.1002/prot.24065. Epub 2012 Apr 13.
Ab initio protein folding is one of the major unsolved problems in computational biology, owing to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1-20 residues, and multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from the fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced, and their individual contributions to the efficiency of both the force field and the search engine are analyzed in detail. The QUARK prediction procedure is described and tested on the structure modeling of 145 nonhomologous proteins. Although no global templates are used and all fragments from experimental structures with a template modeling score (TM-score) >0.5 are excluded, QUARK successfully constructs 3D models with the correct fold in one-third of the cases for short proteins of up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction (CASP9) experiment, the QUARK server outperformed the second- and third-best servers by 18% and 47%, respectively, based on the cumulative Z-score of global distance test-total score (GDT-TS) values in the free modeling (FM) category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress toward the solution of the most important problem in the field.
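The replica-exchange Monte Carlo search named in the abstract can be illustrated with a minimal sketch. This is not QUARK's implementation: the one-dimensional coordinate, the double-well `toy_energy` function, the temperature ladder, and all function names are assumptions chosen for illustration, standing in for the protein conformation, the knowledge-based force field, and the fragment-based movements. The sketch shows only the general technique: each replica runs Metropolis moves at its own temperature, and neighbouring replicas periodically attempt to swap conformations.

```python
import math
import random


def toy_energy(x):
    # Hypothetical double-well energy with minima at x = -1 and x = +1.
    # It stands in for a real force field purely for illustration.
    return (x * x - 1.0) ** 2


def metropolis_accept(delta, temperature, rng):
    # Always accept downhill moves; accept uphill moves with
    # Boltzmann probability exp(-delta / T).
    return delta <= 0.0 or rng.random() < math.exp(-delta / temperature)


def replica_exchange_mc(energy, temperatures, n_sweeps, step, seed=0):
    rng = random.Random(seed)
    # One conformation (here a single float) per temperature.
    states = [rng.uniform(-2.0, 2.0) for _ in temperatures]
    energies = [energy(x) for x in states]
    best_state, best_energy = min(zip(states, energies), key=lambda p: p[1])

    for sweep in range(n_sweeps):
        # 1) Local Metropolis move in every replica at its own temperature.
        for i, temp in enumerate(temperatures):
            trial = states[i] + rng.uniform(-step, step)
            trial_energy = energy(trial)
            if metropolis_accept(trial_energy - energies[i], temp, rng):
                states[i], energies[i] = trial, trial_energy
                if trial_energy < best_energy:
                    best_state, best_energy = trial, trial_energy

        # 2) Swap attempts between neighbouring temperatures, alternating
        #    between even and odd pairs on successive sweeps.
        for i in range(sweep % 2, len(temperatures) - 1, 2):
            beta_i = 1.0 / temperatures[i]
            beta_j = 1.0 / temperatures[i + 1]
            # Swap is accepted with probability
            # min(1, exp((beta_i - beta_j) * (E_i - E_j))).
            delta = (beta_i - beta_j) * (energies[i + 1] - energies[i])
            if metropolis_accept(delta, 1.0, rng):
                states[i], states[i + 1] = states[i + 1], states[i]
                energies[i], energies[i + 1] = energies[i + 1], energies[i]

    return best_state, best_energy


if __name__ == "__main__":
    ladder = [0.05, 0.2, 0.8, 3.0]  # assumed temperature ladder
    x, e = replica_exchange_mc(toy_energy, ladder, n_sweeps=2000, step=0.5)
    print(f"lowest-energy state: x={x:.3f}, E={e:.5f}")
```

High-temperature replicas cross energy barriers easily, while low-temperature replicas refine the conformations that the swaps pass down to them. This is the usual motivation for replica exchange over a single-temperature simulation.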