Department of Chemistry, Vanderbilt University, Nashville, Tennessee.
Center for Structural Biology, Vanderbilt University, Nashville, Tennessee.
Proteins. 2019 Dec;87(12):1341-1350. doi: 10.1002/prot.25769. Epub 2019 Jul 18.
Computational methods that produce accurate protein structure models from limited experimental data, for example, from nuclear magnetic resonance (NMR) spectroscopy, hold great potential for biomedical research. The NMR-assisted modeling challenge in CASP13 provided a blind test of the capabilities and limitations of current modeling techniques in leveraging sparse, ambiguous, and error-prone NMR data for protein structure prediction. We describe our approach to predicting the structures of these proteins with the Rosetta software suite. Protein structure models were predicted de novo using a two-stage protocol. First, low-resolution models were generated with the Rosetta de novo method, guided by nonambiguous nuclear Overhauser effect (NOE) contacts and residual dipolar coupling (RDC) restraints. Second, iterative model hybridization and fragment insertion with the Rosetta comparative modeling method were used to refine and regularize the models, guided by all ambiguous and nonambiguous NOE contacts and RDCs. Nine of the 16 Rosetta de novo models had the correct fold (global distance test total score > 45), and in three cases high-resolution models were achieved (root-mean-square deviation < 3.5 Å). We also show that a meta-approach, which applies iterative Rosetta + NMR refinement to server-predicted models built with non-NMR contacts and structural templates, leads to substantial improvement in model quality. Integrating these data-assisted refinement strategies with innovative non-data-assisted approaches that became possible in CASP13, such as high-precision contact prediction, will in the near future enable structure determination for large proteins that are outside the realm of conventional NMR.
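The two quality criteria quoted above (global distance test total score > 45 for a correct fold, root-mean-square deviation < 3.5 Å for a high-resolution model) can be illustrated with a minimal sketch. It is not the CASP or Rosetta evaluation code. It assumes that the model and reference C-alpha coordinates are already matched residue by residue and already superimposed, whereas the standard GDT_TS calculation searches over many superpositions. The function names and the `classify` helper are hypothetical, and whether the reported RMSD was computed over C-alpha atoms is also an assumption.

```python
import math

def ca_distances(model, reference):
    """Per-residue distances (in Å) between matched, pre-superimposed C-alpha atoms."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    return [math.dist(a, b) for a, b in zip(model, reference)]

def rmsd(model, reference):
    """Root-mean-square deviation over the matched atoms (no superposition is done here)."""
    d = ca_distances(model, reference)
    return math.sqrt(sum(x * x for x in d) / len(d))

def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS: mean percentage of residues within 1, 2, 4, and 8 Å.

    Assumption: a single fixed superposition is used, so this value can only
    underestimate the score from a full superposition search.
    """
    d = ca_distances(model, reference)
    fractions = [sum(x <= c for x in d) / len(d) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

def classify(model, reference, gdt_cutoff=45.0, rmsd_cutoff=3.5):
    """Apply the two thresholds quoted in the abstract (hypothetical helper)."""
    return {
        "correct_fold": gdt_ts(model, reference) > gdt_cutoff,
        "high_resolution": rmsd(model, reference) < rmsd_cutoff,
    }

# Toy coordinates (made up for illustration): four residues displaced by
# 0.5, 1.5, 3.0, and 10.0 Å from the reference.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

print(gdt_ts(model, reference))    # 56.25
print(classify(model, reference))  # {'correct_fold': True, 'high_resolution': False}
```

In the toy case a single badly placed residue raises the RMSD to about 5.3 Å while the GDT_TS stays above 45, which shows why a model can have the correct fold without meeting the high-resolution criterion.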