Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, USA.
Department of Chemistry and Biochemistry, University of California, Merced, California, USA.
Proteins. 2021 Dec;89(12):1959-1976. doi: 10.1002/prot.26246. Epub 2021 Oct 19.
NMR studies can provide unique information about protein conformations in solution. In CASP14, three reference structures provided by solution NMR methods were available (T1027, T1029, and T1055), as well as a fourth data set of NMR-derived contacts for an integral membrane protein (T1088). For the three targets with NMR-based structures, the best prediction results ranged from very good (GDT_TS = 0.90, for T1055) to poor (GDT_TS = 0.47, for T1029). We explored the basis of these results by comparing all CASP14 prediction models against experimental NMR data. For T1027, NMR data reveal extensive internal dynamics, presenting a unique challenge for protein structure prediction methods. The analysis of T1029 motivated exploration of a novel method of "inverse structure determination," in which an AlphaFold2 model was used to guide NMR data analysis. NMR data provided to CASP predictor groups for target T1088, a 238-residue integral membrane porin, were also used to assess several NMR-assisted prediction methods. Most groups involved in this exercise generated similar beta-barrel models that agreed well with the experimental data. However, as was also observed in CASP13, some pure prediction groups that did not use any NMR data generated models for T1088 that fit the NMR data better than the models generated using these experimental data. These results demonstrate the remarkable power of modern methods to predict protein structures with accuracies rivaling those of solution NMR structures, and show that it is now possible to reliably use prediction models to guide and complement experimental NMR data analysis.
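The GDT_TS values quoted above are reported on a 0 to 1 scale. As a rough illustration of what the score measures, the following Python sketch assumes the conventional definition: the average, over distance cutoffs of 1, 2, 4, and 8 Å, of the fraction of C-alpha atoms in the model that lie within the cutoff of the corresponding atom in the reference structure. It also assumes the model has already been superimposed on the reference. Assessment software normally searches over many superpositions to maximize each fraction, so a single fixed superposition gives only a lower-bound estimate. The function name `gdt_ts` and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math


def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0-1 scale) for pre-superimposed C-alpha coordinates.

    Assumption: `model` and `reference` are equal-length sequences of
    (x, y, z) tuples in angstroms, already in the same coordinate frame.
    No superposition search is performed here.
    """
    if len(model) != len(reference) or len(reference) == 0:
        raise ValueError("model and reference must be non-empty and equal length")

    # Per-residue distance between equivalent C-alpha atoms.
    distances = [math.dist(m, r) for m, r in zip(model, reference)]

    # Fraction of residues within each cutoff, then the mean over cutoffs.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return sum(fractions) / len(cutoffs)


# Toy example (hypothetical coordinates): four residues displaced by
# 0.5, 1.5, 3.0, and 10.0 angstroms from their reference positions.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
score = gdt_ts(model, reference)
print(f"GDT_TS = {score:.4f}")
```

In this toy case the fractions within 1, 2, 4, and 8 Å are 0.25, 0.50, 0.75, and 0.75, so the score is 0.5625. A model identical to the reference scores 1.0.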