Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
J Magn Reson. 2023 Jul;352:107481. doi: 10.1016/j.jmr.2023.107481. Epub 2023 May 20.
Recent advances in molecular modeling of protein structures are changing the field of structural biology. AlphaFold-2 (AF2), an AI system developed by DeepMind, Inc., utilizes attention-based deep learning to predict models of protein structures with high accuracy relative to structures determined by X-ray crystallography and cryo-electron microscopy (cryoEM). Comparisons of AF2 models with structures determined using solution NMR data have revealed both high similarities and distinct differences. Since AF2 was trained on X-ray crystal and cryoEM structures, we assessed how accurately AF2 can model small, monomeric, solution protein NMR structures which (i) were not used in the AF2 training data set, and (ii) did not have homologous structures in the Protein Data Bank at the time of AF2 training. We identified nine open-source protein NMR data sets for such "blind" targets, including chemical shifts, raw NMR FID data, NOESY peak lists, and (in one case) N-H residual dipolar coupling data. For these nine small (70-108 residues) monomeric proteins, we generated AF2 prediction models and assessed how well these models fit the experimental NMR data, using several well-established NMR structure validation tools. In most of these cases, the AF2 models fit the NMR data nearly as well as, or sometimes better than, the corresponding NMR structure models previously deposited in the Protein Data Bank. These results provide benchmark NMR data for assessing new NMR data analysis and protein structure prediction methods. They also document the potential for using AF2 as a guiding tool in protein NMR data analysis, and more generally for hypothesis generation in structural biology research.