Computational Structural Genomics Unit, Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI, United States of America.
Clinical and Translational Sciences Institute, Medical College of Wisconsin, Milwaukee, WI, United States of America.
PLoS One. 2024 Nov 26;19(11):e0313308. doi: 10.1371/journal.pone.0313308. eCollection 2024.
Artificial Intelligence (AI)-based deep learning methods for predicting protein structures are reshaping knowledge development and scientific discovery. Recent large-scale application of AI models for protein structure prediction has changed perceptions about complicated biological problems and empowered a new generation of structure-based hypothesis testing. It is well recognized that proteins have a modular organization according to archetypal folds. However, it is yet to be determined whether predicted structures are tuned to one conformation of flexible proteins or whether they represent average conformations. It is also unknown whether the answer depends on the protein fold. Therefore, in this study, we analyzed 2878 proteins with at least ten distinct experimental structures available, from which we can estimate protein topological rigidity versus heterogeneity from experimental measurements. We found that AlphaFold v2 (AF2) predictions consistently return one specific form to high accuracy, with 99.68% of distinct folds (n = 623 out of 628) having an experimental structure within 2.5 Å RMSD of a predicted structure. Yet, 27.70% and 10.82% of folds (174 and 68 out of 628 folds) have at least one experimental structure over 2.5 Å and 5 Å RMSD, respectively, from their AI-predicted structure. This information is important for how researchers apply and interpret the output of AF2 and similar tools. Additionally, it enabled us to score fold types according to how homogeneous versus heterogeneous their conformations are. Importantly, folds with high heterogeneity are enriched among proteins that regulate vital biological processes including immune cell differentiation, immune activation, and metabolism. This result demonstrates that a large amount of protein fold flexibility has already been experimentally measured, is vital for critical cellular processes, and is currently unaccounted for in structure prediction databases.
Therefore, the structure-prediction revolution begets the protein dynamics revolution!
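The per-fold counts reported above can be illustrated with a minimal sketch. It assumes that the RMSD between each experimental structure and the predicted structure has already been computed, and that a fold is counted as "matched" when its smallest RMSD is within the cutoff and as "heterogeneous" when its largest RMSD exceeds it. The function name `summarize_folds`, the fold labels, and the RMSD values are hypothetical toy inputs, not data or code from the study, and the study's actual pipeline may differ.

```python
from typing import Dict, List


def summarize_folds(
    rmsd_by_fold: Dict[str, List[float]],
    close: float = 2.5,  # cutoff in angstroms, taken from the abstract
    far: float = 5.0,    # cutoff in angstroms, taken from the abstract
) -> Dict[str, float]:
    """Return the percentage of folds meeting each RMSD criterion.

    rmsd_by_fold maps a fold label to the RMSD values (in angstroms)
    between each experimental structure and the predicted structure.
    """
    n_folds = len(rmsd_by_fold)
    if n_folds == 0:
        raise ValueError("rmsd_by_fold must contain at least one fold")
    if any(len(values) == 0 for values in rmsd_by_fold.values()):
        raise ValueError("every fold needs at least one RMSD value")

    # A fold is matched if at least one experimental structure is close.
    matched = sum(1 for v in rmsd_by_fold.values() if min(v) <= close)
    # A fold is heterogeneous if at least one structure is beyond a cutoff.
    over_close = sum(1 for v in rmsd_by_fold.values() if max(v) > close)
    over_far = sum(1 for v in rmsd_by_fold.values() if max(v) > far)

    return {
        "pct_with_structure_within_close": 100.0 * matched / n_folds,
        "pct_with_structure_over_close": 100.0 * over_close / n_folds,
        "pct_with_structure_over_far": 100.0 * over_far / n_folds,
    }


# Hypothetical toy input: four folds with made-up RMSD values.
toy_rmsds = {
    "fold_a": [0.8, 1.1, 1.4],  # rigid: all structures near the prediction
    "fold_b": [1.0, 3.2],       # one structure beyond 2.5 angstroms
    "fold_c": [0.9, 6.5],       # one structure beyond 5 angstroms
    "fold_d": [2.9, 7.1],       # no structure within 2.5 angstroms
}
summary = summarize_folds(toy_rmsds)
```

Under these assumptions, a fold can be both matched and heterogeneous, which is the situation the abstract describes: the prediction sits near one experimental conformation while other experimentally observed conformations lie far from it.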