College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA.
Department of Computer Science, Bowling Green State University, Bowling Green, OH 43403, USA.
Int J Mol Sci. 2024 Aug 1;25(15):8426. doi: 10.3390/ijms25158426.
Protein structure prediction is important for understanding protein function and behavior. This review surveys the computational models used to predict protein structure, covering the progression from established protein modeling to state-of-the-art artificial intelligence (AI) frameworks. The paper starts with a brief introduction to protein structures, protein modeling, and AI. The section on established protein modeling discusses homology modeling, ab initio modeling, and threading. The next section covers deep learning-based models. It introduces state-of-the-art AI models such as the AlphaFold family (AlphaFold, AlphaFold2, AlphaFold3), RoseTTAFold, and ProteinBERT, and discusses how AI techniques have been integrated into established frameworks such as Swiss-Model, Rosetta, and I-TASSER. Model performance is compared using the rankings of CASP14 (Critical Assessment of Structure Prediction) and CASP15. CASP16 is ongoing, and its results are not included in this review. Continuous Automated Model EvaluatiOn (CAMEO) complements the biennial CASP experiment. The template modeling score (TM-score), the global distance test total score (GDT_TS), and the Local Distance Difference Test (lDDT) score are also discussed. The paper then acknowledges the ongoing difficulties in predicting protein structure and emphasizes the need for further research in areas such as dynamic protein behavior, conformational changes, and protein-protein interactions. The application section introduces applications in fields such as drug design, industry, education, and novel protein development. In summary, this paper provides a comprehensive overview of the latest advancements in established protein modeling and deep learning-based models for protein structure prediction. It highlights the significant progress achieved by AI and identifies potential areas for further investigation.
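As a rough illustration of two of the evaluation metrics named in the abstract, the sketch below computes a TM-score and a GDT_TS value from a list of per-residue C-alpha distances (in angstroms) between a model and its reference structure. It is a minimal sketch under stated assumptions, not the procedure used by CASP or by this review: it assumes the two structures have already been superimposed and their residues paired, whereas the standard tools search over many superpositions to maximize each score. The function names and the 0.5 Å floor on the distance scale `d0` are illustrative choices made here.

```python
def tm_score_d0(target_length):
    """Length-dependent distance scale d0 (angstroms) used by the TM-score.

    Assumption: very short targets fall back to a 0.5 angstrom floor so that
    the cube root is never taken of a negative number.
    """
    if target_length <= 15:
        return 0.5
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return max(d0, 0.5)


def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: C-alpha distances for the aligned residue pairs.
    target_length: number of residues in the reference structure.
    Unaligned residues contribute nothing, so they lower the score.
    """
    d0 = tm_score_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def gdt_ts(distances, target_length):
    """GDT_TS (0 to 100) for one fixed superposition.

    Averages the fraction of reference residues whose distance is within
    1, 2, 4, and 8 angstroms.
    """
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    fractions = [
        sum(1 for d in distances if d <= cutoff) / target_length
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    # Hypothetical distances for a five-residue toy target.
    toy_distances = [0.5, 1.5, 3.0, 6.0, 10.0]
    print("TM-score:", tm_score(toy_distances, target_length=5))
    print("GDT_TS:", gdt_ts(toy_distances, target_length=5))
```

Both scores are normalized by the reference length rather than by the number of aligned pairs, so a model that covers only part of the target is penalized for the missing residues.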