Protein Structure Prediction Center and Genome Center, University of California, Davis, Davis, CA 95616, USA
Pac Symp Biocomput. 2022;27:1-9.
The last few years mark dramatic improvements in modeling of protein structure. Progress was initially due to breakthroughs in residue-residue contact prediction, first with global statistical models and later with deep learning. These advancements were then followed by an even broader application of the deep learning techniques to the protein structure modeling itself, first using Convolutional Neural Networks (CNNs) and then switching to Natural Language Processing (NLP), including Attention models, and to Geometric Deep Learning (GDL). The accuracy of protein structure models generated with current state-of-the-art methods rivals that of experimental structures, while models themselves are used to solve structures or to make them more accurate.

Looking at the near future of machine learning applications in structural biology, we ask the following questions: Which specific problems should we expect to be solved next? Which new methods will prove to be the most effective? Which actions are likely to stimulate further progress the most? In addressing these questions, we invite the 2022 PSB attendees to actively participate in session discussions.

The AI-driven Advances in Modeling of Protein Structure session includes five papers specifically dedicated to:
Evaluating the significance of training data selection in machine learning.
Geometric pattern transferability, from protein self-interactions to protein-ligand interactions.
Supervised versus unsupervised sequence to contact learning, using attention models.
Side chain packing using SE(3) transformers.
Feature detection in electrostatic representations of ligand binding sites.