Durairaj Janani, de Ridder Dick, van Dijk Aalt D J
Biozentrum, University of Basel, Basel, Switzerland.
Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands.
Comput Struct Biotechnol J. 2022 Dec 29;21:630-643. doi: 10.1016/j.csbj.2022.12.039. eCollection 2023.
Recent breakthroughs in protein structure prediction mark the start of a new era in structural bioinformatics. Combined with advances in experimental structure determination and the steady pace at which new structures are published, they promise an age in which protein structure information is as ubiquitous as sequence information. Machine learning in protein bioinformatics has been dominated by sequence-based methods, but this is now changing as methods begin to use the deluge of rich structural information as input. Machine learning methods that make use of structures are scattered across the literature and cover a number of different applications and scopes: some address questions and tasks within a single protein family, while others aim to capture characteristics across all available proteins. In this review, we look at the variety of structure-based machine learning approaches, how structures can be used as input, and typical applications of these approaches in protein biology. We also discuss current challenges and opportunities in this important and increasingly popular field.