Department, University of Missouri, Columbia, MO 65211, USA.
IEEE Rev Biomed Eng. 2008;1:41-9. doi: 10.1109/RBME.2008.2008239.
Machine learning methods are widely used in bioinformatics and in computational and systems biology. Here, we review the development of machine learning methods for protein structure prediction, one of the most fundamental problems in structural biology and bioinformatics. Protein structure prediction is so complex that it is often decomposed and attacked at four levels: 1-D prediction of structural features along the primary sequence of amino acids; 2-D prediction of spatial relationships between amino acids; 3-D prediction of the tertiary structure of a protein; and 4-D prediction of the quaternary structure of a multiprotein complex. A diverse set of supervised and unsupervised machine learning methods has been applied over the years to tackle these problems and has contributed significantly to advancing the state of the art in protein structure prediction. In this paper, we review the development and application of hidden Markov models, neural networks, support vector machines, Bayesian methods, and clustering methods in 1-D, 2-D, 3-D, and 4-D protein structure prediction.
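To make the difference between the 1-D and 2-D levels concrete, the following minimal sketch shows what the prediction targets might look like for a toy protein. It is an illustration only, not a method from the review: the sequence, coordinates, and secondary structure labels are invented, the function name `contact_map` is hypothetical, and the 8 angstrom distance cutoff is an assumed convention rather than a value stated in the abstract.

```python
import math

# Hypothetical toy protein: 5 residues with made-up C-alpha coordinates (angstroms).
sequence = "ACDEF"
ca_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
]

# 1-D target: one structural label per residue along the sequence.
# Labels are invented for illustration (H = helix, E = strand, C = coil).
secondary_structure = "CCEEC"


def contact_map(coords, threshold=8.0):
    """2-D target: an n x n binary matrix of residue pairs closer than `threshold`.

    The 8.0 angstrom default is an assumption made for this sketch.
    """
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) < threshold else 0 for j in range(n)]
        for i in range(n)
    ]


cmap = contact_map(ca_coords)

# A 1-D predictor outputs a vector of length n; a 2-D predictor outputs an n x n matrix.
print(len(secondary_structure), "labels for", len(sequence), "residues")
for row in cmap:
    print(row)
```

In this framing, a 1-D method learns a mapping from the sequence to a per-residue label string, while a 2-D method learns a mapping from the sequence to a pairwise matrix such as `cmap`; 3-D and 4-D prediction then target full coordinates of a single chain and of a multiprotein complex, respectively.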