Department of Biochemical Engineering, University College London, Gower Street, WC1E 6BT London, U.K.
Department of Physics, University of Cagliari, Cittadella Universitaria, I-09042 Monserrato, Cagliari, Italy.
J Chem Inf Model. 2024 Apr 8;64(7):2681-2694. doi: 10.1021/acs.jcim.3c00999. Epub 2024 Feb 22.
Despite recent advances in computational protein science, the dynamic behavior of proteins, which directly governs their biological activity, cannot be gleaned from sequence information alone. To overcome this challenge, we propose a framework that integrates peptide sequence, protein structure, and protein dynamics descriptors into machine learning algorithms to improve the prediction of protein variant function. The resulting pipeline combines traditional sequence and structure information with molecular dynamics simulation data to predict the effects of multiple point mutations on the fold improvement in activity of bovine enterokinase variants. This study highlights how combining structural and dynamic data can provide predictive insights into protein function and address protein engineering challenges in industrial contexts.
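As a rough illustration of what integrating sequence, structure, and dynamics descriptors can look like, the sketch below concatenates three descriptor blocks into one feature vector per variant and predicts fold improvement with a nearest-neighbour average. It is not the pipeline used in the study: the `Variant` container, the amino-acid count encoding, the choice of descriptors (for example a solvent-accessibility change or a mean RMSF from a molecular dynamics run), and the k-nearest-neighbour regressor are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


@dataclass
class Variant:
    """Hypothetical container for one enzyme variant (not from the paper)."""
    mutations: List[str]          # point mutations as labels such as "A10G"
    structure_desc: List[float]   # assumed structure-derived descriptors
    dynamics_desc: List[float]    # assumed MD-derived descriptors, e.g. mean RMSF
    fold_improvement: Optional[float] = None  # measured label, if known


def encode_mutations(mutations: List[str]) -> List[float]:
    """Crude sequence descriptor: count of each amino acid introduced."""
    counts = [0.0] * len(AMINO_ACIDS)
    for label in mutations:
        counts[AMINO_ACIDS.index(label[-1])] += 1.0
    return counts


def featurize(variant: Variant) -> List[float]:
    """Concatenate sequence, structure, and dynamics descriptor blocks."""
    return (
        encode_mutations(variant.mutations)
        + variant.structure_desc
        + variant.dynamics_desc
    )


def predict_knn(train: List[Variant], query: Variant, k: int = 2) -> float:
    """Average the labels of the k training variants closest in feature space."""
    q = featurize(query)
    scored = sorted(
        (sqrt(sum((a - b) ** 2 for a, b in zip(featurize(t), q))), t.fold_improvement)
        for t in train
    )
    nearest = scored[:k]
    return sum(label for _, label in nearest) / len(nearest)
```

In practice the descriptor blocks would usually be scaled before being combined, since a Euclidean distance over raw values lets the block with the largest numeric range dominate; any regressor could replace the nearest-neighbour step.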