Ovchinnikov Victor, Karplus Martin
Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, USA.
Laboratoire de Chimie Biophysique, ISIS, Université de Strasbourg, Strasbourg, France.
J Comput Chem. 2025 Jan 5;46(1):e27512. doi: 10.1002/jcc.27512. Epub 2024 Oct 15.
Prediction of protein fitness from computational modeling is an area of active research in rational protein design. Here, we investigated whether protein fluctuations computed from molecular dynamics (MD) simulations can be used to predict the expression levels of SARS-CoV-2 receptor binding domain (RBD) mutants determined in the deep mutational scanning experiment of Starr et al. [Science (New York, N.Y.) 2022, 377, 420]. Specifically, we performed more than 0.7 milliseconds of MD simulations of 557 mutant RBDs in triplicate to achieve statistical significance under various simulation conditions. Our results show a modest but significant anticorrelation, with correlation coefficients in the range [-0.4, -0.3], between expression and RBD protein flexibility. A simple linear regression machine learning model achieved correlation coefficients in the range [0.7, 0.8], thus outperforming MD-based models, but required about 25 mutations at each residue position for training.
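As an illustration of the kind of analysis summarized above, the following minimal Python sketch averages a per-mutant flexibility measure over triplicate simulations and computes its Pearson correlation with expression. It is not the authors' pipeline: the data are synthetic, and the names `pearson_r`, `mean_over_replicas`, `flex`, and `expression`, as well as the noise levels, are assumptions chosen only to show the calculation.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))


def mean_over_replicas(values):
    """Average a (n_mutants, n_replicas) array over the replica axis."""
    return np.asarray(values, dtype=float).mean(axis=1)


# Synthetic stand-in data (hypothetical; NOT the data from the paper).
rng = np.random.default_rng(0)
n_mutants, n_replicas = 557, 3

# Hypothetical "true" flexibility of each mutant, e.g. a mean fluctuation.
flex_true = rng.normal(1.0, 0.2, size=n_mutants)

# Each replica measures the flexibility with some sampling noise.
flex = flex_true[:, None] + rng.normal(0.0, 0.05, size=(n_mutants, n_replicas))

# Assumed toy relationship: more flexible mutants express less, plus noise.
expression = -1.0 * flex_true + rng.normal(0.0, 0.5, size=n_mutants)

r = pearson_r(mean_over_replicas(flex), expression)
print(f"Pearson r between flexibility and expression: {r:.2f}")
```

Averaging over replicas before correlating reduces the sampling noise in the flexibility estimate; with real data, `flex` would hold the fluctuation measure from each simulation and `expression` the measured expression level of each mutant.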