Feng Hongsong, Zhao Jeffrey Y, Wei Guo-Wei
Department of Mathematics, Michigan State University, East Lansing, Michigan, USA.
Vestavia Hills High School, Vestavia Hills, Alabama, USA.
J Comput Chem. 2025 Mar 15;46(7):e70073. doi: 10.1002/jcc.70073.
Protein structural fluctuations, measured by Debye-Waller factors or B-factors, are known to be closely associated with protein flexibility and function. Theoretical approaches have also been developed to predict B-factor values, which reflect protein flexibility. Previous models have made significant strides in analyzing B-factors by fitting experimental data. In this study, we propose a novel approach for B-factor prediction using differential geometry theory, based on the assumption that the intrinsic properties of proteins reside on a family of low-dimensional manifolds embedded within the high-dimensional space of protein structures. By analyzing the mean and Gaussian curvatures of a set of low-dimensional manifolds defined by kernel functions, we develop effective and robust multiscale differential geometry (mDG) models. Our mDG model demonstrates a 27% increase in accuracy compared to the classical Gaussian network model (GNM) in predicting B-factors for a dataset of 364 proteins. Additionally, by incorporating both global and local protein features, we construct a highly effective machine-learning model for the blind prediction of B-factors. Extensive least-squares approximations and machine learning-based blind predictions validate the effectiveness of the mDG modeling approach for B-factor predictions.
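As a rough illustration of the least-squares workflow the abstract mentions, the sketch below builds one kernel-based feature per length scale for each atom and fits those features linearly to experimental B-factors. It is not the authors' mDG model: it does not compute mean or Gaussian curvatures, and instead substitutes a simple sum of generalized exponential kernels as a stand-in per-atom feature. The function names, the scale values, and the exponent `kappa` are all assumptions chosen for illustration.

```python
import numpy as np

def kernel_features(coords, scales=(3.0, 7.0, 15.0), kappa=2.0):
    """Hypothetical multiscale features: for each scale eta, sum
    exp(-(r_ij / eta)**kappa) over all other atoms j.
    This is a stand-in for the curvature features of the mDG model."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances, shape (n_atoms, n_atoms)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    feats = []
    for eta in scales:
        k = np.exp(-(dist / eta) ** kappa)
        np.fill_diagonal(k, 0.0)  # exclude each atom's self-term
        feats.append(k.sum(axis=1))
    return np.column_stack(feats)  # shape (n_atoms, n_scales)

def fit_b_factors(features, b_exp):
    """Least-squares fit of the features (plus an intercept) to
    experimental B-factors; returns the coefficients, the fitted
    values, and the Pearson correlation between fit and experiment."""
    b_exp = np.asarray(b_exp, dtype=float)
    X = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(X, b_exp, rcond=None)
    b_pred = X @ coef
    pcc = np.corrcoef(b_pred, b_exp)[0, 1]
    return coef, b_pred, pcc
```

In practice `coords` would hold the C-alpha coordinates of one protein and `b_exp` the B-factors from the same structure file; the Pearson correlation is the kind of per-protein score that could then be compared against a baseline such as GNM.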