Department of Control Engineering, Northeastern University, Qinhuangdao, Hebei, China.
Department of Naval Architecture and Marine Engineering, University of Michigan, Ann Arbor, Michigan, USA.
Protein Sci. 2022 Nov;31(11):e4467. doi: 10.1002/pro.4467.
Predicting the change in protein thermostability upon mutation is crucial for understanding diseases and designing therapeutics. However, accurately estimating the change in a protein's Gibbs free energy remains a challenge. Some methods struggle to generalize to examples with no homology and produce uncalibrated predictions. Here we leverage advances in graph neural networks for protein feature extraction to tackle this structure-property prediction task. Our method, BayeStab, is evaluated on four test datasets (S669, S611, S350, and Myoglobin) and shows strong generalization and symmetry performance. In addition, we apply Bayesian neural networks enabled by concrete dropout to infer plausible models and estimate uncertainty. By decomposing the uncertainty into a part induced by data noise and a part induced by the model, we demonstrate that the probabilistic method gives insight into the inherent noise of the training datasets, which is closely related to the upper bound on performance for this task. Finally, the BayeStab web server is available at http://www.bayestab.com. The code for this work is available at https://github.com/HongzhouTang/BayeStab.
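The decomposition of uncertainty into a data-noise part and a model part can be illustrated with a minimal sketch. It assumes the commonly used scheme for dropout-based Bayesian networks, in which each stochastic forward pass returns a predicted mean and a predicted noise variance; the abstract does not state that BayeStab uses exactly this formulation. The function name `decompose_uncertainty` and the sample values are hypothetical and are not taken from the BayeStab code.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split the predictive variance for one mutation into two parts.

    means:     predicted stability change from each stochastic forward pass
    variances: predicted noise variance from each stochastic forward pass
    (Assumption: the network outputs both quantities on every pass.)
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal in length")

    aleatoric = fmean(variances)   # data noise: average of the predicted variances
    epistemic = pvariance(means)   # model uncertainty: spread of the predicted means
    return {
        "mean": fmean(means),
        "aleatoric": aleatoric,
        "epistemic": epistemic,
        "total": aleatoric + epistemic,
    }


if __name__ == "__main__":
    # Hypothetical outputs of three dropout passes for a single mutation.
    result = decompose_uncertainty([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

Under this assumption, the data-noise term reflects measurement noise in the training labels and does not shrink as more training data is added, whereas the model term does shrink. This is why the data-noise term is informative about the best accuracy attainable on the task.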