Transfer learning to leverage larger datasets for improved prediction of protein stability changes.

Author information

Dieckhaus Henry, Brocidiacono Michael, Randolph Nicholas, Kuhlman Brian

Affiliations

Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA.

Division of Chemical Biology and Medicinal Chemistry, University of North Carolina Eshelman School of Pharmacy, Chapel Hill, North Carolina, USA.

Publication information

bioRxiv. 2023 Jul 30:2023.07.27.550881. doi: 10.1101/2023.07.27.550881.

Abstract

Amino acid mutations that lower a protein's thermodynamic stability are implicated in numerous diseases, and engineered proteins with enhanced stability are important in research and medicine. Computational methods for predicting how mutations perturb protein stability are therefore of great interest. Despite recent advancements in protein design using deep learning, prediction of stability changes has remained challenging, in part due to a lack of large, high-quality training datasets for model development. Here we introduce ThermoMPNN, a deep neural network trained to predict stability changes for protein point mutations given an initial structure. In doing so, we demonstrate the utility of a newly released mega-scale stability dataset for training a robust stability model. We also employ transfer learning to leverage a second, larger dataset by using learned features extracted from a deep neural network trained to predict a protein's amino acid sequence given its three-dimensional structure. We show that our method achieves competitive performance on established benchmark datasets using a lightweight model architecture that allows for rapid, scalable predictions. Finally, we make ThermoMPNN readily available as a tool for stability prediction and design.
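
The transfer-learning pattern described in the abstract (reuse features from a network trained on a larger dataset, then fit a lightweight model on the stability data) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not ThermoMPNN's architecture or data: `FROZEN_EMBEDDING` is a hypothetical stand-in for the pretrained structure-to-sequence network, the linear head and its hyperparameters are arbitrary, and the stability labels are synthetic.

```python
import random

random.seed(0)

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
FEATURE_DIM = 8  # arbitrary size chosen for this sketch

# Hypothetical stand-in for a pretrained, frozen feature extractor.
# In the abstract this role is played by a network trained to predict
# sequence from structure; here it is only a fixed random embedding table.
FROZEN_EMBEDDING = {
    aa: [random.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
    for aa in AMINO_ACIDS
}


def extract_features(wild_type, mutant):
    """Frozen features for a point mutation: mutant minus wild-type embedding."""
    wt, mt = FROZEN_EMBEDDING[wild_type], FROZEN_EMBEDDING[mutant]
    return [m - w for w, m in zip(wt, mt)]


def predict(weights, bias, features):
    """Lightweight linear head applied to the frozen features."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_head(examples, epochs=200, lr=0.01):
    """Fit only the head with SGD on squared error; the extractor is never updated."""
    weights, bias = [0.0] * FEATURE_DIM, 0.0
    for _ in range(epochs):
        for features, target in examples:
            error = predict(weights, bias, features) - target
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Synthetic stability-change labels from a hidden linear rule (made-up data).
hidden_rule = [random.uniform(-1.0, 1.0) for _ in range(FEATURE_DIM)]


def make_example():
    wild_type, mutant = random.sample(AMINO_ACIDS, 2)
    features = extract_features(wild_type, mutant)
    label = sum(w * x for w, x in zip(hidden_rule, features))
    return features, label


train_set = [make_example() for _ in range(200)]
weights, bias = train_head(train_set)
mse = sum((predict(weights, bias, f) - y) ** 2 for f, y in train_set) / len(train_set)
print(f"training MSE of the head on frozen features: {mse:.6f}")
```

Only the small head is trained here, which is why this style of model can stay lightweight: the expensive representation is computed once by the frozen extractor and reused for every candidate mutation.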

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf6/10402116/a49e13577e46/nihpp-2023.07.27.550881v1-f0001.jpg
