Protein stability models fail to capture epistatic interactions of double point mutations.

Author information

Dieckhaus Henry, Kuhlman Brian

Affiliations

Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA.

Division of Chemical Biology and Medicinal Chemistry, University of North Carolina Eshelman School of Pharmacy, Chapel Hill, North Carolina, USA.

Publication information

bioRxiv. 2024 Aug 21:2024.08.20.608844. doi: 10.1101/2024.08.20.608844.

Abstract

There is strong interest in accurate methods for predicting changes in protein stability resulting from amino acid mutations to the protein sequence. Recombinant proteins must often be stabilized to be used as therapeutics or reagents, and destabilizing mutations are implicated in a variety of diseases. Due to increased data availability and improved modeling techniques, recent studies have shown advancements in predicting changes in protein stability when a single point mutation is made. Less focus has been directed toward predicting changes in protein stability when there are two or more mutations, despite the significance of mutation clusters for disease pathways and protein design studies. Here, we analyze the largest available dataset of double point mutation stability and benchmark several widely used protein stability models on this and other datasets. We identify a blind spot in how predictors are typically evaluated on multiple mutations, finding that, contrary to assumptions in the field, current stability models are unable to consistently capture epistatic interactions between double mutations. We observe one notable deviation from this trend, which is that epistasis-aware models provide marginally better predictions on stabilizing double point mutations. We develop an extension of the ThermoMPNN framework for double mutant modeling as well as a novel data augmentation scheme which mitigates some of the limitations in available datasets. Collectively, our findings indicate that current protein stability models fail to capture the nuanced epistatic interactions between concurrent mutations due to several factors, including training dataset limitations and insufficient model sensitivity.
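For concreteness, epistasis between two mutations is commonly quantified as the deviation of the measured double-mutant stability change from the sum of the two single-mutant changes. The sketch below illustrates that definition only; it is not the authors' code or the ThermoMPNN API. The class name, function names, and toy values are hypothetical, and the sign convention for ΔΔG (whether negative means stabilizing) is left open because it varies between datasets.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DoubleMutant:
    """Hypothetical record of measured stability changes (e.g. in kcal/mol)."""
    ddg_a: float   # measured ddG of single mutation A
    ddg_b: float   # measured ddG of single mutation B
    ddg_ab: float  # measured ddG of the double mutant A+B


def additive_ddg(m: DoubleMutant) -> float:
    """Additive (no-epistasis) estimate: sum of the two single-mutant ddG values."""
    return m.ddg_a + m.ddg_b


def epistasis(m: DoubleMutant) -> float:
    """Deviation of the measured double-mutant ddG from the additive estimate."""
    return m.ddg_ab - additive_ddg(m)


def mean_abs_error(pred: List[float], true: List[float]) -> float:
    """Mean absolute error between two equal-length, non-empty lists."""
    if len(pred) != len(true) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)


# Toy values, invented for illustration only (not taken from any dataset).
mutants = [
    DoubleMutant(ddg_a=1.0, ddg_b=0.5, ddg_ab=1.5),       # purely additive
    DoubleMutant(ddg_a=1.0, ddg_b=0.5, ddg_ab=0.75),      # non-additive
    DoubleMutant(ddg_a=-0.25, ddg_b=-0.25, ddg_ab=-1.0),  # non-additive
]

epistasis_terms = [epistasis(m) for m in mutants]
additive_preds = [additive_ddg(m) for m in mutants]
measured = [m.ddg_ab for m in mutants]

# The additive baseline's error on each double mutant equals |epistasis|.
baseline_mae = mean_abs_error(additive_preds, measured)

print(epistasis_terms)
print(baseline_mae)
```

Under this definition, a predictor that simply sums its single-mutant estimates can still score well on total double-mutant ΔΔG whenever the epistatic term is small. Comparing a model against such an additive baseline, or scoring the epistatic term directly, is one way to test whether it captures the interaction itself.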

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8afb/11370451/a7a6abe9d294/nihpp-2024.08.20.608844v1-f0001.jpg
