Ertelt Moritz, Schlegel Phillip, Beining Max, Kaysser Leonard, Meiler Jens, Schoeder Clara T
Institute for Drug Discovery, Leipzig University, Faculty of Medicine, Leipzig, Germany.
Center for Scalable Data Analytics and Artificial Intelligence ScaDS.AI, Dresden/Leipzig, Germany.
bioRxiv. 2024 Dec 1:2024.11.26.625397. doi: 10.1101/2024.11.26.625397.
Stability is key to using recombinant proteins in therapeutic or biotechnological applications. Deep learning protein design approaches like ProteinMPNN have shown strong performance both in creating novel proteins and in stabilizing existing ones. However, the stability of the designs is unlikely to significantly exceed that of the natural proteins in the training set, which are biophysically only marginally stable. Therefore, we collected predicted protein structures from hyperthermophiles, whose amino acid composition differs substantially from that of mesophiles. Notably, ProteinMPNN fails to recover this unique amino acid composition. Here we show that a network retrained on predicted protein structures from hyperthermophiles, termed HyperMPNN, not only recovers this composition but can also be applied to proteins from non-hyperthermophiles. Applied to a protein nanoparticle with a melting temperature of 65°C, this approach yielded designs that remained stable at 95°C. In conclusion, we created a new way to design highly thermostable proteins through self-supervised learning on data from hyperthermophiles.
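The abstract's claim about amino acid composition can be made concrete with a small comparison of residue frequencies between two sets of sequences. The sketch below is only an illustration: the abstract does not describe how composition was measured, so the function names `composition` and `composition_shift` and the toy sequences are hypothetical and are not taken from the HyperMPNN work.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences):
    """Return the fraction of each standard amino acid pooled over all sequences."""
    counts = Counter()
    for seq in sequences:
        # Ignore gaps, unknown residues (e.g. 'X'), and other non-standard symbols.
        counts.update(aa for aa in seq.upper() if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def composition_shift(reference, designed):
    """Per-residue frequency difference: designed minus reference."""
    ref = composition(reference)
    des = composition(designed)
    return {aa: des[aa] - ref[aa] for aa in AMINO_ACIDS}


if __name__ == "__main__":
    # Toy placeholder sequences (assumption), not real proteins or real designs.
    native = ["MKTAYIAKQR", "GSSGLVPRGS"]
    designed = ["MKEIYIEKKR", "GEEKLVPRGE"]
    shift = composition_shift(native, designed)
    # Show the five residues whose frequency changed the most.
    for aa, delta in sorted(shift.items(), key=lambda kv: -abs(kv[1]))[:5]:
        print(f"{aa}: {delta:+.3f}")
```

In practice the two inputs might be a set of mesophilic sequences and a set of hyperthermophilic or redesigned sequences; a positive value then marks a residue that is enriched in the second set.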