Lobzaev Evgenii, Herrera Michael A, Kasprzyk Martyna, Stracquadanio Giovanni
School of Biological Sciences, The University of Edinburgh, Edinburgh, United Kingdom.
School of Informatics, The University of Edinburgh, Edinburgh, United Kingdom.
Nat Commun. 2024 Dec 1;15(1):10447. doi: 10.1038/s41467-024-54814-w.
Engineering proteins is a challenging task requiring the exploration of a vast design space. Traditionally, this is achieved using Directed Evolution (DE), which is a laborious process. Generative deep learning, instead, can learn biological features of functional proteins from sequence and structural datasets and return novel variants. However, most models do not generate thermodynamically stable proteins, thus leading to many non-functional variants. Here we propose a model called PRotein Engineering by Variational frEe eNergy approximaTion (PREVENT), which generates stable and functional variants by learning the sequence and thermodynamic landscape of a protein. We evaluate PREVENT by designing 40 variants of the conditionally essential E. coli phosphotransferase N-acetyl-L-glutamate kinase (EcNAGK). We find 85% of the variants to be functional, with 55% of them showing similar growth rate compared to the wildtype enzyme, despite harbouring up to 9 mutations. Our results support a new approach that can significantly accelerate protein engineering.