Prediction and Design of Protease Enzyme Specificity Using a Structure-Aware Graph Convolutional Network.

Authors

Lu Changpeng, Lubin Joseph H, Sarma Vidur V, Stentz Samuel Z, Wang Guanyang, Wang Sijian, Khare Sagar D

Affiliations

Institute for Quantitative Biomedicine, Rutgers - The State University of New Jersey, Piscataway, NJ.

Department of Chemistry & Chemical Biology, Rutgers - The State University of New Jersey, Piscataway, NJ.

Publication information

bioRxiv. 2023 Feb 16:2023.02.16.528728. doi: 10.1101/2023.02.16.528728.

Abstract

Site-specific proteolysis by the enzymatic cleavage of small linear sequence motifs is a key post-translational modification involved in physiology and disease. The ability to robustly and rapidly predict protease substrate specificity would also enable targeted proteolytic cleavage (editing) of a target protein by designed proteases. Current methods for predicting protease specificity are limited to sequence pattern recognition in experimentally derived cleavage data obtained for libraries of potential substrates and generated separately for each protease variant. We reasoned that a more semantically rich and robust model of protease specificity could be developed by incorporating the three-dimensional structure and energetics of molecular interactions between protease and substrates into machine learning workflows. We present Protein Graph Convolutional Network (PGCN), which develops a physically grounded, structure-based molecular interaction graph representation that describes molecular topology and interaction energetics to predict enzyme specificity. We show that PGCN accurately predicts the specificity landscapes of several variants of two model proteases: the NS3/4 protease from the Hepatitis C virus (HCV) and the Tobacco Etch Virus (TEV) protease. Node and edge ablation tests identified key graph elements for specificity prediction, some of which are consistent with known biochemical constraints for protease:substrate recognition. We used a pre-trained PGCN model to guide the design of TEV protease libraries for cleaving two non-canonical substrates, and found good agreement with experimental cleavage results. Importantly, the model can accurately assess designs featuring diversity at positions not present in the training data. The described methodology should enable the structure-based prediction of specificity landscapes of a wide variety of proteases and the construction of tailor-made protease editors for site-selectively and irreversibly modifying chosen target proteins.
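The abstract does not specify the PGCN architecture or its features, so the following is only a rough illustration of the general idea of a graph convolution over an energy-weighted interaction graph. It uses the standard normalized propagation rule ReLU(Â H W) from the generic GCN literature, not the authors' implementation. The four-residue graph, the feature values, the energy-to-weight transform `edge_weight`, and the weight matrix are all hypothetical and chosen only to make the sketch runnable.

```python
import math

# Hypothetical toy graph: 4 residue nodes (2 protease, 2 substrate).
# Node features: [is_substrate flag, made-up per-residue energy].
node_features = [
    [0.0, -1.2],
    [0.0, -0.4],
    [1.0, -0.9],
    [1.0, 0.3],
]

# Hypothetical edges: (node i, node j, made-up pairwise interaction energy).
edges = [(0, 1, -0.5), (0, 2, -2.0), (1, 3, -0.8), (2, 3, -0.3)]


def edge_weight(energy):
    """Assumed transform: more favorable (more negative) energy gives a larger weight."""
    return math.exp(-energy)


def normalized_adjacency(n, edge_list):
    """Build D^-1/2 (A + I) D^-1/2 for a symmetric weighted graph with self-loops."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0  # self-loop so each node keeps its own features
    for i, j, energy in edge_list:
        w = edge_weight(energy)
        a[i][j] = w
        a[j][i] = w
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def gcn_layer(a_hat, h, w):
    """One graph convolution step: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Arbitrary, untrained 2x3 weight matrix (illustrative only).
weights = [[0.5, -0.3, 0.1], [-0.2, 0.4, 0.6]]

a_hat = normalized_adjacency(len(node_features), edges)
hidden = gcn_layer(a_hat, node_features, weights)

# Mean-pool node embeddings into one graph-level vector, which a
# classifier head could map to a cleaved / not-cleaved prediction.
graph_embedding = [sum(col) / len(hidden) for col in zip(*hidden)]
```

In a setup like this, removing a node or an edge from the graph and re-running the model is one simple way to picture the node and edge ablation tests mentioned in the abstract.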

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f442/9949123/2f21f7991a75/nihpp-2023.02.16.528728v1-f0001.jpg