Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA 30602, USA.
Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA.
Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btac733.
As more experimentally determined protein structures become available, data-driven models of protein sequence-structure relationships become more feasible. Within this space, designing the amino acid sequences of protein-protein interactions remains a challenging subproblem with very low success rates, yet such interactions are central to most biological processes.
We developed an attention-based deep learning model, inspired by algorithms used for image-caption assignment, to design peptide or protein fragment sequences. The trained model can be applied to redesign natural protein interfaces or designed protein interaction fragments. Here, we validate its potential by recapitulating naturally occurring protein-protein interactions, including antibody-antigen complexes. The designed interfaces accurately capture essential native interactions and have binding affinities in silico comparable to those of the native interfaces. Furthermore, our model does not need a precise backbone location, making it an attractive tool for de novo design of protein-protein interactions.
The source code of the method is available at https://github.com/strauchlab/iNNterfaceDesign.
Supplementary data are available at Bioinformatics online.