Liu Chen, Li Mingchen, Tan Yang, Gou Wenrui, Fan Guisheng, Zhou Bingxin
School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China.
Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China.
Bioinformatics. 2025 Aug 2;41(8). doi: 10.1093/bioinformatics/btaf446.
A pivotal area of research in antibody engineering is to find effective modifications that enhance antibody-antigen binding affinity. Traditional wet-lab experiments assess mutants in a costly and time-consuming manner. Emerging deep learning solutions offer an alternative by modeling antibody structures to predict binding affinity changes. However, they heavily depend on high-quality complex structures, which are frequently unavailable in practice. Therefore, we propose ProtAttBA, a deep learning model that predicts binding affinity changes based solely on the sequence information of antibody-antigen complexes.
ProtAttBA employs a pre-training phase to learn protein sequence patterns, followed by a supervised training phase that uses labeled antibody-antigen complex data to train a cross-attention-based regressor for predicting binding affinity changes. We evaluated ProtAttBA on three open benchmarks under different conditions. Compared to both sequence- and structure-based prediction methods, our approach achieves competitive performance and is notably robust, especially when complex structures are uncertain. Our method is also interpretable through its attention mechanism: we show that the learned attention scores can identify critical residues that affect binding affinity. This work introduces a rapid and cost-effective computational tool for antibody engineering, with the potential to accelerate the development of novel therapeutic antibodies.
Source code and data are available at https://github.com/code4luck/ProtAttBA.
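To make the idea of a cross-attention-based regressor concrete, the following is a minimal sketch in plain Python and is not the ProtAttBA implementation. It assumes per-residue embeddings are already available (in practice they would come from a pre-trained protein language model), uses antibody residues as queries and antigen residues as keys and values, omits learned projection matrices, and scores the mutant by a hypothetical linear head applied to the difference between pooled mutant and wild-type representations. The names `cross_attention`, `predict_change`, and `head_weights` are illustrative assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: per-residue embeddings of one chain (assumed: antibody).
    keys, values: per-residue embeddings of the other chain (assumed: antigen).
    Returns (outputs, weights), where weights[i][j] is how strongly
    query residue i attends to key residue j.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out = [
            sum(w[j] * values[j][c] for j in range(len(values)))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(v[c] for v in vectors) / n for c in range(len(vectors[0]))]


def predict_change(ab_wt, ab_mut, antigen, head_weights, head_bias=0.0):
    """Hypothetical regressor: linear head on (mutant - wild-type) pooled features.

    Also returns the mutant attention weights so they can be inspected.
    """
    wt_out, _ = cross_attention(ab_wt, antigen, antigen)
    mut_out, mut_weights = cross_attention(ab_mut, antigen, antigen)
    diff = [m - w for m, w in zip(mean_pool(mut_out), mean_pool(wt_out))]
    score = sum(hw * x for hw, x in zip(head_weights, diff)) + head_bias
    return score, mut_weights


# Toy embeddings (made up for illustration): 2 antibody residues, 3 antigen residues.
antigen = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ab_wt = [[0.5, 0.2], [0.1, 0.9]]
ab_mut = [[0.5, 0.2], [0.8, 0.1]]  # second residue "mutated"

score, attn = predict_change(ab_wt, ab_mut, antigen, head_weights=[0.3, -0.7])

# Rank antigen residues by the total attention they receive from the antibody.
received = [sum(row[j] for row in attn) for j in range(len(antigen))]
ranking = sorted(range(len(antigen)), key=lambda j: received[j], reverse=True)
```

In this sketch, the per-residue attention totals in `received` stand in for the kind of attention-score inspection described above; a trained model would use learned projections and a learned regression head rather than the fixed values shown here.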