Accelerating antibody development: sequence and structure-based models for predicting developability properties via size exclusion chromatography.

Author information

Abeer A N M Nafiz, Boroumand Mehdi, Sermadiras Isabelle, Caldwell Jenna G, Stanev Valentin, Mody Neil, Kaplan Gilad, Savery James, Croasdale-Wood Rebecca, Pouryahya Maryam

Affiliations

Data Science and Modelling, BioPharmaceuticals R&D, AstraZeneca, Gaithersburg, MD, USA.

Department of Electrical and Computer Engineering, Texas A&M University, R&D, AstraZeneca, College Station, TX, USA.

Publication information

MAbs. 2025 Dec;17(1):2562997. doi: 10.1080/19420862.2025.2562997. Epub 2025 Sep 26.

DOI:10.1080/19420862.2025.2562997

PMID:41004127

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12477876/

Abstract

Experimental screening for biopharmaceutical developability properties typically relies on resource-intensive and time-consuming assays such as size exclusion chromatography (SEC). This study highlights the potential of in silico models to accelerate the screening process by exploring sequence- and structure-based machine learning techniques. Specifically, we compared surrogate models based on pre-computed features extracted from sequence and predicted structure with sequence-based approaches using protein language models (PLMs) such as ESM-2. In addition to different end-to-end fine-tuning strategies for PLMs, we also investigated integrating the structural information of the antibodies into the prediction pipeline through graph neural networks (GNNs). We applied these methods to predict protein aggregation propensity using a dataset of approximately 1200 immunoglobulin G1 (IgG1) molecules. Through this empirical evaluation, our study identifies the most effective in silico approach for predicting developability properties for SEC assays, thereby adding insights to existing screening efforts for accelerating the antibody development process.
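
The abstract does not say which descriptors or regressors the feature-based surrogate models use. As a rough illustration of the general idea only (pre-computed sequence features fed to a simple regressor), the sketch below computes two toy descriptors per sequence and fits a closed-form ridge regression. The descriptor choices, the helper names (`featurize`, `fit_ridge`, `predict`), the example sequences, and the target values are all assumptions made for illustration; none of them come from the paper.

```python
"""Toy surrogate model: hand-computed sequence descriptors -> ridge regression.

Illustrative only: descriptors, sequences, and targets are made up and are
not the features, data, or models used in the paper.
"""
from typing import List

HYDROPHOBIC = set("AVILMFWY")
POSITIVE = set("KR")
NEGATIVE = set("DE")


def featurize(seq: str) -> List[float]:
    """Return [bias term, hydrophobic fraction, net charge per residue]."""
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    hydrophobic = sum(aa in HYDROPHOBIC for aa in seq) / n
    positive = sum(aa in POSITIVE for aa in seq)
    negative = sum(aa in NEGATIVE for aa in seq)
    return [1.0, hydrophobic, (positive - negative) / n]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ridge(X: List[List[float]], y: List[float], lam: float = 1e-3) -> List[float]:
    """Closed-form ridge: w = (X^T X + lam * I)^-1 X^T y.

    For brevity the bias weight is penalised along with the others.
    """
    d = len(X[0])
    xtx = [
        [sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0) for j in range(d)]
        for i in range(d)
    ]
    xty = [sum(row[i] * target for row, target in zip(X, y)) for i in range(d)]
    return solve(xtx, xty)


def predict(w: List[float], seq: str) -> float:
    """Predicted property value for one sequence."""
    return sum(wi * fi for wi, fi in zip(w, featurize(seq)))


if __name__ == "__main__":
    # Hypothetical sequence fragments with made-up aggregation scores.
    train_seqs = ["AVILKD", "KKDDEE", "AAAAKK", "DEDEAV"]
    train_y = [4.1, 0.9, 3.2, 2.5]
    weights = fit_ridge([featurize(s) for s in train_seqs], train_y)
    print(predict(weights, "WWYYKR"))
```

A PLM-based variant would replace `featurize` with embeddings from a pretrained model; that step is omitted here because it depends on model weights and library versions not specified in the abstract.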

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6828/12477876/9742833c593e/KMAB_A_2562997_F0001_OC.jpg
