Accelerating antibody development: sequence and structure-based models for predicting developability properties via size exclusion chromatography.

Author information

Abeer A N M Nafiz, Boroumand Mehdi, Sermadiras Isabelle, Caldwell Jenna G, Stanev Valentin, Mody Neil, Kaplan Gilad, Savery James, Croasdale-Wood Rebecca, Pouryahya Maryam

Affiliations

Data Science and Modelling, BioPharmaceuticals R&D, AstraZeneca, Gaithersburg, MD, USA.

Department of Electrical and Computer Engineering, Texas A&M University, R&D, AstraZeneca, College Station, TX, USA.

Publication information

MAbs. 2025 Dec;17(1):2562997. doi: 10.1080/19420862.2025.2562997. Epub 2025 Sep 26.

Abstract

Experimental screening for biopharmaceutical developability properties typically relies on resource-intensive and time-consuming assays such as size exclusion chromatography (SEC). This study highlights the potential of in silico models to accelerate the screening process by exploring sequence- and structure-based machine learning techniques. Specifically, we compared surrogate models based on pre-computed features extracted from sequence and predicted structure with sequence-based approaches using protein language models (PLMs) such as ESM-2. In addition to different end-to-end fine-tuning strategies for PLMs, we also investigated integrating the structural information of the antibodies into the prediction pipeline through graph neural networks (GNNs). We applied these methods to predict protein aggregation propensity using a dataset of approximately 1200 immunoglobulin G1 (IgG1) molecules. Through this empirical evaluation, our study identifies the most effective in silico approach for predicting developability properties for SEC assays, thereby adding insights to existing screening efforts for accelerating the antibody development process.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6828/12477876/9742833c593e/KMAB_A_2562997_F0001_OC.jpg
