Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, MI, USA.
Biointerfaces Institute, University of Michigan, Ann Arbor, MI, USA.
MAbs. 2024 Jan-Dec;16(1):2303781. doi: 10.1080/19420862.2024.2303781. Epub 2024 Mar 12.
Early identification of antibody candidates with drug-like properties is essential for simplifying the development of safe and effective antibody therapeutics. For subcutaneous administration, it is important to identify candidates with low self-association so that they can be formulated at high concentration while maintaining low viscosity, opalescence, and aggregation. Here, we report an interpretable machine learning model for predicting antibody (IgG1) variants with low viscosity using only the sequences of their variable (Fv) regions. Our model was trained on antibody viscosity data (>100 mg/mL mAb concentration) obtained at a common formulation pH (pH 5.2), and it identifies three key Fv features linked to viscosity, namely isoelectric point, hydrophobic patch size, and number of negatively charged patches. Of these three features, a low Fv isoelectric point (Fv pI < 6.3) accounts for most of the antibodies predicted to be at risk for high viscosity, both among the antibodies with diverse germlines in our study (79 mAbs) and among clinical-stage IgG1s (94 mAbs). Our model identifies viscous antibodies with relatively high accuracy not only in our training and test sets but also in previously reported data. Importantly, we show that the interpretable nature of the model enables the design of mutations that significantly reduce antibody viscosity, which we confirmed experimentally. We expect that this approach can be readily integrated into the drug development process to reduce the need for experimental viscosity screening and improve the identification of antibody candidates with drug-like properties.
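As a rough illustration of the Fv pI criterion in the abstract, the sketch below estimates an isoelectric point from the VH and VL sequences and flags values below 6.3. It is not the authors' model: it covers only one of the three features, the pKa values are an assumed textbook-style set, each chain is treated as having free termini, and the function names and toy sequences are hypothetical.

```python
# Minimal sketch: estimate an Fv isoelectric point from sequence and apply
# the Fv pI < 6.3 cutoff quoted in the abstract. The pKa values below are an
# ASSUMED textbook-style set, not necessarily the scale used in the paper.
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5, "N_TERM": 8.6}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1, "C_TERM": 3.6}

FV_PI_CUTOFF = 6.3  # threshold stated in the abstract


def net_charge(chains, ph):
    """Henderson-Hasselbalch net charge of one or more chains at a given pH."""
    charge = 0.0
    for seq in chains:
        # Simplification: one free N-terminus and C-terminus per chain.
        charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA["N_TERM"]))
        charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA["C_TERM"] - ph))
        for aa in seq.upper():
            if aa in POSITIVE_PKA:
                charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
            elif aa in NEGATIVE_PKA:
                charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def estimate_pi(chains, tol=1e-4):
    """Find the pH of zero net charge by bisection on the interval [0, 14]."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # Net charge decreases monotonically with pH.
        if net_charge(chains, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def low_pi_risk(vh, vl, cutoff=FV_PI_CUTOFF):
    """Return (estimated Fv pI, True if it falls below the cutoff)."""
    pi = estimate_pi([vh, vl])
    return pi, pi < cutoff


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not real antibody Fv regions.
    toy_fvs = {
        "acidic_toy": ("EVQLDDSGEE", "DIQMTEDPSS"),
        "basic_toy": ("KVQLRKSGGK", "RIQMTKSPSR"),
    }
    for name, (vh, vl) in toy_fvs.items():
        pi, flagged = low_pi_risk(vh, vl)
        print(f"{name}: estimated pI = {pi:.2f}, low-pI risk flag = {flagged}")
```

A sequence-only estimate like this ignores structure, so it cannot stand in for the hydrophobic patch and negatively charged patch features, which depend on the surface arrangement of residues.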