In silico proof of principle of machine learning-based antibody design at unconstrained scale.

Affiliations

Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway.

Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland.

Publication information

MAbs. 2022 Jan-Dec;14(1):2031482. doi: 10.1080/19420862.2022.2031482.

Abstract

Generative machine learning (ML) has been postulated to become a major driver in the computational design of antigen-specific monoclonal antibodies (mAb). However, efforts to confirm this hypothesis have been hindered by the infeasibility of testing arbitrarily large numbers of antibody sequences for their most critical design parameters: paratope, epitope, affinity, and developability. To address this challenge, we leveraged a lattice-based antibody-antigen binding simulation framework, which incorporates a wide range of physiological antibody-binding parameters. The simulation framework enables the computation of synthetic antibody-antigen 3D-structures, and it functions as an oracle for unrestricted prospective evaluation and benchmarking of antibody design parameters of ML-generated antibody sequences. We found that a deep generative model, trained exclusively on antibody sequence (one dimensional: 1D) data can be used to design conformational (three dimensional: 3D) epitope-specific antibodies, matching, or exceeding the training dataset in affinity and developability parameter value variety. Furthermore, we established a lower threshold of sequence diversity necessary for high-accuracy generative antibody ML and demonstrated that this lower threshold also holds on experimental real-world data. Finally, we show that transfer learning enables the generation of high-affinity antibody sequences from low-N training data. Our work establishes a priori feasibility and the theoretical foundation of high-throughput ML-based mAb design.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ee3c/8986205/b9bf2b5c2412/KMAB_A_2031482_F0001_OC.jpg
