Modeling and Informatics, Merck & Co., Inc., South San Francisco, California.
Biophys J. 2024 Sep 3;123(17):2921-2933. doi: 10.1016/j.bpj.2024.06.003. Epub 2024 Jun 7.
Antibody thermostability is challenging to predict from sequence and/or structure. This difficulty is likely due to the absence of direct entropic information. Herein, we present AbMelt, in which we model the inherent flexibility of homologous antibody structures using molecular dynamics simulations at three temperatures and learn the relevant descriptors to predict the temperatures of aggregation (T_agg), melt onset (T_m,on), and melt (T_m). We observed that the radius of gyration deviation of the complementarity determining regions at 400 K is the descriptor with the strongest Pearson correlation with aggregation temperature (r = -0.68 ± 0.23), and that the deviation of internal molecular contacts at 350 K is the most strongly correlated descriptor with both T_m,on (r = -0.74 ± 0.04) and T_m (r = -0.69 ± 0.03). Moreover, after descriptor selection and machine learning regression, we predict on a held-out test set containing both internal and public data and achieve robust performance for all endpoints compared with baseline models (T_agg R^2 = 0.57 ± 0.11, T_m,on R^2 = 0.56 ± 0.01, and T_m R^2 = 0.60 ± 0.06). In addition, the robustness of the AbMelt molecular dynamics methodology is demonstrated by training on <5% of the data and still outperforming more traditional machine learning models trained on the entire data set of more than 500 internal antibodies. Users can predict thermostability measurements for antibody variable fragments by collecting descriptors and using AbMelt, which has been made available.
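As a rough illustration of the kind of descriptor analysis described above, the following Python sketch computes a radius of gyration deviation from a set of trajectory frames and its Pearson correlation with a temperature endpoint. It is not the AbMelt implementation: the function names are hypothetical, "deviation" is assumed here to mean the standard deviation over frames, and the trajectories and temperature values are synthetic toy data made up for the example.

```python
import numpy as np


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame; coords has shape (n_atoms, 3)."""
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = np.sum((coords - center) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


def rg_deviation(trajectory, masses):
    """Standard deviation of Rg over frames (assumed meaning of 'deviation').

    trajectory has shape (n_frames, n_atoms, 3).
    """
    rg = np.array([radius_of_gyration(frame, masses) for frame in trajectory])
    return float(np.std(rg))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    masses = np.ones(50)
    base = rng.normal(size=(50, 3))

    # Synthetic stand-ins for simulated antibodies: each "trajectory" is the
    # same base structure plus frame-to-frame noise of a different magnitude.
    noise_levels = [0.05, 0.10, 0.20, 0.40]
    descriptor = [
        rg_deviation(base + rng.normal(scale=s, size=(200, 50, 3)), masses)
        for s in noise_levels
    ]

    # Made-up aggregation temperatures, for illustration only.
    toy_t_agg = [75.0, 71.0, 66.0, 60.0]
    print("Pearson r:", pearson_r(descriptor, toy_t_agg))
```

In practice the frames would come from the molecular dynamics trajectories at each simulation temperature, restricted to the atoms of interest (for example, the complementarity determining regions), and the correlation would be computed against measured endpoints.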