Department of Chemistry, Purdue University, West Lafayette, IN, USA.
Analytical Research & Development, MRL, Merck & Co., Inc., Rahway, NJ, USA.
Pharm Res. 2023 Mar;40(3):701-710. doi: 10.1007/s11095-023-03475-3. Epub 2023 Feb 16.
Chemical and physical stability are two key attributes considered in pharmaceutical development. Chemical stability is typically reported as a combination of potency and degradation products, while physical stability is commonly measured with the fluorescent reporter Thioflavin-T. Stability studies are lengthy and resource-intensive. To reduce the resources and time they require during drug product development, we introduce a machine learning-based model that predicts chemical stability over time from both formulation conditions and aggregation curves.
In this work, we modeled the relationships among formulation, stability timepoint, and chemical stability measurements, and evaluated performance on a random test set. We developed a multilayer perceptron (MLP) for predicting total degradation and a random forest (RF) model for predicting potency.
On the test set, the MLP achieved a coefficient of determination (R²) of 0.945 and a mean absolute error (MAE) of 0.421 for total degradation (TD). Similarly, the RF model achieved an R² of 0.908 and an MAE of 1.435 for potency. When physical stability measurements were included in the MLP model, the MAE for TD prediction decreased to 0.148. With a similar strategy applied to potency prediction, the MAE of the RF model decreased to 0.705.
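For readers less familiar with the two reported metrics, the following is a minimal sketch of how a coefficient of determination and a mean absolute error are conventionally computed from measured and predicted values. The function names and the numbers are hypothetical and purely illustrative; they are not the study's data or code.

```python
from statistics import fmean


def mean_absolute_error(y_true, y_pred):
    """Average magnitude of the prediction errors, in the units of y."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up illustrative values, NOT measurements from the study.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

mae = mean_absolute_error(measured, predicted)
r2 = r_squared(measured, predicted)
print(f"MAE = {mae:.3f}, R^2 = {r2:.3f}")
```

A lower MAE means predictions sit closer to the measured values on average, while a coefficient of determination nearer to 1 means the model explains more of the variance in the measurements.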
We draw two main conclusions: first, chemical stability can be modeled using machine learning techniques; second, the physical stability of a peptide is related to its chemical stability.