Lu Tianshu, Wu Yiyang, Xiong Ping, Zhong Hao, Ding Yang, Li Haifeng, Ouyang Defang
Institute of Applied Physics and Materials Engineering, University of Macau, Macau, 999078, China.
State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macau, 999078, China.
Pharm Res. 2025 Apr;42(4):697-709. doi: 10.1007/s11095-025-03853-z. Epub 2025 Apr 3.
Amorphous solid dispersion (ASD) is widely used to enhance the solubility and bioavailability of water-insoluble drugs. However, conventional experimental approaches to ASD development are often resource-intensive and time-consuming. Machine learning (ML) algorithms have great potential for predicting ASD formulations, but they require extensive data to build reliable models. The current study aims to predict the formation of both binary and ternary ASDs by combining high-throughput screening (HTS) with ML approaches.
Micro-quantity HTS was conducted to generate 1272 binary and ternary solid dispersions using the solvent evaporation method. Powder X-ray diffraction (PXRD) was used to characterize the amorphous state of the formulations. The results indicated that 188 formulations successfully formed amorphous solid dispersions (ASDs), while 1084 remained crystalline. Model development employed nested cross-validation with four algorithms: Light Gradient Boosting Machine (LGBM), Random Forest (RF), Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP).
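The nested cross-validation scheme mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic (generated to mimic the 1272-sample size and the roughly 15% amorphous class share), and the fold counts, hyperparameter grid, F1 scoring, and class weighting are all assumptions chosen for the example. Only the Random Forest is shown; the other three algorithms would be swapped in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic stand-in for the HTS dataset (assumption): 1272 samples,
# about 15% positives to mimic 188 amorphous vs 1084 crystalline outcomes.
X, y = make_classification(
    n_samples=1272,
    n_features=20,
    n_informative=8,
    weights=[0.85, 0.15],
    random_state=0,
)

# Inner loop tunes hyperparameters; outer loop estimates generalization.
# Fold counts are illustrative, not taken from the paper.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Hypothetical search grid, kept small so the sketch runs quickly.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 10]}

search = GridSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=0),
    param_grid,
    cv=inner_cv,
    scoring="f1",
)

# Each outer fold refits the whole inner search on its training split,
# so the held-out outer fold never influences hyperparameter selection.
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="f1")
print("F1 per outer fold:", outer_scores.round(3))
print("Mean F1:", outer_scores.mean().round(3))
```

Stratified folds are used here because the classes are imbalanced; keeping the tuning inside the inner loop is what prevents the reported outer-fold score from being optimistically biased.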
The RF model for ASD formation achieved 96.7% accuracy on the in-house HTS dataset, with a precision of approximately 87.9% and an F1 score of 83.6%. Furthermore, the RF model trained on milligram-scale HTS experimental data effectively predicted large-scale ASD formulations reported in the literature, highlighting its promise as a tool for advancing ASD prediction.
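To put the reported metrics in context, the short calculation below uses only the counts and percentages given in this abstract. It computes the accuracy a trivial "always crystalline" predictor would reach on the 188/1084 split, and the recall implied by the stated precision and F1 score through F1 = 2PR / (P + R). The implied recall is an inference, not a reported value, and is only approximate if the published metrics were averaged across cross-validation folds.

```python
# Counts reported in the abstract.
n_total = 1272
n_amorphous = 188
n_crystalline = 1084
assert n_amorphous + n_crystalline == n_total

# Accuracy of a trivial model that labels every formulation crystalline.
baseline_accuracy = n_crystalline / n_total

# Reported RF metrics for the amorphous (positive) class.
precision = 0.879
f1 = 0.836

# Rearranging F1 = 2PR / (P + R) for recall R gives R = F1 * P / (2P - F1).
implied_recall = f1 * precision / (2 * precision - f1)

print(f"Majority-class baseline accuracy: {baseline_accuracy:.1%}")
print(f"Recall implied by precision and F1: {implied_recall:.1%}")
```

Because about 85% of the formulations are crystalline, precision and F1 are more informative than accuracy alone for judging how well the amorphous class is identified.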
In summary, the combination of HTS experiments and ML techniques provides a valuable reference framework for ASD development, greatly reducing the time and material needed to select formulations in the early stages of drug discovery, when only a limited quantity of active pharmaceutical ingredient (API) is available.