Pharmaceutical Technology Division, Nichi-Iko Pharmaceutical Co., Ltd., 205-1, Shimoumezawa, Namerikawa-shi, Toyama 936-0857, Japan; Department of Pharmaceutical Technology, Graduate School of Medicine and Pharmaceutical Science for Research, University of Toyama, 2630 Sugitani, Toyama-shi, Toyama 930-0194, Japan.
Department of Pharmaceutical Technology, Graduate School of Medicine and Pharmaceutical Science for Research, University of Toyama, 2630 Sugitani, Toyama-shi, Toyama 930-0194, Japan.
Int J Pharm. 2021 Nov 20;609:121158. doi: 10.1016/j.ijpharm.2021.121158. Epub 2021 Oct 6.
This study investigates the usefulness of machine learning for modeling complex relationships in a material library. We tested 81 active pharmaceutical ingredients (APIs) and their tablets to construct the library, which comprised the following variables: 20 API material properties, one process parameter (compression pressure at three levels), and two tablet properties (tensile strength (TS) and disintegration time (DT)). Two machine learning algorithms, boosted tree (BT) and random forest (RF), were applied to the library to model the relationships between the input variables (material properties and compression pressure) and the output variables (TS and DT). The resulting BT and RF models achieved better performance statistics than a conventional modeling method (partial least squares regression) and revealed the material properties that strongly influence TS and DT. For TS, true density, the tenth percentile of the cumulative percentage size distribution, loss on drying, and compression pressure were of high relative importance. For DT, total surface energy, water absorption rate, polar surface energy, and hygroscopicity had significant effects. Thus, we demonstrate that BT and RF can be used to model complex relationships and to clarify important material properties in a material library.
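As a rough illustration of how a boosted-tree model can yield the relative importance of each input, the following is a minimal pure-Python sketch of gradient boosting with one-split trees (stumps). It is not the authors' implementation. The data are synthetic, and the three input columns, the coefficients, the number of rounds, and the learning rate are all assumptions chosen for illustration. The row count of 243 only mirrors 81 APIs at three compression pressures. Importance is computed here as each feature's share of the total squared-error reduction, which is one common definition; the abstract does not state which definition was used. RF and partial least squares regression are not shown.

```python
import random


def fit_stump(X, r):
    """Return (gain, feature, threshold, left_mean, right_mean) for the single
    split that most reduces the squared error of targets r, or None."""
    n, total = len(X), sum(r)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += r[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Reduction in the sum of squared errors achieved by this split.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best


def fit_boosted_stumps(X, y, n_rounds=200, learning_rate=0.1):
    """Fit stumps sequentially to the residuals of the current prediction."""
    intercept = sum(y) / len(y)
    pred = [intercept] * len(y)
    stumps, gains = [], [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, thr, left, right = stump
        gains[j] += gain
        left, right = learning_rate * left, learning_rate * right
        stumps.append((j, thr, left, right))
        pred = [p + (left if row[j] <= thr else right) for p, row in zip(pred, X)]
    total_gain = sum(gains) or 1.0
    return intercept, stumps, [g / total_gain for g in gains]


def predict(model, row):
    intercept, stumps, _ = model
    return intercept + sum(l if row[j] <= thr else r for j, thr, l, r in stumps)


def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for a material library (hypothetical, not the study's data):
# column 0 has a strong nonlinear effect, column 1 a weak one, column 2 none.
rng = random.Random(0)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(243)]
y = [2.0 * a * a + 0.3 * b + rng.gauss(0.0, 0.05) for a, b, _ in X]

# Hold out the last 63 rows to check performance on unseen data.
model = fit_boosted_stumps(X[:180], y[:180])
importance = model[2]
r2 = r_squared(y[180:], [predict(model, row) for row in X[180:]])

print("relative importance:", [round(v, 3) for v in importance])
print("held-out R^2:", round(r2, 3))
```

In practice, an established library implementation with deeper trees and cross-validation would be preferable; the stumps here keep the sketch short and make the importance calculation easy to follow.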