Clinical Pharmacology and Pharmacometrics, Graduate School of Pharmaceutical Sciences, Chiba University, 1-8-1, Inohana, Chuo-ku, Chiba-shi, Chiba, 260-8675, Japan.
Toxicology & Pharmacokinetics Research, Central Research Laboratories, Zeria Pharmaceutical Co., Ltd, 2512-1 Numagami, Oshikiri, Kumagaya-shi, Saitama, 360-0111, Japan.
AAPS J. 2021 Dec 10;24(1):10. doi: 10.1208/s12248-021-00664-z.
In this study, the observed food effects of 473 drugs were categorized as positive, negative, or no effect and compared with predictions made by machine learning (ML), the Biopharmaceutics Classification System (BCS), and the refined Developability Classification System (rDCS). All methods relied primarily on in silico estimates. For ML, four algorithms were evaluated using nested cross-validation to select important information from 371 features calculated from the chemical structure. Approximately 18 features, including estimated solubility in biorelevant media, were selected as important, and the random forest classifier performed best among the four algorithms, with a 36.6% error rate (ER) and a 10.8% opposite prediction rate (OPR). Prediction by rDCS, which uses solubility in a biorelevant medium, was slightly inferior (41.0% ER and 11.4% OPR). Prediction by BCS was inferior to both of these methods (54.5% ER and 21.4% OPR). When BCS was applied to a subset of 151 drugs, using measured features instead of in silico estimates improved ER modestly (from 55.0% to 46.4%). ML and rDCS predicted the food effects of the same subset using in silico estimates with ERs of 37.7% and 42.4%, respectively, suggesting that predictions by ML and rDCS using in silico features are similar to or more accurate than those by BCS using measured features. These results suggest that ML was useful in revealing essential features from complex information and that, together with rDCS, it is effective in predicting food effects during drug development, including early drug discovery.
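The two reported metrics can be illustrated with a minimal sketch. The abstract does not give formal definitions, so this sketch assumes that ER is the fraction of drugs whose predicted category differs from the observed one, and that OPR is the fraction of drugs predicted as positive but observed as negative, or vice versa, with all drugs in the denominator. The function names, label strings, and toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence

# Hypothetical label strings for the three food-effect categories.
POSITIVE, NEGATIVE, NO_EFFECT = "positive", "negative", "none"

# Pairs (observed, predicted) that count as an "opposite" prediction
# under the assumed definition of OPR.
OPPOSITE_PAIRS = {(POSITIVE, NEGATIVE), (NEGATIVE, POSITIVE)}


def error_rate(observed: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of drugs whose predicted category differs from the observed one."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(o != p for o, p in zip(observed, predicted))
    return wrong / len(observed)


def opposite_prediction_rate(observed: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of all drugs predicted in the direction opposite to the observation."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    opposite = sum((o, p) in OPPOSITE_PAIRS for o, p in zip(observed, predicted))
    return opposite / len(observed)


# Toy data for illustration only (not from the study).
observed = [POSITIVE, POSITIVE, NEGATIVE, NO_EFFECT, NO_EFFECT, NEGATIVE, POSITIVE, NO_EFFECT]
predicted = [POSITIVE, NEGATIVE, NEGATIVE, NO_EFFECT, POSITIVE, NO_EFFECT, POSITIVE, NO_EFFECT]

er = error_rate(observed, predicted)                  # 3 of 8 predictions are wrong
opr = opposite_prediction_rate(observed, predicted)   # 1 of 8 is an opposite prediction
print(f"ER = {er:.1%}, OPR = {opr:.1%}")
```

Under these assumed definitions, every opposite prediction is also an error, so OPR can never exceed ER, which is consistent with the figures reported for all three methods.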