Jung Ye Lim, Yoo Hyoung Sun, Hwang JeeNa
Division of Data Analysis, Korea Institute of Science and Technology Information (KISTI), Seoul 02456, Republic of Korea.
Science and Technology Management Policy, University of Science and Technology, Daejeon 34113, Republic of Korea.
Expert Syst Appl. 2022 Jul 15;198:116825. doi: 10.1016/j.eswa.2022.116825. Epub 2022 Mar 8.
New drug development yields a very high return when it succeeds, but the success rate is extremely low. Pharmaceutical companies have tried various strategies to raise that rate, with limited success. In this study, we developed a machine learning model to guide decision-making at the planning stage of new drug development. The Drug Development Recommendation (DDR) model presented here is a hybrid model that recommends and/or predicts drug groups suitable for development by individual pharmaceutical companies. It combines association rule learning, collaborative filtering, and content-based filtering to produce company-specific recommendations. The content-based filtering component, which uses a random forest classification algorithm, achieved an accuracy of 78% and an area under the curve of 0.74. We also applied the DDR model to predict the success probability of companies developing coronavirus disease 2019 (COVID-19) vaccines: the higher a company's predicted DDR score, the further its COVID-19 vaccine had progressed through the clinical phases. Although our approach has limitations that remain to be addressed, it makes both scientific and industrial contributions, because the DDR model can support rational decision-making before drug development begins by considering company-related variables as well as technical aspects.
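To make the idea of a hybrid recommender concrete, the sketch below blends a collaborative filtering score with a content-based score for one company and one drug group. It is only an illustration under stated assumptions, not the DDR model itself: the abstract does not say how the components are combined, so the weighted average, the `alpha` weight, the company names, the drug groups, and all numbers are hypothetical. The content-based scores stand in for classifier probabilities (for example, from a random forest), and the association rule learning component is omitted.

```python
from math import sqrt

# Hypothetical toy history: 1 means the company has developed a drug in that group.
history = {
    "company_a": {"vaccine": 1, "oncology": 1, "antiviral": 0},
    "company_b": {"vaccine": 1, "oncology": 0, "antiviral": 1},
    "company_c": {"vaccine": 0, "oncology": 1, "antiviral": 0},
}

# Hypothetical content-based scores, standing in for classifier probabilities.
content_score = {
    ("company_a", "antiviral"): 0.70,
    ("company_c", "vaccine"): 0.40,
}


def cosine(u, v):
    """Cosine similarity between two company profiles over the same drug groups."""
    dot = sum(u[k] * v[k] for k in u)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def collaborative_score(company, group):
    """Similarity-weighted average of other companies' activity in `group`."""
    weighted_sum = 0.0
    total_similarity = 0.0
    for other, profile in history.items():
        if other == company:
            continue
        similarity = cosine(history[company], profile)
        weighted_sum += similarity * profile[group]
        total_similarity += similarity
    return weighted_sum / total_similarity if total_similarity else 0.0


def hybrid_score(company, group, alpha=0.5):
    """Blend the two scores; alpha is an assumed weight, not taken from the paper."""
    cf = collaborative_score(company, group)
    cb = content_score.get((company, group), 0.0)
    return alpha * cf + (1 - alpha) * cb


if __name__ == "__main__":
    print(round(hybrid_score("company_a", "antiviral"), 3))
```

A higher blended score would mark a drug group as a stronger candidate for that company. In practice the weight would need to be tuned against held-out development outcomes.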