Pretrained transformer models for predicting the withdrawal of drugs from the market.

Affiliation

Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel.

Publication information

Bioinformatics. 2023 Aug 1;39(8). doi: 10.1093/bioinformatics/btad519.

Abstract

MOTIVATION

The process of drug discovery is notoriously complex, costing an average of 2.6 billion dollars and taking ∼13 years to bring a new drug to the market. The success rate for new drugs is alarmingly low (around 0.0001%), and severe adverse drug reactions (ADRs) frequently occur, some of which may even result in death. Early identification of potential ADRs is critical to improve the efficiency and safety of the drug development process.

RESULTS

In this study, we employed pretrained large language models (LLMs) to predict the likelihood of a drug being withdrawn from the market due to safety concerns. Our method achieved an area under the curve (AUC) of over 0.75 in cross-database validation, outperforming classical machine learning models and graph-based models. Notably, when predictions were made on a subset of drugs with inconsistent labeling between the training and test sets, our pretrained LLMs successfully identified over 50% of the drugs that were subsequently withdrawn.
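For readers unfamiliar with the metric, the following is a minimal sketch of how an AUC can be computed from a model's scores on a held-out test database. It is not the authors' evaluation code: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only to illustrate that AUC is the probability that a withdrawn drug is scored above a non-withdrawn one.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC: the fraction of (positive, negative) pairs in which
    the positive example receives the higher score. Ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = withdrawn, 0 = not withdrawn.
# In a cross-database setup, these would come from a database
# the model was not trained on.
test_labels = [1, 0, 1, 0, 0, 1]
test_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]

auc = roc_auc(test_labels, test_scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a value above 0.75 means a withdrawn drug outranks a non-withdrawn one in more than three of every four such pairs.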

AVAILABILITY AND IMPLEMENTATION

The code and datasets are available at https://github.com/eyalmazuz/DrugWithdrawn.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2558/10469107/ec2c75ffd173/btad519f1.jpg
