Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae374.
Metabolic processes can transform a drug into metabolites with different properties, which may affect its efficacy and safety. Investigating the metabolic fate of a drug candidate is therefore of great significance for drug discovery. Computational methods have been developed to predict drug metabolites, but most of them suffer from two main obstacles: a lack of model generalization, caused by restriction to predefined metabolic transformation rules or to specific enzyme families, and a high rate of false-positive predictions. Here, we present MetaPredictor, a rule-free, end-to-end, prompt-based method that treats the prediction of possible human metabolites of small molecules, including drugs, as a sequence translation problem. We introduced prompt engineering into deep language models to enrich domain knowledge and guide decision-making. The results showed that prompts specifying the sites of metabolism (SoMs) can steer the model toward more accurate metabolite predictions, achieving a 30.4% increase in recall and a 16.8% reduction in false positives over the baseline model. A transfer learning strategy was also used to address the limited availability of metabolic data. To support automatic or non-expert prediction, MetaPredictor was designed as a two-stage schema consisting of automatic identification of SoMs followed by metabolite prediction. Compared with four available drug metabolite prediction tools, our method showed comparable performance on the major enzyme families and better generalization, additionally identifying metabolites catalyzed by less common enzymes. The results indicated that MetaPredictor could provide a more comprehensive and accurate prediction of drug metabolism through the effective combination of transfer learning and prompt-based learning strategies.
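The abstract does not specify how SoM prompts are encoded or how recall and false positives were computed, so the following is only a minimal sketch under stated assumptions, not MetaPredictor's implementation. It assumes the input molecule is a SMILES string, that a SoM prompt can be expressed by inserting a marker token after the flagged atoms, and that predictions are scored as sets of canonical strings. The `<SoM>` marker, the function names, and the atom index used in the usage note are all hypothetical.

```python
import re

# A commonly used SMILES tokenization pattern: bracket atoms, two-letter
# halogens, organic-subset atoms, ring closures, bonds and branches.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|%\d{2}|[=#\-\+\\/:~@\?\.\*\$\(\)]|\d)"
)
# Tokens that represent an atom (as opposed to bonds, branches, ring digits).
ATOM_TOKEN = re.compile(r"^(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp])$")


def tokenize(smiles):
    """Split a SMILES string into tokens; fail if any character is dropped."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize SMILES: {smiles!r}")
    return tokens


def build_prompted_input(smiles, som_atom_indices, marker="<SoM>"):
    """Insert a (hypothetical) marker token after each flagged atom.

    Atoms are numbered from 0 in the order they appear in the SMILES string.
    """
    flagged = set(som_atom_indices)
    out = []
    atom_idx = 0
    for tok in tokenize(smiles):
        out.append(tok)
        if ATOM_TOKEN.match(tok):
            if atom_idx in flagged:
                out.append(marker)
            atom_idx += 1
    return " ".join(out)


def recall_and_false_positives(predicted, reference):
    """Score one molecule: fraction of reference metabolites recovered,
    and the number of predictions absent from the reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    recall = true_positives / len(reference) if reference else 0.0
    return recall, len(predicted - reference)
```

For example, `build_prompted_input("CC(=O)Nc1ccc(O)cc1", {8})` places the marker after the ninth atom in the string, an index chosen purely for illustration. In a two-stage setup like the one described, the indices would come from a first-stage SoM predictor rather than being supplied by hand, and the marked token sequence would be passed to the sequence translation model.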