

Accurate clinical toxicity prediction using multi-task deep neural nets and contrastive molecular explanations.

Affiliations

Chemical and Biological Engineering, RPI, Troy, NY, USA.

IBM Research, Yorktown Heights, NY, USA.

Publication information

Sci Rep. 2023 Mar 25;13(1):4908. doi: 10.1038/s41598-023-31169-8.

Abstract

Explainable machine learning for molecular toxicity prediction is a promising approach for efficient drug development and chemical safety. A predictive ML model of toxicity can reduce experimental cost and time while mitigating ethical concerns by significantly reducing animal and clinical testing. Herein, we use a deep learning framework to model in vitro, in vivo, and clinical toxicity data simultaneously. Two different molecular input representations are used: Morgan fingerprints and pre-trained SMILES embeddings. A multi-task deep learning model accurately predicts toxicity for all endpoints, including clinical, as indicated by the area under the receiver operating characteristic curve and balanced accuracy. In particular, pre-trained molecular SMILES embeddings as input to the multi-task model improved clinical toxicity predictions compared to existing models in the MoleculeNet benchmark. Additionally, our multi-task approach is comprehensive in the sense that it is comparable to state-of-the-art approaches for specific endpoints in in vitro, in vivo, and clinical platforms. Through both the multi-task model and transfer learning, we were able to indicate that clinical toxicity predictions have minimal need for in vivo data. To provide confidence and explain the model's predictions, we adapt a post-hoc contrastive explanation method that returns pertinent positive and negative features, which correspond well to known mutagenic and reactive toxicophores, such as unsubstituted bonded heteroatoms, aromatic amines, and Michael acceptors. Furthermore, toxicophore recovery by pertinent feature analysis captures more of the in vitro (53%) and in vivo (56%) endpoints than of the clinical (8%) endpoints, and indeed uncovers a preference in known toxicophore data towards in vitro and in vivo experimental data. To our knowledge, this is the first contrastive explanation, using both present and absent substructures, for predictions of clinical and in vivo molecular toxicity.
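The abstract reports performance using the area under the ROC curve and balanced accuracy. As a rough illustration of what these two metrics measure for a single binary toxicity endpoint, here is a minimal pure-Python sketch. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example; none of it is taken from the paper's code or data.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = toxic, 0 = non-toxic; scores are model outputs.
labels = [1, 0, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.1, 0.4, 0.3]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(balanced_accuracy(labels, preds))  # 0.625
print(roc_auc(labels, scores))           # 0.875
```

Balanced accuracy weights the toxic and non-toxic classes equally, which matters when toxic compounds are a small minority of a dataset, while the ROC AUC is independent of any single decision threshold.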


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0dba/10039880/31db4a2f350a/41598_2023_31169_Fig1_HTML.jpg
