CMMS-GCL: cross-modality metabolic stability prediction with graph contrastive learning.

Affiliations

School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China.

Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore.

Publication information

Bioinformatics. 2023 Aug 1;39(8). doi: 10.1093/bioinformatics/btad503.

Abstract

MOTIVATION

Metabolic stability plays a crucial role in the early stages of drug discovery and development. Accurately modeling and predicting molecular metabolic stability has great potential for the efficient screening of drug candidates and the optimization of lead compounds. Because wet-lab experiments are time-consuming, laborious, and expensive, in silico prediction of metabolic stability is an attractive alternative. However, few computational methods have been developed for this task. In addition, explaining which key functional groups determine metabolic stability remains a significant challenge.

RESULTS

To address these issues, we develop CMMS-GCL, a novel cross-modality graph contrastive learning model for predicting the metabolic stability of drug candidates. Our framework uses deep learning to extract molecular features from two data modalities: SMILES sequences and molecular graphs. For the sequence data, we design a multi-head attention BiGRU-based encoder that preserves the context of symbols when learning sequence representations of molecules. For the graph data, we propose a graph contrastive learning-based encoder that learns structure representations by capturing the consistency between local and global structures. Fully connected neural networks then combine the sequence and structure representations for model training. Extensive experiments on two datasets show that CMMS-GCL consistently outperforms seven state-of-the-art methods. Furthermore, case studies on the sequence data and statistical analyses of the graph structure module support the interpretability of the crucial functional groups identified by CMMS-GCL. Overall, CMMS-GCL can serve as an effective and interpretable tool for predicting metabolic stability and identifying critical functional groups, thereby facilitating drug discovery and lead compound optimization.
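The idea of capturing consistency between local and global structures can be illustrated with a small, generic sketch. The abstract does not state which contrastive loss CMMS-GCL uses, so the code below is only an assumption: a Deep InfoMax-style binary cross-entropy in which each node embedding (local) is scored against a mean-pooled graph summary (global). The function names `mean_readout` and `local_global_loss` are hypothetical, and the node embeddings are toy numbers rather than the output of a trained graph encoder.

```python
import math


def mean_readout(node_embeddings):
    """Global (graph-level) summary: element-wise mean of the node embeddings."""
    n = len(node_embeddings)
    dim = len(node_embeddings[0])
    return [sum(node[d] for node in node_embeddings) / n for d in range(dim)]


def dot(u, v):
    """Dot product used as the local-global agreement score."""
    return sum(a * b for a, b in zip(u, v))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def local_global_loss(graphs):
    """Average binary cross-entropy over all (node, graph-summary) pairs.

    Assumption for illustration: a node paired with the summary of its own
    graph is a positive pair; the same node paired with the summary of any
    other graph in the batch is a negative pair.
    """
    summaries = [mean_readout(nodes) for nodes in graphs]
    total, count = 0.0, 0
    for i, nodes in enumerate(graphs):
        for node in nodes:
            for j, summary in enumerate(summaries):
                score = dot(node, summary)
                # Positive: -log(sigmoid(score)) = softplus(-score)
                # Negative: -log(1 - sigmoid(score)) = softplus(score)
                total += softplus(-score) if i == j else softplus(score)
                count += 1
    return total / count


# Toy batch: each inner list holds the node embeddings of one molecule graph.
graph_a = [[1.0, 0.0], [0.9, 0.1]]
graph_b = [[0.0, 1.0], [0.1, 0.9]]
print(local_global_loss([graph_a, graph_b]))
```

In this toy batch the nodes of each graph point in roughly the same direction as their own graph summary, so the loss is lower than it would be if the same nodes were shuffled across graphs. A real encoder would produce the node embeddings with a graph neural network and minimize such a loss by gradient descent.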

AVAILABILITY AND IMPLEMENTATION

The code and data underlying this article are freely available at https://github.com/dubingxue/CMMS-GCL.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/295e/10457661/72854c1893fa/btad503f1.jpg
