FAME: Fragment-based Conditional Molecular Generation for Phenotypic Drug Discovery.

Authors

Pham Thai-Hoang, Xie Lei, Zhang Ping

Affiliations

Department of Computer Science and Engineering, The Ohio State University, Columbus, USA.

Department of Computer Science, Hunter College, The City University of New York, New York City, USA; Neuroscience, Weill Cornell Medicine, New York City, USA.

Publication information

Proc SIAM Int Conf Data Min. 2022;2022:720-728. doi: 10.1137/1.9781611977172.81.

Abstract

De novo molecular design is a key challenge in drug discovery due to the complexity of chemical space. With the availability of molecular datasets and advances in machine learning, many deep generative models have been proposed for generating novel molecules with desired properties. However, most existing models focus only on molecular distribution learning and target-based molecular design, which limits their potential in real-world applications. In drug discovery, phenotypic molecular design has advantages over target-based molecular design, especially in first-in-class drug discovery. In this work, we propose the first deep graph generative model (FAME) targeting phenotypic molecular design, in particular gene expression-based molecular design. FAME leverages a conditional variational autoencoder framework to learn the conditional distribution for generating molecules from gene expression profiles. However, this distribution is difficult to learn due to the complexity of the molecular space and the noise in gene expression data. To tackle these issues, a gene expression denoising (GED) model that employs a contrastive objective function is first proposed to reduce noise in gene expression data. FAME is then designed to treat molecules as sequences of fragments and to learn to generate these fragments in an autoregressive manner. By leveraging this fragment-based generation strategy and the denoised gene expression profiles, FAME can generate novel molecules with a high validity rate and the desired biological activity. The experimental results show that FAME outperforms existing methods, including both SMILES-based and graph-based deep generative models, for phenotypic molecular design. Furthermore, the mechanism for reducing noise in gene expression data proposed in our study can be applied to omics data modeling in general to facilitate phenotypic drug discovery.
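The abstract says the GED model uses a contrastive objective but does not give its form. As a rough illustration only, the sketch below shows one common contrastive loss (InfoNCE) on embedding vectors. The function names, the cosine similarity, the temperature value, and the choice of positives and negatives are all assumptions made for this example and may differ from what the authors use.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for a single anchor embedding (hypothetical helper).

    The loss is small when the anchor is closer to the positive than to
    any negative. Which profiles count as positives (for example, two
    noisy measurements of the same perturbation) is an assumption here.
    """
    # Scaled similarities; index 0 holds the positive pair.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of picking the positive.
    return log_sum - logits[0]

# Toy embeddings: the positive is nearly parallel to the anchor.
anchor = [1.0, 0.0]
loss = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
print(loss)
```

Minimizing such a loss pulls embeddings of related profiles together and pushes unrelated ones apart, which is one plausible way a denoised representation could be learned.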

