Cross-modal representation alignment of molecular structure and perturbation-induced transcriptional profiles.

Affiliations

1 Department of Systems, Synthetic, and Quantitative Biology, Harvard Medical School, Boston, MA, USA
2 Department of EECS, Massachusetts Institute of Technology, Cambridge, MA, USA
* Co-first author.

Publication information

Pac Symp Biocomput. 2021;26:273-284.

PMID: 33691024
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8397230/
Abstract

Modeling the relationship between chemical structure and molecular activity is a key goal in drug development. Many benchmark tasks have been proposed for molecular property prediction, but these tasks are generally aimed at specific, isolated biomedical properties. In this work, we propose a new cross-modal small molecule retrieval task, designed to force a model to learn to associate the structure of a small molecule with the transcriptional change it induces. We develop this task formally as a multi-view alignment problem, and present a coordinated deep learning approach that jointly optimizes representations of both chemical structure and perturbational gene expression profiles. We benchmark our results against oracle models and principled baselines, and find that cell line variability markedly influences performance in this domain. Our work establishes the feasibility of this new task, elucidates the limitations of current data and systems, and may serve to catalyze future research in small molecule representation learning.
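To make the retrieval framing concrete, the sketch below shows one generic way such a task could be scored and trained once both modalities have been embedded into a shared space. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the encoders, the similarity measure, or the loss, so cosine similarity, the in-batch hinge (triplet-style) loss, the margin value, and all function names here are hypothetical choices. Row i of `expr_emb` and row i of `mol_emb` are assumed to be a matched expression-profile and molecule pair.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def cosine_similarity_matrix(expr_emb, mol_emb):
    """Entry (i, j) is the cosine similarity of expression query i and molecule j."""
    return l2_normalize(expr_emb) @ l2_normalize(mol_emb).T


def retrieval_ranks(expr_emb, mol_emb):
    """1-based rank of the paired molecule (row i) for each expression query i."""
    sim = cosine_similarity_matrix(expr_emb, mol_emb)
    paired = np.diag(sim)[:, None]
    # Rank = 1 + number of candidates scoring strictly higher than the true pair.
    return 1 + (sim > paired).sum(axis=1)


def recall_at_k(ranks, k):
    """Fraction of queries whose paired molecule appears in the top k."""
    return float((np.asarray(ranks) <= k).mean())


def in_batch_triplet_loss(expr_emb, mol_emb, margin=0.2):
    """Hinge loss pushing each true pair above every in-batch mismatch by `margin`.

    Assumed objective for illustration only; requires a square batch (n >= 2).
    """
    sim = cosine_similarity_matrix(expr_emb, mol_emb)
    pos = np.diag(sim)[:, None]
    hinge = np.maximum(0.0, margin - pos + sim)
    off_diag = ~np.eye(sim.shape[0], dtype=bool)  # ignore the true pairs themselves
    return float(hinge[off_diag].mean())


if __name__ == "__main__":
    # Toy embeddings: expression profiles are noisy copies of their molecules.
    rng = np.random.default_rng(0)
    mol = rng.normal(size=(5, 8))
    expr = mol + 0.01 * rng.normal(size=(5, 8))
    ranks = retrieval_ranks(expr, mol)
    print("ranks:", ranks)
    print("recall@1:", recall_at_k(ranks, 1))
    print("loss:", in_batch_triplet_loss(expr, mol))
```

In a coordinated setup of this kind, each modality would have its own encoder and a loss like the one above would be minimized over both encoders jointly; rank-based metrics such as recall@k then measure how often the correct molecule is retrieved for a held-out expression profile.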

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb09/8397230/0944371265dc/nihms-1649365-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb09/8397230/18e84b8bf3e4/nihms-1649365-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb09/8397230/610ed6d26a11/nihms-1649365-f0002.jpg

Cited by

1
Predicting mechanism of action of novel compounds using compound structure and transcriptomic signature coembedding.
Bioinformatics. 2021 Jul 12;37(Suppl_1):i376-i382. doi: 10.1093/bioinformatics/btab275.