
Multi-channel learning for integrating structural hierarchies into context-dependent molecular representation.

Authors

Wan Yue, Wu Jialu, Hou Tingjun, Hsieh Chang-Yu, Jia Xiaowei

Affiliations

University of Pittsburgh, Department of Computer Science, Pittsburgh, PA, 15260, USA.

Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.

Publication information

Nat Commun. 2025 Jan 6;16(1):413. doi: 10.1038/s41467-024-55082-4.

DOI: 10.1038/s41467-024-55082-4
PMID: 39762223
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11704287/

Abstract

Reliable molecular property prediction is essential for various scientific endeavors and industrial applications, such as drug discovery. However, the data scarcity, combined with the highly non-linear causal relationships between physicochemical and biological properties and conventional molecular featurization schemes, complicates the development of robust molecular machine learning models. Self-supervised learning (SSL) has emerged as a popular solution, utilizing large-scale, unannotated molecular data to learn a foundational representation of chemical space that might be advantageous for downstream tasks. Yet, existing molecular SSL methods largely overlook chemical knowledge, including molecular structure similarity, scaffold composition, and the context-dependent aspects of molecular properties when operating over the chemical space. They also struggle to learn the subtle variations in structure-activity relationship. This paper introduces a multi-channel pre-training framework that learns robust and generalizable chemical knowledge. It leverages the structural hierarchy within the molecule, embeds them through distinct pre-training tasks across channels, and aggregates channel information in a task-specific manner during fine-tuning. Our approach demonstrates competitive performance across various molecular property benchmarks and offers strong advantages in particularly challenging yet ubiquitous scenarios like activity cliffs.
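
The abstract states that channel information is aggregated in a task-specific manner during fine-tuning, but it does not say how. As a rough illustration only, the sketch below assumes a softmax-weighted sum of per-channel embeddings, with one learnable logit per channel for each downstream task. The channel names, embedding values, and the functions `softmax` and `aggregate_channels` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw logits into weights that sum to 1 (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_channels(channel_embeddings, task_logits):
    """Fuse per-channel embeddings into one vector for a given task.

    ASSUMPTION: a softmax-weighted sum; the abstract does not specify
    the actual aggregation mechanism.
    """
    if len(channel_embeddings) != len(task_logits):
        raise ValueError("need exactly one logit per channel")
    dim = len(channel_embeddings[0])
    if any(len(emb) != dim for emb in channel_embeddings):
        raise ValueError("all channel embeddings must share one dimension")
    weights = softmax(task_logits)
    return [
        sum(w * emb[i] for w, emb in zip(weights, channel_embeddings))
        for i in range(dim)
    ]


# Hypothetical 2-dimensional embeddings of one molecule, one per channel.
channels = {
    "molecule": [1.0, 0.0],
    "scaffold": [0.0, 1.0],
    "context": [1.0, 1.0],
}
embeddings = list(channels.values())

# Equal logits: every channel contributes equally.
uniform = aggregate_channels(embeddings, [0.0, 0.0, 0.0])

# A task whose (hypothetical) learned logits strongly favor the first channel.
molecule_heavy = aggregate_channels(embeddings, [50.0, 0.0, 0.0])
```

In a setup like this, the logits would be trained together with the prediction head, so a task that depends mostly on scaffold-level information could learn to weight that channel more heavily than the others.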

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eea3/11704287/4266fb113259/41467_2024_55082_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eea3/11704287/efef148e71e2/41467_2024_55082_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eea3/11704287/be7f12e32852/41467_2024_55082_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eea3/11704287/2be699af4deb/41467_2024_55082_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eea3/11704287/a5daa7d9b6c9/41467_2024_55082_Fig5_HTML.jpg

Similar articles

1. Multi-channel learning for integrating structural hierarchies into context-dependent molecular representation.
   Nat Commun. 2025 Jan 6;16(1):413. doi: 10.1038/s41467-024-55082-4.
2. Hierarchical Molecular Graph Self-Supervised Learning for property prediction.
   Commun Chem. 2023 Feb 17;6(1):34. doi: 10.1038/s42004-023-00825-5.
3. GeneralizedDTA: combining pre-training and multi-task learning to predict drug-target binding affinity for unknown drug discovery.
   BMC Bioinformatics. 2022 Sep 7;23(1):367. doi: 10.1186/s12859-022-04905-6.
4. ReFs: A hybrid pre-training paradigm for 3D medical image segmentation.
   Med Image Anal. 2024 Jan;91:103023. doi: 10.1016/j.media.2023.103023. Epub 2023 Nov 8.
5. Positional embeddings and zero-shot learning using BERT for molecular-property prediction.
   J Cheminform. 2025 Feb 5;17(1):17. doi: 10.1186/s13321-025-00959-9.
6. Cluster-based histopathology phenotype representation learning by self-supervised multi-class-token hierarchical ViT.
   Sci Rep. 2024 Feb 8;14(1):3202. doi: 10.1038/s41598-024-53361-0.
7. Self-Supervised Molecular Representation Learning With Topology and Geometry.
   IEEE J Biomed Health Inform. 2025 Jan;29(1):700-710. doi: 10.1109/JBHI.2024.3479194. Epub 2025 Jan 7.
8. Self-Supervised Pre-Training via Multi-View Graph Information Bottleneck for Molecular Property Prediction.
   IEEE J Biomed Health Inform. 2024 Dec;28(12):7659-7669. doi: 10.1109/JBHI.2024.3422488. Epub 2024 Dec 5.
9. Learning self-supervised molecular representations for drug-drug interaction prediction.
   BMC Bioinformatics. 2024 Jan 30;25(1):47. doi: 10.1186/s12859-024-05643-7.
10. 3D graph contrastive learning for molecular property prediction.
    Bioinformatics. 2022 Jan 1;39(6). doi: 10.1093/bioinformatics/btad371.
