Transfer learning with graph neural networks for improved molecular property prediction in the multi-fidelity setting.

Affiliations

Department of Computer Science and Technology, University of Cambridge, Cambridge, UK.

Molecular AI, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden.

Publication information

Nat Commun. 2024 Feb 26;15(1):1517. doi: 10.1038/s41467-024-45566-8.

DOI: 10.1038/s41467-024-45566-8
PMID: 38409255
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11258334/
Abstract

We investigate the potential of graph neural networks for transfer learning and for improving molecular property prediction on sparse, expensive-to-acquire high-fidelity data by leveraging low-fidelity measurements as an inexpensive proxy for a targeted property of interest. This problem arises in discovery processes that rely on screening funnels to trade off overall cost against throughput and accuracy. Typically, the individual stages in these processes are loosely connected, and each one generates data at a different scale and fidelity. We consider this setup holistically and demonstrate empirically that existing transfer learning techniques for graph neural networks are generally unable to harness the information from multi-fidelity cascades. Here, we propose several effective transfer learning strategies and study them in transductive and inductive settings. Our analysis involves a collection of more than 28 million unique experimental protein-ligand interactions across 37 targets from drug discovery by high-throughput screening and 12 quantum properties from the QMugs dataset. The results indicate that transfer learning can improve the performance on sparse tasks by up to eight times while using an order of magnitude less high-fidelity training data. Moreover, the proposed methods consistently outperform existing transfer learning strategies for graph-structured data on drug discovery and quantum mechanics datasets.
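To make the multi-fidelity idea concrete, the sketch below shows one simple way a cheap low-fidelity measurement can help when high-fidelity labels are scarce: the low-fidelity value is appended as an extra input feature of the high-fidelity model. This is an illustrative toy under stated assumptions, not the strategies proposed in the paper. The data are synthetic, a linear least-squares model stands in for a graph neural network, and the names `run_demo`, `fit_linear`, `predict_linear`, and `rmse` are hypothetical. Because it assumes the low-fidelity value is also known for the test samples, it corresponds loosely to the transductive setting.

```python
import numpy as np


def fit_linear(features, targets):
    """Ordinary least squares with an intercept column."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef


def predict_linear(coef, features):
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef


def rmse(pred, true):
    return float(np.sqrt(np.mean((pred - true) ** 2)))


def run_demo(seed=0, n_total=2000, n_high=30, n_features=5):
    """Compare a features-only model with a low-fidelity-augmented model.

    Assumption: every sample has a low-fidelity measurement, but only the
    first n_high samples have a high-fidelity label available for training.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_total, n_features))

    # Synthetic high-fidelity target: nonlinear, so a linear model on x alone fits poorly.
    y_high = x[:, 0] * x[:, 1] + x[:, 2] ** 2
    # Synthetic low-fidelity proxy: rescaled, shifted, and noisy version of the target.
    y_low = 0.8 * y_high + 0.5 + rng.normal(scale=0.3, size=n_total)

    train = np.arange(n_high)
    test = np.arange(n_high, n_total)

    # Baseline: high-fidelity model trained on the raw features only.
    base_coef = fit_linear(x[train], y_high[train])
    rmse_base = rmse(predict_linear(base_coef, x[test]), y_high[test])

    # Augmented: the low-fidelity value is appended as one more input feature.
    x_aug = np.column_stack([x, y_low])
    aug_coef = fit_linear(x_aug[train], y_high[train])
    rmse_aug = rmse(predict_linear(aug_coef, x_aug[test]), y_high[test])

    return rmse_base, rmse_aug


if __name__ == "__main__":
    rmse_base, rmse_aug = run_demo()
    print(f"features only:          RMSE = {rmse_base:.3f}")
    print(f"features + low fidelity: RMSE = {rmse_aug:.3f}")
```

In this toy the augmented model benefits because the proxy carries most of the signal that the raw features cannot express linearly. In an inductive setting, where no low-fidelity measurement exists for a new molecule, the proxy value would first have to be predicted by a separate model.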

Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e772/11258334/afd6286f963a/41467_2024_45566_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e772/11258334/490b66b3d567/41467_2024_45566_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e772/11258334/c64d8ca85c13/41467_2024_45566_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e772/11258334/b808dd5acbd4/41467_2024_45566_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e772/11258334/87701a679139/41467_2024_45566_Fig5_HTML.jpg
Fig. 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e772/11258334/bc9b05fd0680/41467_2024_45566_Fig6_HTML.jpg
Fig. 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e772/11258334/6e517080cfdd/41467_2024_45566_Fig7_HTML.jpg
