Graph neural processes for molecules: an evaluation on docking scores and strategies to improve generalization.

Authors

García-Ortegón Miguel, Seal Srijit, Rasmussen Carl, Bender Andreas, Bacallado Sergio

Affiliations

Statistical Laboratory, University of Cambridge, Wilberforce Rd, Cambridge, CB3 0WA, UK.

Department of Engineering, University of Cambridge, Trumpington St, Cambridge, CB2 1PZ, UK.

Publication details

J Cheminform. 2024 Oct 23;16(1):115. doi: 10.1186/s13321-024-00904-2.

DOI:10.1186/s13321-024-00904-2
PMID:39443970
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11515514/
Abstract

Neural processes (NPs) are models for meta-learning which output uncertainty estimates. So far, most studies of NPs have focused on low-dimensional datasets of highly correlated tasks. While these homogeneous datasets are useful for benchmarking, they may not be representative of realistic transfer learning. In particular, applications in scientific research may prove especially challenging due to the potential novelty of meta-testing tasks. Molecular property prediction is one such research area that is characterized by sparse datasets of many functions on a shared molecular space. In this paper, we study the application of graph NPs to molecular property prediction with DOCKSTRING, a diverse dataset of docking scores. Graph NPs show competitive performance in few-shot learning tasks relative to supervised learning baselines common in chemoinformatics, as well as alternative techniques for transfer learning and meta-learning. In order to increase meta-generalization to divergent test functions, we propose fine-tuning strategies that adapt the parameters of NPs. We find that adaptation can substantially increase NPs' regression performance while maintaining good calibration of uncertainty estimates. Finally, we present a Bayesian optimization experiment which showcases the potential advantages of NPs over Gaussian processes in iterative screening. Overall, our results suggest that NPs on molecular graphs hold great potential for molecular property prediction in the low-data setting.

Scientific contribution

Neural processes are a family of meta-learning algorithms which deal with data scarcity by transferring information across tasks and making probabilistic predictions. We evaluate their performance on regression and optimization molecular tasks using docking scores, finding them to outperform classical single-task and transfer-learning models. We examine the issue of generalization to divergent test tasks, which is a general concern of meta-learning algorithms in science, and propose strategies to alleviate it.
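
As a rough illustration of how a neural process turns a handful of labelled examples into predictions with uncertainty, the sketch below implements the forward pass of a conditional-NP-style model in NumPy. It is not the authors' architecture: molecules are stood in for by fixed-length random feature vectors rather than molecular graphs, the weights are random rather than meta-trained, and every name and size (`cnp_predict`, `init_mlp`, `D_X`, `D_R`) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a small MLP; a stand-in for meta-trained parameters."""
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the MLP with tanh on all hidden layers and a linear output."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

def cnp_predict(encoder, decoder, x_ctx, y_ctx, x_tgt):
    """Return a predictive mean and standard deviation for each target point."""
    # Encode each (x, y) context pair, then average: the mean makes the
    # task representation invariant to the order of the context points.
    pairs = np.concatenate([x_ctx, y_ctx[:, None]], axis=1)
    r = mlp(encoder, pairs).mean(axis=0)
    # Decode each target input together with the shared task representation.
    r_tiled = np.tile(r, (x_tgt.shape[0], 1))
    out = mlp(decoder, np.concatenate([x_tgt, r_tiled], axis=1))
    mean = out[:, 0]
    std = 0.01 + np.logaddexp(0.0, out[:, 1])  # softplus keeps std positive
    return mean, std

D_X, D_R = 8, 16  # hypothetical feature and representation sizes
encoder = init_mlp([D_X + 1, 32, D_R])
decoder = init_mlp([D_X + D_R, 32, 2])

x_ctx = rng.normal(size=(5, D_X))  # 5 labelled inputs (the few-shot context)
y_ctx = rng.normal(size=5)         # their scores (synthetic, not docking data)
x_tgt = rng.normal(size=(3, D_X))  # 3 unlabelled inputs to predict

mean, std = cnp_predict(encoder, decoder, x_ctx, y_ctx, x_tgt)
print(mean.shape, std.shape)
```

In a meta-learning setup of this kind, the encoder and decoder would be trained across many tasks so that a new task needs only its small context set at prediction time. The per-point standard deviation is the kind of quantity that an acquisition function in Bayesian optimization could consume.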

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/47e0df8cd06b/13321_2024_904_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/791fb4b1467b/13321_2024_904_Figa_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/e463fe214780/13321_2024_904_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/d693e32f2db1/13321_2024_904_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/7788208e7cd7/13321_2024_904_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/69856db864a7/13321_2024_904_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/9ecd7d55110f/13321_2024_904_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/8e743096be9a/13321_2024_904_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/f4222febe6aa/13321_2024_904_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/f9421474ecba/13321_2024_904_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/3bf1924acd90/13321_2024_904_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d20c/11515514/2358e42bd0e5/13321_2024_904_Fig11_HTML.jpg
