
Stochastic machine learning via sigma profiles to build a digital chemical space.

Authors

Abranches Dinis O, Maginn Edward J, Colón Yamil J

Affiliation

Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, IN 46556.

Publication

Proc Natl Acad Sci U S A. 2024 Jul 30;121(31):e2404676121. doi: 10.1073/pnas.2404676121. Epub 2024 Jul 23.

DOI:10.1073/pnas.2404676121
PMID:39042681
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11295021/
Abstract

This work establishes a different paradigm on digital molecular spaces and their efficient navigation by exploiting sigma profiles. To do so, the remarkable capability of Gaussian processes (GPs), a type of stochastic machine learning model, to correlate and predict physicochemical properties from sigma profiles is demonstrated, outperforming state-of-the-art neural networks previously published. The amount of chemical information encoded in sigma profiles eases the learning burden of machine learning models, permitting the training of GPs on small datasets which, due to their negligible computational cost and ease of implementation, are ideal models to be combined with optimization tools such as gradient search or Bayesian optimization (BO). Gradient search is used to efficiently navigate the sigma profile digital space, quickly converging to local extrema of target physicochemical properties. While this requires the availability of pretrained GP models on existing datasets, such limitations are eliminated with the implementation of BO, which can find global extrema with a limited number of iterations. A remarkable example of this is that of BO toward boiling temperature optimization. Holding no knowledge of chemistry except for the sigma profile and boiling temperature of carbon monoxide (the worst possible initial guess), BO finds the global maximum of the available boiling temperature dataset (over 1,000 molecules encompassing more than 40 families of organic and inorganic compounds) in just 15 iterations (i.e., 15 property measurements), cementing sigma profiles as a powerful digital chemical space for molecular optimization and discovery, particularly when little to no experimental data is initially available.
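
The Bayesian optimization loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D descriptors stand in for sigma profiles, `toy_property` is a hypothetical stand-in for a measured property such as boiling temperature, and the RBF kernel, its length scale, the noise level, and the expected-improvement acquisition function are all assumptions made for illustration. Only the standard library is used so the sketch runs anywhere; a real study would more likely rely on a dedicated GP library.

```python
import math
import random

def rbf(a, b, length=0.3):
    """Squared-exponential kernel (assumed; unit amplitude)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    """Lower-triangular L such that L L^T = A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i, row in enumerate(L):
        x.append((b[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x

def solve_upper_t(L, b):
    """Back substitution: solve L^T x = b."""
    n = len(L)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(X, y, noise=1e-4):
    """Fit a GP to (X, y); return a function giving posterior mean and std."""
    mean_y = sum(y) / len(y)  # constant prior mean set to the data average
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, [v - mean_y for v in y]))

    def predict(x_new):
        k = [rbf(a, x_new) for a in X]
        v = solve_lower(L, k)
        mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
        var = max(1.0 - sum(vi * vi for vi in v), 0.0)
        return mu, math.sqrt(var)

    return predict

def expected_improvement(mu, sigma, best):
    """Expected improvement over the best value measured so far (maximization)."""
    if sigma < 1e-12:
        return 0.0
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - best) * cdf + sigma * pdf

def bayes_opt(pool, measure, start, n_iter):
    """Pick one unmeasured candidate per iteration by maximizing EI."""
    measured = [start]
    y = [measure(pool[start])]
    for _ in range(n_iter):
        predict = fit_gp([pool[i] for i in measured], y)
        best = max(y)
        scores = {i: expected_improvement(*predict(pool[i]), best)
                  for i in range(len(pool)) if i not in measured}
        nxt = max(scores, key=scores.get)
        measured.append(nxt)
        y.append(measure(pool[nxt]))  # one new "property measurement"
    return measured, y

def toy_property(x):
    """Hypothetical smooth property with its maximum near (0.7, 0.3)."""
    return -((x[0] - 0.7) ** 2 + (x[1] - 0.3) ** 2)

random.seed(0)
pool = [(random.random(), random.random()) for _ in range(200)]
values = [toy_property(x) for x in pool]
# Start from the worst candidate, mirroring the "worst possible initial guess".
start = min(range(len(pool)), key=values.__getitem__)
measured, y = bayes_opt(pool, toy_property, start, n_iter=15)
print("best measured:", max(y), "pool maximum:", max(values))
```

Each iteration refits the GP on everything measured so far and spends exactly one new measurement, which is why the abstract counts iterations as property measurements. Whether a run reaches the pool maximum within a given budget depends on the kernel, the acquisition function, and the data, so the printed comparison is diagnostic rather than guaranteed.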

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/42f8/11295021/c4a4e725719d/pnas.2404676121fig01.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/42f8/11295021/38ae790e4b92/pnas.2404676121fig02.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/42f8/11295021/30470a0f746f/pnas.2404676121fig03.jpg
