Analyzing Learned Molecular Representations for Property Prediction.

Affiliations

Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, Massachusetts 02139, United States.

Department of Chemical Engineering, MIT, Cambridge, Massachusetts 02139, United States.

Publication information

J Chem Inf Model. 2019 Aug 26;59(8):3370-3388. doi: 10.1021/acs.jcim.9b00237. Epub 2019 Aug 13.

DOI: 10.1021/acs.jcim.9b00237

PMID: 31361484

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6727618/

Abstract

Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structure of the molecule. However, recent literature has yet to clearly determine which of these two methods is superior when generalizing to new chemical space. Furthermore, prior research has rarely examined these new models in industry research settings in comparison to existing employed models. In this paper, we benchmark models extensively on 19 public and 16 proprietary industrial data sets spanning a wide variety of chemical end points. In addition, we introduce a graph convolutional model that consistently matches or outperforms models using fixed molecular descriptors as well as previous graph neural architectures on both public and proprietary data sets. Our empirical findings indicate that while approaches based on these representations have yet to reach the level of experimental reproducibility, our proposed model nevertheless offers significant improvements over models currently used in industrial workflows.
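To make the second model class concrete, the following is a minimal sketch of how a graph-based model can build a molecule-level vector by passing information along bonds. It is a generic neighbor-sum illustration, not the model proposed in the paper: the tiny atom vocabulary, the function names, and the choice of plain summation are all assumptions made for this example, and a real graph convolutional network would apply learned weight matrices and nonlinearities at each step.

```python
# Toy molecular graph: heavy atoms of ethanol (C-C-O), hydrogens omitted.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

# Hypothetical two-symbol vocabulary used for one-hot atom features.
VOCAB = ["C", "O"]


def one_hot(symbol):
    """Encode an atom symbol as a one-hot vector over VOCAB."""
    return [1.0 if symbol == v else 0.0 for v in VOCAB]


def neighbors(n_atoms, bonds):
    """Build an adjacency list from an undirected bond list."""
    adj = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    return adj


def message_pass(features, adj, steps):
    """Update each atom by summing its own state and its neighbors' states.

    No learned parameters are used here; this only illustrates the data flow.
    """
    h = [list(f) for f in features]
    for _ in range(steps):
        new_h = []
        for i, nbrs in enumerate(adj):
            agg = list(h[i])
            for j in nbrs:
                agg = [a + b for a, b in zip(agg, h[j])]
            new_h.append(agg)
        h = new_h
    return h


def readout(h):
    """Sum atom states into a single molecule-level vector."""
    return [sum(col) for col in zip(*h)]


def embed(atoms, bonds, steps=2):
    """Full pipeline: featurize atoms, pass messages, read out."""
    features = [one_hot(a) for a in atoms]
    adj = neighbors(len(atoms), bonds)
    return readout(message_pass(features, adj, steps))


mol_vec = embed(atoms, bonds)
print(mol_vec)  # [12.0, 5.0]
```

Because both the aggregation and the readout are sums, the resulting vector does not depend on the order in which atoms are listed, which is one reason sum-style operations are a common choice in graph models. A fixed-fingerprint approach would instead compute a predefined vector from the structure and feed it to a separate network.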

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9db2/6727618/7225f2416192/ci9b00237_0001.jpg

Similar articles

Dual graph convolutional neural network for predicting chemical networks.

BMC Bioinformatics. 2020 Apr 23;21(Suppl 3):94. doi: 10.1186/s12859-020-3378-0.

Molecule Property Prediction Based on Spatial Graph Embedding.

J Chem Inf Model. 2019 Sep 23;59(9):3817-3828. doi: 10.1021/acs.jcim.9b00410. Epub 2019 Aug 30.

Machine learning enabled identification of potential SARS-CoV-2 3CLpro inhibitors based on fixed molecular fingerprints and Graph-CNN neural representations.

J Biomed Inform. 2021 Jul;119:103821. doi: 10.1016/j.jbi.2021.103821. Epub 2021 May 28.

Importance of Engineered and Learned Molecular Representations in Predicting Organic Reactivity, Selectivity, and Chemical Properties.

Acc Chem Res. 2021 Feb 16;54(4):827-836. doi: 10.1021/acs.accounts.0c00745. Epub 2021 Feb 3.

Improving Crystal Property Prediction from a Multiplex Graph Perspective.

J Chem Inf Model. 2024 Oct 14;64(19):7376-7385. doi: 10.1021/acs.jcim.4c01200. Epub 2024 Oct 3.

Employing Molecular Conformations for Ligand-Based Virtual Screening with Equivariant Graph Neural Network and Deep Multiple Instance Learning.

Molecules. 2023 Aug 9;28(16):5982. doi: 10.3390/molecules28165982.

Machine Learning of Reaction Properties via Learned Representations of the Condensed Graph of Reaction.

J Chem Inf Model. 2022 May 9;62(9):2101-2110. doi: 10.1021/acs.jcim.1c00975. Epub 2021 Nov 4.

A novel interactive deep cascade spectral graph convolutional network with multi-relational graphs for disease prediction.

Neural Netw. 2024 Jul;175:106285. doi: 10.1016/j.neunet.2024.106285. Epub 2024 Apr 1.

Augmented Graph Neural Network with hierarchical global-based residual connections.

Neural Netw. 2022 Jun;150:149-166. doi: 10.1016/j.neunet.2022.03.008. Epub 2022 Mar 10.

Cited by

Enhancing molecular representation via fusion of multimodal transformers with integrated periodic local and global features.

J Comput Aided Mol Des. 2025 Sep 13;39(1):77. doi: 10.1007/s10822-025-00658-5.

Predicting reaction conditions: a data-driven perspective.

Chem Sci. 2025 Aug 6. doi: 10.1039/d5sc03045e.

A multimodal contrastive learning framework for predicting P-glycoprotein substrates and inhibitors.

J Pharm Anal. 2025 Aug;15(8):101313. doi: 10.1016/j.jpha.2025.101313. Epub 2025 Apr 16.

MolMod: a molecular modification platform for molecular property optimization via fragment-based generation.

Mol Divers. 2025 Sep 4. doi: 10.1007/s11030-025-11342-z.

The first South Korean data challenge for drug discovery using human and mouse liver microsomal stability data.

J Cheminform. 2025 Sep 3;17(1):139. doi: 10.1186/s13321-025-01047-8.

Systematic benchmarking of 13 AI methods for predicting cyclic peptide membrane permeability.

J Cheminform. 2025 Aug 28;17(1):129. doi: 10.1186/s13321-025-01083-4.

AdapTor: Adaptive Topological Regression for quantitative structure-activity relationship modeling.

J Cheminform. 2025 Aug 28;17(1):128. doi: 10.1186/s13321-025-01071-8.

Harnessing AI-driven reverse docking in drug discovery: a comprehensive review of opportunities, challenges, and emerging trends.

J Mol Model. 2025 Aug 25;31(9):256. doi: 10.1007/s00894-025-06480-y.

Reusability Report: evaluating the performance of a meta-learning foundation model on predicting the antibacterial activity of natural products.

Res Sq. 2025 Aug 12:rs.3.rs-6932613. doi: 10.21203/rs.3.rs-6932613/v1.

A message passing framework for precise cell state identification with scClassify2.

Genome Biol. 2025 Aug 19;26(1):252. doi: 10.1186/s13059-025-03722-3.

References

Chemi-Net: A Molecular Graph Convolutional Network for Accurate Drug Property Prediction.

Int J Mol Sci. 2019 Jul 10;20(14):3389. doi: 10.3390/ijms20143389.

Ligand biological activity predicted by cleaning positive and negative chemical correlations.

Proc Natl Acad Sci U S A. 2019 Feb 26;116(9):3373-3378. doi: 10.1073/pnas.1810847116. Epub 2019 Feb 11.

PotentialNet for Molecular Property Prediction.

ACS Cent Sci. 2018 Nov 28;4(11):1520-1530. doi: 10.1021/acscentsci.8b00507. Epub 2018 Nov 2.

Large-scale comparison of machine learning methods for drug target prediction on ChEMBL.

Chem Sci. 2018 Jun 6;9(24):5441-5451. doi: 10.1039/c8sc00148k. eCollection 2018 Jun 28.

MoleculeNet: a benchmark for molecular machine learning.

Chem Sci. 2017 Oct 31;9(2):513-530. doi: 10.1039/c7sc02664a. eCollection 2018 Jan 14.

Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules.

ACS Cent Sci. 2018 Feb 28;4(2):268-276. doi: 10.1021/acscentsci.7b00572. Epub 2018 Jan 12.

Mordred: a molecular descriptor calculator.

J Cheminform. 2018 Feb 6;10(1):4. doi: 10.1186/s13321-018-0258-y.

Prediction Errors of Molecular Machine Learning Models Lower than Hybrid DFT Error.

J Chem Theory Comput. 2017 Nov 14;13(11):5255-5264. doi: 10.1021/acs.jctc.7b00577. Epub 2017 Oct 10.

Convolutional Embedding of Attributed Molecular Graphs for Physical Property Prediction.

J Chem Inf Model. 2017 Aug 28;57(8):1757-1772. doi: 10.1021/acs.jcim.6b00601. Epub 2017 Jul 25.

Is Multitask Deep Learning Practical for Pharma?

J Chem Inf Model. 2017 Aug 28;57(8):2068-2076. doi: 10.1021/acs.jcim.7b00146. Epub 2017 Aug 1.
