Analyzing Learned Molecular Representations for Property Prediction.

Author information

Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, Massachusetts 02139, United States.

Department of Chemical Engineering, MIT, Cambridge, Massachusetts 02139, United States.

Publication information

J Chem Inf Model. 2019 Aug 26;59(8):3370-3388. doi: 10.1021/acs.jcim.9b00237. Epub 2019 Aug 13.

Abstract

Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results: neural networks applied to computed molecular fingerprints or expert-crafted descriptors and graph convolutional neural networks that construct a learned molecular representation by operating on the graph structure of the molecule. However, recent literature has yet to clearly determine which of these two methods is superior when generalizing to new chemical space. Furthermore, prior research has rarely examined these new models in industry research settings in comparison to existing employed models. In this paper, we benchmark models extensively on 19 public and 16 proprietary industrial data sets spanning a wide variety of chemical end points. In addition, we introduce a graph convolutional model that consistently matches or outperforms models using fixed molecular descriptors as well as previous graph neural architectures on both public and proprietary data sets. Our empirical findings indicate that while approaches based on these representations have yet to reach the level of experimental reproducibility, our proposed model nevertheless offers significant improvements over models currently used in industrial workflows.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9db2/6727618/7225f2416192/ci9b00237_0001.jpg
