Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks.

Affiliations

Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.

Department of EECS and CSAIL, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.

Publication information

J Chem Inf Model. 2020 Mar 23;60(3):1194-1201. doi: 10.1021/acs.jcim.9b00995. Epub 2020 Jan 28.

DOI:10.1021/acs.jcim.9b00995

PMID:31909619

Abstract

Leveraging new data sources is a key step in accelerating the pace of materials design and discovery. To complement the strides in synthesis planning driven by historical, experimental, and computed data, we present an automated, unsupervised method for connecting scientific literature to inorganic synthesis insights. Starting from the natural language text, we apply word embeddings from language models, which are fed into a named entity recognition model, upon which a conditional variational autoencoder is trained to generate syntheses for any inorganic materials of interest. We show the potential of this technique by predicting precursors for two perovskite materials, using only training data published over a decade prior to their first reported syntheses. We demonstrate that the model learns representations of materials corresponding to synthesis-related properties and that the model's behavior complements the existing thermodynamic knowledge. Finally, we apply the model to perform synthesizability screening for proposed novel perovskite compounds.
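The retrospective test described in the abstract (predicting precursors from training data published more than a decade before the first reported synthesis) relies on a time-based holdout. A minimal sketch of such a split follows. It is not the authors' code: the `SynthesisRecord` schema, the `temporal_holdout` helper, the target `XYO3`, and all publication years are hypothetical toy values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthesisRecord:
    """One literature-extracted synthesis (hypothetical schema)."""
    target: str        # formula of the synthesized material
    precursors: tuple  # formulas of the reported precursors
    year: int          # publication year of the source paper


def temporal_holdout(records, target, first_reported_year, gap_years=10):
    """Return records usable for training under a time-based holdout.

    A record is kept only if it was published more than `gap_years`
    before `first_reported_year` and does not describe `target` itself.
    """
    cutoff = first_reported_year - gap_years
    return [r for r in records if r.year < cutoff and r.target != target]


# Toy data: the years and the target "XYO3" are invented for illustration.
records = [
    SynthesisRecord("BaTiO3", ("BaCO3", "TiO2"), 1998),
    SynthesisRecord("SrTiO3", ("SrCO3", "TiO2"), 2004),
    SynthesisRecord("LaFeO3", ("La2O3", "Fe2O3"), 2009),
    SynthesisRecord("XYO3", ("XCO3", "YO2"), 2015),
]

train = temporal_holdout(records, target="XYO3", first_reported_year=2015)
print([r.target for r in train])
```

With a hypothetical first report in 2015, only the 1998 and 2004 records pass the filter. A model trained on that subset has seen neither the target nor anything published in the ten years before it.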

Similar articles

Combine Factual Medical Knowledge and Distributed Word Representation to Improve Clinical Named Entity Recognition.

AMIA Annu Symp Proc. 2018 Dec 5;2018:1110-1117. eCollection 2018.

A comparison of word embeddings for the biomedical natural language processing.

J Biomed Inform. 2018 Nov;87:12-20. doi: 10.1016/j.jbi.2018.09.008. Epub 2018 Sep 12.

Clinical text classification with rule-based features and knowledge-guided convolutional neural networks.

BMC Med Inform Decis Mak. 2019 Apr 4;19(Suppl 3):71. doi: 10.1186/s12911-019-0781-4.

deepBioWSD: effective deep neural word sense disambiguation of biomedical text data.

J Am Med Inform Assoc. 2019 May 1;26(5):438-446. doi: 10.1093/jamia/ocy189.

Automatic extraction of cancer registry reportable information from free-text pathology reports using multitask convolutional neural networks.

J Am Med Inform Assoc. 2020 Jan 1;27(1):89-98. doi: 10.1093/jamia/ocz153.

Multiple Embeddings Enhanced Multi-Graph Neural Networks for Chinese Healthcare Named Entity Recognition.

IEEE J Biomed Health Inform. 2021 Jul;25(7):2801-2810. doi: 10.1109/JBHI.2020.3048700. Epub 2021 Jul 27.

Bio-AnswerFinder: a system to find answers to questions from biomedical texts.

Database (Oxford). 2020 Jan 1;2020. doi: 10.1093/database/baz137.

Adverse drug event and medication extraction in electronic health records via a cascading architecture with different sequence labeling models and word embeddings.

J Am Med Inform Assoc. 2020 Jan 1;27(1):47-55. doi: 10.1093/jamia/ocz120.

Recent advances in Swedish and Spanish medical entity recognition in clinical texts using deep neural approaches.

BMC Med Inform Decis Mak. 2019 Dec 23;19(Suppl 7):274. doi: 10.1186/s12911-019-0981-y.

Cited by

Materials informatics for developing new restorative dental materials: a narrative review.

Front Dent Med. 2023 Jan 26;4:1123976. doi: 10.3389/fdmed.2023.1123976. eCollection 2023.

A materials terminology knowledge graph automatically constructed from text corpus.

Sci Data. 2024 Jun 7;11(1):600. doi: 10.1038/s41597-024-03448-0.

ZeoSyn: A Comprehensive Zeolite Synthesis Dataset Enabling Machine-Learning Rationalization of Hydrothermal Parameters.

ACS Cent Sci. 2024 Mar 6;10(3):729-743. doi: 10.1021/acscentsci.3c01615. eCollection 2024 Mar 27.

Extracting accurate materials data from research papers with conversational language models and prompt engineering.

Nat Commun. 2024 Feb 21;15(1):1569. doi: 10.1038/s41467-024-45914-8.

Artificial Intelligence-Powered Electronic Skin.

Nat Mach Intell. 2023 Dec;5(12):1344-1355. doi: 10.1038/s42256-023-00760-z. Epub 2023 Dec 18.

Text Mining the Literature to Inform Experiments and Rationalize Impurity Phase Formation for BiFeO3.

Chem Mater. 2023 Dec 29;36(2):772-785. doi: 10.1021/acs.chemmater.3c02203. eCollection 2024 Jan 23.

Predicting synthesis recipes of inorganic crystal materials using elementwise template formulation.

Chem Sci. 2023 Dec 8;15(3):1039-1045. doi: 10.1039/d3sc03538g. eCollection 2024 Jan 17.

Precursor recommendation for inorganic synthesis by machine learning materials similarity from scientific literature.

Sci Adv. 2023 Jun 9;9(23):eadg8180. doi: 10.1126/sciadv.adg8180.

A corpus of CO2 electrocatalytic reduction process extracted from the scientific literature.

Sci Data. 2023 Mar 29;10(1):175. doi: 10.1038/s41597-023-02089-z.

Machine learning for a sustainable energy future.

Nat Rev Mater. 2023;8(3):202-215. doi: 10.1038/s41578-022-00490-5. Epub 2022 Oct 18.
